Humanity's Last Exam
Benchmarks are important tools for tracking the rapid advancements in large language model (LLM) capabilities. However, benchmarks are not keeping pace in difficulty: LLMs now achieve over 90% accuracy on popular benchmarks like MMLU, limiting informed measurement of state-of-the-art LLM capabilities. In response, we introduce Humanity's Last Exam (HLE), a multi-modal benchmark at the frontier of human knowledge, designed to be the final closed-ended academic benchmark of its kind with broad subject coverage. HLE consists of 3,000 questions across dozens of subjects, including mathematics, humanities, and the natural sciences. HLE is developed globally by subject-matter experts and consists of multiple-choice and short-answer questions suitable for automated grading. Each question has a known solution that is unambiguous and easily verifiable, but cannot be quickly answered via internet retrieval. State-of-the-art LLMs demonstrate low accuracy and calibration on HLE, highlighting a significant gap between current LLM capabilities and the expert human frontier on closed-ended academic questions. To inform research and policymaking upon a clear understanding of model capabilities, we publicly release HLE at https://lastexam.ai
The extraordinary March 2022 East Antarctica "heat" wave. Part I: observations and meteorological drivers
peer reviewed
The extraordinary March 2022 East Antarctica "heat" wave. Part II: impacts on the Antarctic ice sheet
peer reviewed
Acoustic Mist Ionization-Mass Spectrometry: A Comparison to Conventional High-Throughput Screening and Compound Profiling Platforms
Architectural element analysis of paleo submarine fan systems, Hornby and Denman Islands, BC
Poster
The basin sediments of the Nanaimo Group comprise the majority of the bedrock in the Gulf Islands, and provide an extensive record of ancient depositional environments. While the Nanaimo Group sediments are traditionally considered lithostratigraphically, consisting of clear well-defined boundaries and sequencing of formations, it was observed in the field that these units vary considerably in lateral extent and continuity. While facies models are traditionally applied to sedimentary deposits to interpret the environment of paleo-deposition, we find that the current lithostratigraphic log oversimplifies the depositional environment and does not represent the complex and three-dimensional architecture of the local deposits, limiting the functionality of facies models alone for depositional reconstruction. Instead, architectural analysis of depositional elements, in conjunction with parsed and reorganized facies studies, allows a more realistic perspective of the basin characteristics in which the Nanaimo Group sediments were deposited.
https://viurrspace.ca/bitstream/handle/10613/11798/GEOL470Poster.pdf?sequence=3
Geology 470 - Special Topics in Earth Scienc
The extraordinary March 2022 East Antarctica “heat” wave. Part I: observations and meteorological drivers
Between March 15-19, 2022, East Antarctica experienced an exceptional heatwave with widespread 30-40 °C temperature anomalies across the ice sheet. This record-shattering event saw numerous monthly temperature records being broken, including a new all-time temperature record of -9.4 °C on March 18 at Concordia Station, despite March typically being a transition month to the Antarctic coreless winter. The driver for these temperature extremes was an intense atmospheric river advecting subtropical/mid-latitude heat and moisture deep into the Antarctic interior. The scope of the temperature records spurred a large, diverse collaborative effort to study the heatwave's meteorological drivers, impacts, and historical climate context. Here we focus on describing those temperature records along with the intricate meteorological drivers that led to the most intense atmospheric river observed over East Antarctica. These efforts describe the Rossby wave activity forced from intense tropical convection over the Indian Ocean. This led to an atmospheric river and warm conveyor belt intensification near the coastline which reinforced atmospheric blocking deep into East Antarctica. The resulting moisture flux and upper-level warm air advection eroded the typical surface temperature inversions over the ice sheet. At the peak of the heatwave, an area of 3.3 million km² in East Antarctica exceeded previous March monthly temperature records. Despite a temperature anomaly return time of about one hundred years, a closer recurrence of such an event is possible under future climate projections. In a subsequent manuscript, we describe the various impacts this extreme event had on the East Antarctic cryosphere.
The extraordinary March 2022 East Antarctica “heat” wave. Part II: impacts on the Antarctic ice sheet
Between March 15-19, 2022, East Antarctica experienced an exceptional heatwave with widespread 30-40 °C temperature anomalies across the ice sheet. In Part I, we assessed the meteorological drivers that generated an intense atmospheric river (AR) which caused these record-shattering temperature anomalies. Here in Part II, we continue our large, collaborative study by analyzing the widespread and diverse impacts driven by the AR landfall.
These impacts included widespread rain and surface melt recorded along coastal areas, but this was outweighed by widespread, high snowfall accumulations resulting in a largely positive surface mass balance contribution to the East Antarctic region. An analysis of the surface energy budget indicated that widespread downward longwave radiation anomalies caused by large cloud-liquid water contents, along with some scattered solar radiation, produced intense surface warming. Isotope measurements of the moisture were highly elevated, likely imprinting a strong signal for past climate reconstructions. The AR event attenuated cosmic ray measurements at Concordia, something never previously observed. Finally, an extratropical cyclone west of the AR landfall likely triggered the final collapse of the critically unstable Conger Ice Shelf while further reducing an already record low sea-ice extent.
