MMLU
other
Massive Multitask Language Understanding: a multiple-choice benchmark that evaluates AI models across 57 academic and professional subjects.
Also mentioned (1)
Casual references without a clear endorsement
Sebastian Raschka
mentioned
"even something simpler like MMLU, which is a multiple-choice benchmark. If you just change the fo..."
▶ 1:46:50