
Research organization that evaluates AI model capabilities by measuring how much human work AI can do before requiring intervention.

Also mentioned (1)

Casual references without a clear endorsement

Scott Wu mentioned "One of the stats that people talk about a lot is this METR report which basically says for each d..." (at 1:16)