METR


Research organization that evaluates AI model capabilities by measuring how much human work AI can do before requiring intervention.

Topics

AI evaluation, research, benchmarking

Casual references without a clear endorsement

Scott Wu mentioned: "One of the stats that people talk about a lot is this METR report which basically says for each d..." (1:16)