Mentioned by
"I can give you also a hands-on example. I was training the Qwen 3 base model with RLVR on MATH-500. The base model had an accuracy of about 15%. Just 50 steps, like in a few minutes with RLVR, the model went from 15% to 50% accuracy."
From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 (1:44:16, Jan 2026)
Attribution: Sebastian describes his hands-on experience training the Qwen 3 base model with RLVR, reporting a jump from about 15% to 50% accuracy on MATH-500 after 50 steps.