RLHF Book

Media · 1 mention from 1 source

Nathan Lambert's book on Reinforcement Learning from Human Feedback and post-training techniques for language models.

Mentioned by

Nathan Lambert created this ✓ High confidence
"I think a lot of what I was trying to do in this RLHF book is take post-training techniques and describe how people think about them influencing the model and what people are doing."

Attribution: Nathan Lambert refers to 'this RLHF book' as his own work while describing his goals and approach.