
Nathan Lambert

37 recommendations

Showing 25 of 37 recommendations

Grok 4 Heavy mentions software
"Although when Grok 4 came out, the Grok 4 SuperGrok Heavy, which was like their pro variant was actually very good and I was pretty impressed with it."
Claude Opus 4.5 mentions software
"The hype over Anthropic's Claude Opus 4.5 model has been absolutely insane, which is just... I mean, I've used it and built stuff in the last few weeks, and it's... it's almost gotten to the point where it feels like a bit of a meme in terms of the hype."
ChatGPT mentions software
"ChatGPT kicked off a movement in the US where everything had a chatbot."
Z.ai GLM models mentions software
"The likes of Z.ai with their GLM models, Minimax's models, Kimi Moonshot, especially in the last few months, have shone more brightly."
Minimax mentions software
"The likes of Z.ai with their GLM models, Minimax's models, Kimi Moonshot, especially in the last few months, have shone more brightly."
Kimi Moonshot mentions software
"The likes of Z.ai with their GLM models, Minimax's models, Kimi Moonshot, especially in the last few months, have shone more brightly."
GPT-5 mentions software
"Personally, I have very mixed reviews of GPT-5, but it must have saved them so much money with the headline feature being a router where most users are no longer charging their GPU costs as much."
Google TPU mentions product
"Largely because the margin on NVIDIA chips is insane, and Google can develop everything from top to bottom to fit their stack and not have to pay this margin."
NVIDIA chips mentions product
"Largely because the margin on NVIDIA chips is insane, and Google can develop everything from top to bottom to fit their stack and not have to pay this margin."
Deep Research mentions software
"Like Deep Research, Sora, o1 thinking models—all these definitional things have come from OpenAI."
Sora mentions software
"Like Deep Research, Sora, o1 thinking models—all these definitional things have come from OpenAI."
o1 thinking models mentions software
"Like Deep Research, Sora, o1 thinking models—all these definitional things have come from OpenAI."
Hugging Face mentions software
"On my blog, we scrape Hugging Face so we keep download numbers for every dataset and model over time, so we have them."
Qwen mentions software
"Qwen might be the one— Oh, yeah. Qwen was the obvious name I was gonna say."
GPT-2 mentions software
"When I was writing about OpenAI's open model release, they were like, 'Don't forget about GPT-2,' which I thought was really funny 'cause it's just such a different time."
SmolLM mentions software
"Hugging Face has SmolLM, which is very popular."
OpenRouter mentions software
"With OpenRouter, it's easy to look at multi-model things. You can run DeepSeek on Perplexity."
DeepSeek R1 mentions software
"And then DeepSeek are the people that did the training breakthrough, which is, they scaled the reinforcement learning."
Constitutional AI mentions technique
"That's the older term for it coined in Anthropic's Constitutional AI paper."
OpenAI o1 mentions software
"I think you can kind of take this in order. I think you could view it as what made o1, which is this first reasoning model, possible, or what will the latest model be?"
GRPO mentions technique
"If we look at the GRPO equation, this one is famous for this because essentially the reward given to the agent is based on how good a given action—an action is a completion—is relative to the other answers to that same problem."
Scale-RL mentions technique
"I think there's a seminal paper from a Meta internship. It's called something like 'The Art of Scaling Reinforcement Learning with Language Models.' What they describe as a framework is Scale-RL."
Direct Preference Optimization mentions technique
"the famous paper, Direct Preference Optimization, which is a much simpler way of solving the problem than RL. The derivations in the appendix skip steps of math."
Claude mentions software
"But if you go from a small university with no compute and find something that Claude struggles with, and then the next Claude model has it in the blog post, there's your career rocket ship."
GPT-5.2 Pro mentions software
"If you think about GPT-5.2 Pro taking an hour, it's like, what if your training run has a sample for an hour and you have to make sure that's handled efficiently?"
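
The GRPO quote above describes a reward signal that is relative to the other answers sampled for the same problem. A minimal sketch of that idea follows, assuming the commonly cited formulation in which each completion's reward is centered on the group mean and divided by the group standard deviation. The function name `group_relative_advantages` and the `eps` stabilizer are illustrative assumptions, not taken from the quote, and some variants omit the standard-deviation term.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Score each completion relative to its group (a GRPO-style advantage).

    `rewards` holds one scalar reward per completion sampled for the SAME
    prompt. `eps` is an assumed small constant that avoids division by zero
    when every completion in the group receives the same reward.
    """
    if not rewards:
        return []
    group_mean = mean(rewards)
    group_std = pstdev(rewards)  # population std over the sampled group
    return [(r - group_mean) / (group_std + eps) for r in rewards]


# Four completions for one problem: two correct (reward 1), two wrong (reward 0).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

In this sketch, completions that beat the group average get a positive advantage and the rest get a negative one. If every completion earns the same reward, all advantages are zero, so that group contributes no learning signal.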