Sebastian Raschka

69 recommendations

Everything Sebastian personally uses, recommends, or has created — plus things they don't recommend — sourced from their own show and appearances on other podcasts.

Created by Sebastian

MLxtend created

Build a Large Language Model from Scratch created

ADAM Project created

Top picks

Claude Code

"I try Claude Code on the web every three to six months, which is just prompting a model to make an update to some GitHub repository that I have"

VS Code

"So, I use the Codeium plugin for VS Code."

Nvidia

"Even back when I was a grad student, I was in a lab doing biophysical simulations, molecular dynamics, and we had a Tesla GPU back then just for the computations. It was about 15 years ago now."

Codeium

"So, I use the Codeium plugin for VS Code. You know, it's very convenient. It's just like a plugin, and then it's a chat interface that has access to your repository."

OLMo

"What I would recommend doing, or what I also do, is if I want to understand, for example, how OLMo is implemented, I would look at the weights in the model hub, the config file, and then you can see, 'Oh, they used so many layers.'"

Qwen 3

"I can give you also a hands-on example. I was training the Qwen 3 base model with RLVR on MATH-500. The base model had an accuracy of about 15%. Just 50 steps, like in a few minutes with RLVR, the model went from 15% to 50% accuracy."

OLMo 3

"Sometimes it takes me a day. With OLMo 3, the challenge was RoPE for the position embeddings. They had a YaRN extension and there was some custom scaling there, and I couldn't quite match these things."

Exa

"Exa is my preferred search provider"

All products

Recent episodes

All episodes →

State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490

Jan 31, 2026 · on Lex Fridman guest · 27 recs

Show: All 69 | uses recommends mentions

All recommendations

ChatGPT Pulse uses software

"I used that feature before, and I always feel bad because it does that every day, and I rarely check it out"

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 3:21:02 • Jan 2026

Codeium uses software

"So, I use the Codeium plugin for VS Code. You know, it's very convenient. It's just like a plugin, and then it's a chat interface that has access to your repository."

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 21:59 • Jan 2026

VS Code uses software

"So, I use the Codeium plugin for VS Code."

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 21:59 • Jan 2026

OLMo uses software

"What I would recommend doing, or what I also do, is if I want to understand, for example, how OLMo is implemented, I would look at the weights in the model hub, the config file, and then you can see, 'Oh, they used so many layers.'"

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 2:01:42 • Jan 2026

Nvidia uses product

"Even back when I was a grad student, I was in a lab doing biophysical simulations, molecular dynamics, and we had a Tesla GPU back then just for the computations. It was about 15 years ago now."

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 4:00:49 • Jan 2026

Cursor Composer uses software

"I should say I use Composer a lot because one of the benefits it has is that it's fast"

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 3:39:38 • Jan 2026

OLMo 3 uses software

"Sometimes it takes me a day. With OLMo 3, the challenge was RoPE for the position embeddings. They had a YaRN extension and there was some custom scaling there, and I couldn't quite match these things."

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 2:02:21 • Jan 2026

Metroid uses other

"Sometimes for pastime I play video games, like I like- Video games with puzzles, like Zelda and Metroid."

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 2:11:34 • Jan 2026

MATH-500 uses other

"I was training the Qwen 3 base model with RLVR on MATH-500. The base model had an accuracy of about 15%."

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 1:44:20 • Jan 2026

Exa uses software

"Exa is my preferred search provider"

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 2:37:20 • Jan 2026

Claude Code uses software

"I try Claude Code on the web every three to six months, which is just prompting a model to make an update to some GitHub repository that I have"

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 2:37:55 • Jan 2026

Tesla GPU uses product

"We had a Tesla GPU back then just for the computations. It was about 15 years ago now"

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 4:00:53 • Jan 2026

Qwen 3 uses software

"I can give you also a hands-on example. I was training the Qwen 3 base model with RLVR on MATH-500. The base model had an accuracy of about 15%. Just 50 steps, like in a few minutes with RLVR, the model went from 15% to 50% accuracy."

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 1:44:16 • Jan 2026

Zelda uses other

"Sometimes for pastime I play video games, like I like- Video games with puzzles, like Zelda and Metroid."

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 2:11:30 • Jan 2026

ChatGPT recommends software

"So I suggested, 'Hey, let's try ChatGPT.' We copied the text into ChatGPT, and it fixed them. Instead of two hours going from link to link fixing that, it made that type of work much more seamless."

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 1:33:29 • Jan 2026

Recursive Language Model recommends technique

"The Recursive Language Model paper, that is one of the papers that tries to kind of address the long context thing"

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 2:46:45 • Jan 2026

Mistral AI mentions software

"Let's throw in Mistral AI, Gemma..."

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 29:01 • Jan 2026

gpt-oss-120b mentions software

"gpt-oss, the open weight model by OpenAI... gpt-oss-120b is actually a very strong model and does some things that other models don't do very well."

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 29:05 • Jan 2026

NVIDIA Nemotron 3 mentions software

"Actually, NVIDIA had a really cool one, Nemotron 3."

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 29:05 • Jan 2026

GPT-2 mentions software

"And then you start, let's say, with your GPT-2 model and add these things."

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 2:02:01 • Jan 2026

Bing Sydney mentions software

"I would love to have tried Bing Sydney. Did that have more voice? Because it would so often go off the rails, which is historically obviously a scary way—like telling a reporter to leave his wife—is a crazy model to potentially put in general adoption."

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 1:24:07 • Jan 2026

GPT-4o mentions software

"There was a lot of backlash last year with GPT-4o getting removed, and I've personally never used the model, but I've talked to people at OpenAI where they get emails from users that might be detecting subtle differences in the deployments in the middle of the night."

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 1:24:35 • Jan 2026

MMLU mentions other

"even something simpler like MMLU, which is a multiple-choice benchmark. If you just change the format slightly, like, I don't know, if you use a dot instead of a parenthesis or something like that, the model accuracy will vastly differ."

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 1:46:50 • Jan 2026

Hugging Face Transformers mentions software

"When you code these from scratch, you can take an existing model from the Hugging Face Transformers library. The library is great, but if you want to learn about LLMs, it's not the best place to start because the code is so complex to fit so many use cases."

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 2:00:16 • Jan 2026

SGLang mentions software

"even Transformers, the library, is not used in production. People use SGLang or vLLM, and it adds another layer of complexity."

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 2:01:07 • Jan 2026

vLLM mentions software

"even Transformers, the library, is not used in production. People use SGLang or vLLM, and it adds another layer of complexity."

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 2:01:11 • Jan 2026

RoPE mentions technique

"With OLMo 3, the challenge was RoPE for the position embeddings. They had a YaRN extension and there was some custom scaling there, and I couldn't quite match these things."

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 2:02:25 • Jan 2026

YaRN mentions technique

"They had a YaRN extension and there was some custom scaling there, and I couldn't quite match these things."

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 2:02:29 • Jan 2026

LoRA mentions technique

"For the character training thing, I think this research is built on fine-tuning about 7 billion parameter models with LoRA, which is essentially only fine-tuning a small subset of the weights of the model."

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 2:13:43 • Jan 2026

Stable Diffusion mentions software

"And listeners may know diffusion models from image generation, like Stable Diffusion popularized it."

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 2:29:48 • Jan 2026

GANs mentions technique

"There was a paper on generating images. Back then, people used GANs, Generative Adversarial Networks."

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 2:29:56 • Jan 2026

BERT mentions software

"It's kind of similar to the BERT models by Google. Like, when you go back to the original transformer, they were the encoder and the decoder."

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 2:30:23 • Jan 2026

Gemini Diffusion mentions software

"But there was an announcement by Google, a site where they said they are launching Gemini Diffusion, and they put it into context of their Gemini Nano 2 model, and they said basically: for the same quality on most benchmarks, we can generate things much faster."

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 2:32:32 • Jan 2026

Gemini Nano 2 mentions software

"they put it into context of their Gemini Nano 2 model"

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 2:32:40 • Jan 2026

Apple Foundation Models mentions software

"Like what Apple tried to do with the Apple Foundation models, putting them on the phone, where they learn from experience."

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 2:42:25 • Jan 2026

DeepSeek-V3.2 mentions software

"DeepSeek-V3.2, where they had a sparse attention mechanism where they have essentially a very efficient, small, lightweight indexer"

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 2:48:56 • Jan 2026

World Models mentions technique

"There was a paper by Meta, a paper called World Models. So where they basically apply the concept of world models to LLMs again"

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 2:52:03 • Jan 2026

CASP mentions other

"There is a competition called CASP, I think, where they do protein structure prediction"

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 2:52:51 • Jan 2026

AlphaFold mentions software

"AlphaFold, when it came out, it crushed this benchmark"

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 2:53:15 • Jan 2026

RTX mentions technique

"There's some work in this area like RTX, I think it was a few years ago, where people are starting to do that"

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 2:55:45 • Jan 2026

AI2027 report mentions media

"I don't know if you like the originally titled AI27 report. They focus more on code and research taste, so the target there is the superhuman coder"

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 3:01:14 • Jan 2026

Harmonic mentions software

"I think there are startups—maybe Harmonic is one—where they're going all in on language models plus Lean for math"

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 3:14:23 • Jan 2026

Lean mentions software

"language models plus Lean for math"

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 3:14:27 • Jan 2026

Reflection AI mentions other

"We hear about Reflection AI, where they say their two billion dollar fundraise is dedicated to building US open models"

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 3:52:52 • Jan 2026

Scale AI mentions software

"Scale AI for almost $30 billion and countless other deals like this"

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 3:36:46 • Jan 2026

Vera Rubin mentions product

"That's why part of what Vera Rubin is- where they have a new chip with no high-bandwidth memory, which is one of the most expensive pieces"

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 4:01:51 • Jan 2026

CUDA mentions software

"The moat of NVIDIA is probably not just the GPU. It's more like the CUDA ecosystem, and that has evolved over two decades"

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 4:00:42 • Jan 2026

Google TPUs mentions product

"Like, Google obviously can make TPUs"

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 4:02:26 • Jan 2026

AlexNet mentions technique

"I think it only happened because you could purchase those GPUs."

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 4:06:26 • Jan 2026

Transformer mentions technique

"The word 'transformer' could still be known. I would guess that deep learning is definitely still known, but the transformer might be evolved away from in 100 years with AGI researchers everywhere."

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 4:12:08 • Jan 2026

Black Forest Labs mentions software

"They're signing licensing deals with Black Forest Labs, which is an image generation company"

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 3:43:38 • Jan 2026

Anthropic mentions software

"A lot of researchers at these companies are so well-motivated, and definitely Anthropic and OpenAI culturally want to do good for the world."

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 1:26:50 • Jan 2026

OpenAI mentions software

"A lot of researchers at these companies are so well-motivated, and definitely Anthropic and OpenAI culturally want to do good for the world."

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 1:26:50 • Jan 2026

Slack mentions software

"You want to add a new tab in Slack that you want to use, and I think AI will be able to do that pretty well"

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 3:09:02 • Jan 2026

Spotify mentions software

"my wife the other day—she has a podcast for book discussions, a book club, and she was transferring the show notes from Spotify to YouTube, and then the links somehow broke."

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 1:33:10 • Jan 2026

Microsoft Word mentions software

"take something like Slack or Microsoft Word. I think if organizations allow it, AI could very easily implement features end-to-end"

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 3:08:45 • Jan 2026

Midjourney mentions software

"signing licensing deals with Black Forest Labs, which is an image generation company, or Midjourney"

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 3:43:42 • Jan 2026

Gamma mentions software

"Let's throw in Mistral AI, Gemma..."

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 29:01 • Jan 2026

Amazon Trainium mentions product

"Amazon is making Trainium"

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 4:02:30 • Jan 2026

YouTube mentions software

"my wife the other day—she has a podcast for book discussions, a book club, and she was transferring the show notes from Spotify to YouTube, and then the links somehow broke."

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 1:33:10 • Jan 2026

TikTok mentions software

"We see this with TikTok. You open it... I don't use TikTok, but supposedly in five minutes the algorithm gets you. It's locked in."

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 1:25:05 • Jan 2026

Substack mentions software

"For example, if you read a Substack article, I could maybe ask an LLM to give me opinions on that, but I wouldn't even know what to ask."

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 1:21:04 • Jan 2026

Groq mentions product

"We are starting to see some types of consolidation with Groq for $20 billion"

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 3:36:43 • Jan 2026

Perplexity mentions software

"I think there will be some other multi-billion dollar acquisitions, like Perplexity"

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 3:38:34 • Jan 2026

ChatGPT memory feature mentions software

"ChatGPT has a memory feature, right? And so you may have a subscription and you use it for personal stuff, but I don't know if you want to use that same thing at work."

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 10:02 • Jan 2026

Hugging Face mentions software

"I think when I was at Hugging Face, I was trying to get this to happen, but it was too early. It's like these open robotic models on Hugging Face"

From: State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 • ▶ 2:55:18 • Jan 2026

Sebastian Raschka

Created by Sebastian

Top picks

All products

Products & gear

Software & tools

Techniques & practices

Recent episodes

All recommendations

Sign in to follow creators