
World's Top Researcher on AI, LLMs, and Robot Intelligence

17 products mentioned
Tags: robotics, artificial intelligence, foundation models, embodied ai, physical intelligence, robot learning, general-purpose systems

Sergey Levine, co-founder of Physical Intelligence, explains the company's bet on building general-purpose robotic foundation models that can control any physical robot to perform any task, rather than narrow, single-purpose machines. In contrast to the Boston Dynamics approach of spectacular demos, Levine argues that true progress requires building systems that generalize across embodiments and tasks, learning from diverse data sources the way language models learn from internet text. The conversation covers why this "harder" approach may actually be easier in the long term, the role of multimodal LLMs in giving robots common sense, and what it would take for robotics to experience the kind of Cambrian explosion that personal computers sparked in computing.

Key takeaways
  • General-purpose foundation models are more efficient than specialized single-task robots because they leverage data from many sources and learn transferable physical understanding, similar to how language models outperformed domain-specific translation and sentiment-analysis systems.
  • Robots can now be coached and improved through language alone—if a robot fails at a task, you can label its experience with semantic commands (e.g., "pick up the plate") rather than collecting new teleoperation data, shifting the bottleneck from low-level motor control to mid-level reasoning.
  • Moravec's paradox explains why intuitive physical tasks (picking up objects, changing diapers) are harder for robots than math problems: humans are evolutionarily primed for physical interaction, so we underestimate the engineering challenge. Machine learning inverts this when training data is available.
  • The key innovation isn't hardware or demos—it's generality of improvement, meaning systems that can be enhanced autonomously from their own experience rather than requiring manual engineering, which unlocks rapid iteration and adaptation to new embodiments and tasks.
  • Companies preparing for robotics should focus on understanding what kind of data is needed (not just collecting any video) and whether their deployment model relies more on teleoperation demonstrations or autonomous learning—the technology roadmap looks very different depending on this choice.
  • Multimodal LLMs don't directly control robots, but they enable robots to reason about novel situations using web-scale knowledge when plugged in via "chain of thought" prompts, solving the long-standing problem of how robots handle edge cases and uncommon scenarios.

Recommendations (1)

Atlas

"I do really like the Boston Dynamics robot, the the new especially the new version of the Atlas because it is in some ways very humanlike and in some ways very not humanlike"

Sergey Levine · ▶ 58:37

Mentioned (16)

Perplexity "OpenAI, Cursor, Anthropic, Perplexity, and Vercel all have something in common. They all use wo..." ▶ 20:02
Vercel "OpenAI, Cursor, Anthropic, Perplexity, and Vercel all have something in common. They all use wo..." ▶ 20:02
Shopify "it's no wonder that Shopify runs on Ramp, Stripe runs on Ramp, and my business does, too." ▶ 19:07
Stripe "it's no wonder that Shopify runs on Ramp, Stripe runs on Ramp, and my business does, too." ▶ 19:07
OpenAI "OpenAI, Cursor, Anthropic, Perplexity, and Vercel all have something in common. They all use wo..." ▶ 20:02
Cursor "OpenAI, Cursor, Anthropic, Perplexity, and Vercel all have something in common. They all use wo..." ▶ 20:02
Anthropic "OpenAI, Cursor, Anthropic, Perplexity, and Vercel all have something in common. They all use wo..." ▶ 20:02
Snowflake "There's a reason that Ramp, Cursor, and Snowflake all use Vanta." ▶ 42:47
Tesla "Tesla doesn't worry about how much data their cars can collect, right? If anything, it's the othe..." ▶ 22:00
ALVINN "ALVINN was I think it's 1986 or 87 and that was a driving system that was demonstrated to drive on..." ▶ 10:00
PR2 "when I started working in robotics about a decade ago, I worked with a robot called a PR2, which ..." ▶ 1:01:22
Roomba "Roomba is like the bestselling robot of all time in in the consumer category which is kind of sur..." ▶ 59:49
Boston Dynamics "I am actually quite inspired by Boston Dynamics. There is a lot of value in repeatedly showing so..." ▶ 1:08:25
ChatGPT "ChatGPT was basically John Schulman's pet experiment for a while, it wasn't a concerted corporate..." ▶ 1:09:21
Nvidia "When I was in college, I got an internship at Nvidia that really got me to experience some cool s..." ▶ 1:12:32