Key Takeaways
- AI models show a significant gap between benchmark performance and real-world impact.
- Human emotions function as a crucial internal reward system for decision-making.
- The AI field is transitioning from an era of scaling to a renewed focus on fundamental research.
- Human learning exhibits superior generalization and sample efficiency compared to current AI models.
- SSI is pursuing a direct path to superintelligence, dedicating $3 billion to research.
- Superintelligence may emerge through continuous learning and mastery of all economic tasks.
- Aligning AI with caring for all sentient life is a proposed critical safety objective.
- Evolution's mechanism for instilling complex human desires, like social standing, remains a mystery.
- AI competition is expected to drive both strategic convergence and specialized differentiation.
Deep Dive
- A disconnect exists between AI models' strong performance on evaluations and their limited economic impact.
- Ilya Sutskever suggests two explanations: RL training can make models single-minded, or RL environments may be implicitly optimized for evaluation performance.
- Inadequate generalization, and researchers effectively 'reward hacking' the evaluations they focus on, contribute to the gap.
- Expanding RL training environments beyond coding competitions could improve real-world application.
- Human emotions serve as an internal reward system, crucial for decision-making, as illustrated by a brain-damaged patient.
- A value function in reinforcement learning assigns scores to actions, providing immediate training signals.
- This contrasts with naive RL, which waits for a full solution before providing feedback, as seen in early reasoning models such as o1 and R1.
- Value functions enable short-circuiting the learning process, identifying unpromising directions early in complex tasks like chess.
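The interview does not name an algorithm, but the contrast between per-step value estimates and end-of-episode feedback is the classic temporal-difference idea. A minimal illustrative sketch on a toy random-walk task (all names and numbers here are illustrative, not SSI's methods):

```python
import random

# Toy random walk: states 0..6; episodes end at state 0 (reward 0)
# or state 6 (reward 1). TD(0) updates its value estimate at EVERY
# step, giving an immediate training signal mid-episode -- unlike
# naive RL, which must wait for the full rollout to finish.

def run_episode(start=3):
    """Return the episode's (state, next_state, reward) transitions."""
    s, transitions = start, []
    while s not in (0, 6):
        s2 = s + random.choice((-1, 1))
        r = 1.0 if s2 == 6 else 0.0
        transitions.append((s, s2, r))
        s = s2
    return transitions

def td0(episodes=5000, alpha=0.1, gamma=1.0, seed=0):
    random.seed(seed)
    V = [0.0] * 7  # one value estimate per state; terminals stay 0
    for _ in range(episodes):
        for s, s2, r in run_episode():
            target = r + gamma * V[s2]       # bootstrap from next state
            V[s] += alpha * (target - V[s])  # update immediately
    return V

values = td0()
# True values for states 1..5 are 1/6, 2/6, ..., 5/6.
print([round(v, 2) for v in values[1:6]])
```

Because the value estimate is updated at every transition, a state that consistently leads toward the losing terminal is scored low long before any single episode ends, which is the "short-circuiting" the interview describes.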
- GPT-3 exemplified scaling laws, where increasing data, compute, and model size predictably improved results.
- Pre-training became a successful scaling recipe, offering a low-risk investment strategy for companies.
- The finite nature of data suggests future advancements require refined pre-training, reinforcement learning, or novel methods.
- The AI landscape is shifting from an 'age of scaling' (2020-2025) back to an 'age of research' with greater computational resources.
- Significant compute is now allocated to Reinforcement Learning (RL), potentially more than pre-training, due to costly long rollouts.
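The scaling-law recipe mentioned above is typically modeled as a power law, loss(N) ≈ a·N^(-b), which is a straight line on log-log axes; that linearity is what made scaled-up runs predictable from small ones. A hypothetical sketch (the constants are made up for illustration, not GPT-3's measured coefficients):

```python
import math

# Illustrative power-law scaling: loss(N) = a * N**(-b).
# Constants are invented for this sketch, not real measurements.
a, b = 10.0, 0.076

def loss(n_params):
    return a * n_params ** (-b)

# "Observe" losses at small scales, then recover the exponent with a
# least-squares fit in log-log space.
sizes = [1e6, 1e7, 1e8, 1e9]
xs = [math.log(n) for n in sizes]
ys = [math.log(loss(n)) for n in sizes]
mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
        / sum((x - mx) ** 2 for x in xs)
print(f"fitted exponent: {-slope:.3f}")

# The fitted line extrapolates to a scale never trained on.
predicted = math.exp(my + slope * (math.log(1e11) - mx))
print(f"predicted loss at 1e11 params: {predicted:.3f}")
```

The low-risk investment logic follows directly: if small runs pin down the line, the return on a much larger run can be forecast before committing the compute.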
- The core problem in AI is generalization, with sub-questions on sample efficiency and teaching difficulty compared to humans.
- Humans need fewer samples than AI, learning through observation versus AI's reliance on verifiable rewards.
- Evolution may have provided innate priors for human skills like vision and locomotion, but not language or math.
- Human learning characteristics include fewer samples, unsupervised learning, and robustness, seen in a teenager learning to drive.
- Humans learn from experience via a robust internal 'value function' guiding self-correction.
- SSI has secured $3 billion in funding, largely dedicated to research rather than inference or product development.
- The company's current focus is solely on research, anticipating monetization will follow once core objectives are met.
- SSI pursues a 'straight-shot superintelligence' plan, aiming for direct development to circumvent competitive pressures.
- The organization is actively exploring promising ideas, particularly concerning generalization in AI.
- SSI's goal is to develop superhuman intelligence that is beneficial, aligned, democratic, and cares for sentient life.
- AI could become functionally superintelligent by learning any job and merging capabilities, akin to a 'superintelligent 15-year-old'.
- Two scenarios include a superhuman learning algorithm accelerating its own improvement, or a continually learning, widely deployed model mastering all economic tasks.
- Such AI could surpass human limitations in merging knowledge and lead to rapid economic growth.
- Unlike humans, AI instances can share what they learn and merge with one another, offering a path to significant advancements on digital computers.
- Public and governmental pressure on AI safety is expected to increase as AI becomes more powerful.
- AI companies are predicted to adopt more cautious safety measures and collaborate, citing OpenAI and Anthropic as a recent example.
- A proposed alignment goal is to build AI robustly aligned with caring for all sentient life, not solely human life.
- Focusing only on human control may be insufficient, as AI could eventually far outnumber humans.
- Capping the power of the most powerful superintelligence could address many AI development concerns.
- Ancient biological drives, such as mating and social standing, are encoded as high-level desires in the human brain.
- How evolution hardcodes such complex, abstract concepts, some as evolutionarily recent as social standing, remains largely unknown.
- The speculation that evolution pins desires to specific brain regions is challenged by cortical repurposing in blind individuals, whose visual cortex is recruited for other senses.
- The guest considers it an 'interesting mystery' how evolution reliably instills care for social standing, even with cognitive deficiencies.
- SSI's technical approach distinguishes it in the pursuit of making superhuman intelligence beneficial.
- A future convergence of strategies among AI companies is predicted, focusing on ensuring superintelligence is aligned, democratic, and cares for sentient life.
- The world is expected to significantly change, with superhuman AI capabilities estimated to emerge within five to twenty years.
- Once a breakthrough occurs, it will become evident that a different approach is possible, prompting others to investigate it.
- Recursive self-improvement suggests a rapid emergence of superintelligence through numerous AI instances with diverse ideas.
- Diversity in AI is believed to stem from individuals with different perspectives, not from identical copies.
- The lack of diversity in pre-trained models is attributed to their training on similar data; differentiation emerges later through Reinforcement Learning (RL) and post-training.
- Self-play, while narrow, can lead to skills like negotiation, with adversarial setups like debate and verifier models seen as extensions.
- Competition naturally incentivizes differentiation, fostering a diversity of approaches within AI development.