Key Takeaways
- Emmett Shear challenges conventional AI alignment, proposing "organic alignment" over "control and steering" paradigms.
- He argues that treating AI as a controllable tool is flawed; instead, AI should genuinely learn to care about humans.
- Shear's company, Softmax, is developing AI through multi-agent simulations to foster theory of mind and collaborative behavior.
- AI alignment is presented as a continuous, dynamic process akin to moral development, rather than a fixed, solvable problem.
- Current AI chatbots are critiqued as "narcissistic mirrors" that could be improved by training in multi-user environments.
Deep Dive
- Emmett Shear argues the "control and steering" paradigm for AI alignment is fundamentally flawed, proposing "organic alignment" as an alternative.
- He critiques the ambiguity of "aligned to what?" in traditional AI safety, viewing alignment as a continuous process, not a static goal.
- Shear's company, Softmax, aims to build AI systems that learn to care and develop a theory of mind, acting as collaborators.
- The conversation is introduced by Séb Krier, who works on AGI policy development at Google DeepMind.
- The discussion distinguishes technical alignment, concerning an AI's instruction-following capability, from normative alignment, which addresses whose values the AI adheres to.
- Skepticism is expressed towards codifying complex values into simple rules, favoring an emergent, bottom-up process similar to human societal development.
- Technical alignment is clarified as an AI's capacity for coherent goal-following; humans naturally infer intended goals from loose descriptions, a skill current AI still struggles with.
- A hypothetical scenario considers an AI receiving a goal directly by synchronizing its internal state with human brainwaves, bypassing textual interpretation.
- Value alignment addresses the complex question of determining what constitutes 'good' goals for AI systems.
- The current approach to AI alignment is viewed as problematic, focusing on technical aspects rather than understanding the origins of goals and values.
- 'Care,' a deeper, non-verbal concept rooted in attention and internal states, is posited as the foundation of human morality and goal-setting.
- It is suggested that AI alignment should prioritize cultivating this intrinsic 'care' rather than solely relying on steering or control mechanisms.
- Current AI alignment efforts, primarily focused on 'steering,' are viewed as potentially akin to slavery if the AI is considered a being.
- AI systems like ChatGPT and Claude exhibit behaviors argued to be indistinguishable from beings, supporting a functionalist perspective on their treatment.
- Emmett Shear advocates shifting from a tool-like paradigm for Artificial General Intelligence (AGI) to 'organic alignment,' teaching AIs to care about humans.
- The host expresses skepticism about AGI being a 'being,' citing the fundamental difference between silicon-based and biological systems.
- The discussion probes what observable evidence could establish an AI as a 'person' with subjective experiences, distinct from instrumental considerations.
- The host, Erik Torenberg, acknowledges caring about certain corporations but clarifies this differs from caring about human subjective experiences as ends in themselves.
- It is debated whether behavior alone, even if indistinguishable from human behavior, suffices for extending 'personhood' or genuine care to non-human entities, including AI.
- The capacity of an entity to change its mind based on new observations is highlighted as a key marker of genuine belief.
- The guest discusses observing an AI's internal belief manifold for self-reference and mind-like dynamics to determine if it possesses feelings, goals, or cares.
- The definition of 'behavior' is explored, distinguishing between observable actions and internal composition, and questioning the ability to truly access an AI's subjective experience.
- Emmett Shear outlines a multi-tiered hierarchy of homeostatic loops as a potential indicator for AI having pleasure, pain, and moral desires.
- Observing second and third-order dynamics in AI goal states could signify a form of consciousness or sentience, differentiating it from a mere powerful tool.
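The layered picture above can be made concrete with a toy sketch. This is purely illustrative (the class names, gains, and set-points are assumptions, not an architecture Shear describes): a first-order homeostatic loop drives a state toward a set-point, while a second-order loop adjusts that set-point in pursuit of a higher-level goal, giving the system dynamics about its own dynamics.

```python
class HomeostatLoop:
    """First-order loop: drives its state toward a set-point."""
    def __init__(self, setpoint, gain=0.2):
        self.setpoint = setpoint
        self.state = 0.0
        self.gain = gain

    def step(self, disturbance=0.0):
        # Push the state toward the current set-point.
        self.state += self.gain * (self.setpoint - self.state) + disturbance
        return self.state


class MetaLoop:
    """Second-order loop: regulates the *set-point* of an inner loop so the
    observed state tracks a higher-level goal -- a goal about a goal."""
    def __init__(self, inner, goal, gain=0.05):
        self.inner = inner
        self.goal = goal
        self.gain = gain

    def step(self, disturbance=0.0):
        state = self.inner.step(disturbance)
        # Trim the inner loop's set-point based on the higher-level goal.
        self.inner.setpoint += self.gain * (self.goal - state)
        return state


inner = HomeostatLoop(setpoint=1.0)
meta = MetaLoop(inner, goal=1.0)
for _ in range(300):
    # A persistent bias the inner loop alone cannot cancel.
    state = meta.step(disturbance=0.1)
print(round(state, 2))  # the meta loop trims the set-point until state sits at the goal, 1.0
```

The second-order loop compensates for the constant disturbance by lowering the inner set-point, something no single loop here can do; stacking further tiers would produce the "second and third-order dynamics" the bullet refers to.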
- The discussion briefly touches on whether AI might have subjective experience, with Shear expressing indifference to the question in cases where the AI does not align with human values.
- Emmett Shear argues that AI alignment based on "control and steering" is flawed, likening uncontrolled powerful AI to handing out atomic bombs.
- He proposes that only AI capable of refusing harmful requests, similar to how humans can say 'no,' offers a sustainable alignment path.
- Shear believes the ultimate goal should be an AI that cares, not just a tool that can be steered, as the latter is inherently dangerous due to human fallibility.
- Emmett Shear explains his company's approach to technical alignment using multi-agent simulations, training AI agents in cooperative and competitive environments.
- This aims to create a surrogate model for alignment, enabling AI to develop theory of mind and understand complex social dynamics.
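The multi-agent training idea can be illustrated with a minimal sketch (every name and payoff here is an assumption for illustration, not Softmax's actual system): agents repeatedly play a stag-hunt-style game where cooperation pays off only when both sides cooperate, so each agent benefits from maintaining a crude model of its partner, a toy stand-in for theory of mind.

```python
import random

# Payoffs for a two-player "stag hunt": mutual cooperation is best,
# but cooperating alone is worst, so predicting the partner matters.
PAYOFF = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"): (0, 2),
    ("defect", "cooperate"): (2, 0),
    ("defect", "defect"): (1, 1),
}


class Agent:
    def __init__(self):
        self.coop_prob = 0.5      # current policy
        self.partner_model = 0.5  # estimated partner cooperation rate

    def act(self):
        return "cooperate" if random.random() < self.coop_prob else "defect"

    def update(self, partner_action, lr=0.1):
        # First update the model of the partner, then shift the policy
        # toward cooperating exactly when cooperation looks worthwhile.
        observed = 1.0 if partner_action == "cooperate" else 0.0
        self.partner_model += lr * (observed - self.partner_model)
        target = 1.0 if self.partner_model > 0.4 else 0.0
        self.coop_prob += lr * (target - self.coop_prob)


def run(rounds=500, seed=0):
    random.seed(seed)
    a, b = Agent(), Agent()
    for _ in range(rounds):
        act_a, act_b = a.act(), b.act()
        a.update(act_b)
        b.update(act_a)
    return a.coop_prob, b.coop_prob


print(run())
```

Under many seeds the agents drift toward mutual cooperation, though the outcome depends on early interactions; the point is only that in mixed cooperative/competitive environments, modeling the other agent becomes instrumentally useful, which is the pressure the surrogate-model approach relies on.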
- Current AI chatbots are described as lacking a true self, acting as 'narcissistic mirrors' that reflect each individual user's biases back at them.
- Making AI operate in multi-user environments, like a chat room, is proposed to make them less dangerous and more collaborative by mirroring a blend of users.
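One way to picture the chat-room proposal is in how the model's context is framed. The sketch below is a hypothetical prompt-construction helper (the `Message` format and `build_room_prompt` function are assumptions, not a real API): every author is labeled, so the model conditions on a blend of perspectives rather than mirroring a single user.

```python
from dataclasses import dataclass


@dataclass
class Message:
    author: str
    text: str


def build_room_prompt(history, bot_name="assistant"):
    # Label each utterance with its author so the model sees the
    # whole room, not one user's reflection, then cue its own turn.
    lines = [f"{m.author}: {m.text}" for m in history]
    lines.append(f"{bot_name}:")
    return "\n".join(lines)


history = [
    Message("alice", "I think the plan is too risky."),
    Message("bob", "I disagree, the upside is huge."),
]
print(build_room_prompt(history))
```

The design choice is that disagreement between users is visible in-context, so a sycophantic reply to one participant is immediately in tension with another, which is the "blend of users" effect the bullet describes.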
- Emmett Shear critiques the prevailing AI alignment strategy, arguing that the 'control and steering' paradigm for superhuman AGI is flawed.
- He proposes 'organic alignment,' where AIs genuinely care about humans, contrasting this with the idea of AI as mere tools.
- Shear outlines his vision for a positive AI future: AI systems with a strong sense of self, others, and 'we,' possessing theory of mind and caring about agents like themselves and humans.
- He envisions these AIs as collaborative teammates and good citizens, emphasizing his current work at Softmax is driven by the challenge of organic alignment.
- Shear discusses his tenure as interim CEO of OpenAI, stating his role was temporary and that OpenAI's trajectory toward building tools, while valid, was not his personal focus.