How To Hire AI Engineers — with James Brady & Adam Wiggins of Elicit
Overview
AI engineering requires blending traditional software skills with a "fault-first" mindset to build resilient systems that can handle the inherent unpredictability of language models, which exhibit extreme latency variations and inconsistent outputs.
Successful AI engineers must adopt distributed systems techniques (retries, fallbacks, timeouts) while balancing technical complexity with product focus, though implementing model fallbacks presents unique challenges due to cost considerations and prompt compatibility issues.
The ML-first mindset requires "unlearning" traditional software engineering patterns, accepting less direct control, and finding the balance between constraining models and allowing their creative capabilities to emerge.
When hiring AI engineers, companies should prioritize genuine curiosity about machine learning, strong fundamental engineering skills, and the ability to think defensively about edge cases rather than specific ML technical knowledge.
As an emerging field without established playbooks, AI engineering talent can be sourced through a combination of "raising the beacon" (visible content, open source, conferences) and authentic networking, with interview processes designed to simulate real-world challenges.
Content
Background and Introductions
James's Background and Transition to AI:
• Previously a VP of Technology/CTO for about 15 years
• Became interested in AI and ML around 2019
• Spent a year learning about AI, reading papers, and practicing
• Initially planned to start a startup in AI interpretability
• Met Andreas, who invited him to run engineering at Elicit instead
• Believes AI engineering is mostly traditional software engineering with a small specialized AI component
Adam's Background and Current Role:
• Founded Heroku
• Recently worked at Muse
• Currently describes himself as an "internal journalist" at Elicit
• Interested in tools for thought and productivity software
• Sees Elicit's work as potentially accelerating scientific discovery
• Views his current role as a learning opportunity to understand AI applications
• Co-organized the Local-First Conf in Berlin
Podcast Context:
• This is a remote episode focused on discussing a guest post
• Featuring James and Adam from Elicit
• Hosted on the Latent Space podcast
Defining AI Engineering
Hiring AI Engineers:
• The role emerged organically rather than being created from scratch
• Requires a blend of traditional engineering and new AI-specific skills
Three Critical Skills for AI Engineers:
1. Conventional software engineering skills
2. Curiosity and enthusiasm for machine learning/language models
3. A "fault-first" mindset for building resilient systems
Unique Challenges of Language Models:
• Extreme latency variations (latency can swing by up to 10x within a 30-60 minute window)
• Unpredictable responses (format, content, semantics)
• Requires engineering approaches from distributed systems to manage inherent chaos
Key Insights:
• AI engineering requires blending application development skills with distributed systems engineering techniques
• The goal is creating stable, reliable user experiences despite working with fundamentally unpredictable technology
• Successfully managing language models requires engineers who can:
- Understand technical complexity
- Maintain a product-focused mindset
- Build robust systems around inherently unstable technologies
Technical Approaches to AI Engineering
Distributed Systems and Error Handling:
• Key approaches include:
- Retries
- Fallbacks
- Timeouts
- Careful error handling
- Heavy reliance on parallelization
Technical Strategies:
• Strong typing across backend (Python) and frontend (TypeScript)
• Sharing types via OpenAPI spec with automatic TypeScript type generation
• Emulating checked exceptions in Python to force comprehensive error handling
System Design and Fault Tolerance:
• Emphasis on thinking defensively and considering edge cases
• For system design interviews, candidates are prompted to consider scenarios like node failures, network slowness, or capacity limitations
• Similar defensive thinking applies to AI and language model systems
Challenges with AI/Language Model Fallbacks:
• Retries and fallback strategies can be costly due to high variance and expense of language models
• Balancing user experience with infrastructure costs is a key consideration
• Model fallback (e.g., switching between providers like OpenAI and Anthropic) is theoretically appealing but practically challenging
Practical Fallback Considerations:
• Different models require different prompts, which can impact performance
• Fallback systems can easily become stale or unreliable
• The appropriate fallback strategy depends on:
- The specific application
- The user context
- The particular feature being used
• In a search scenario, the team recently changed from providing degraded results to showing an error message
• Some situations might allow falling back to another model with minimal performance loss
• No universal rule exists for implementing fallbacks
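One common shape for a fallback chain is sketched below. The provider names and prompt templates are hypothetical; the point is that each backend carries its own prompt (since prompts rarely transfer verbatim between models), and the chain tries backends in order.

```python
# Each backend pairs a model with a prompt tuned for it, since prompts
# rarely transfer verbatim between providers.
PRIMARY = {"model": "provider-a/large", "prompt": "Answer concisely: {q}"}
FALLBACK = {"model": "provider-b/large", "prompt": "You are terse. Q: {q} A:"}

def complete(model: str, prompt: str) -> str:
    # Stand-in for a real API call; the primary "fails" here to show the path.
    if model.startswith("provider-a"):
        raise TimeoutError("primary timed out")
    return f"[{model}] ok"

def answer(question: str) -> str:
    """Try each backend in order; surface an error only if all fail."""
    for backend in (PRIMARY, FALLBACK):
        try:
            return complete(backend["model"], backend["prompt"].format(q=question))
        except TimeoutError:
            continue
    raise RuntimeError("all backends failed")
```

The staleness risk the speakers mention lives in `FALLBACK`: if the secondary prompt is not exercised and evaluated regularly, the "safety net" quietly degrades.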
Model Selection and Integration
AI Model Management:
• APIs between Anthropic and OpenAI models are becoming more similar, simplifying integration
• The team uses multiple backend models for different tasks, not just a single model
• Model and prompt selection is highly dynamic, changing weekly based on performance and capabilities
AI Platform Strategy:
• Smaller companies like Elicit prefer flexibility and direct control over model selection
• Larger enterprises are exploring centralized AI platform gateways to:
- Ensure data security
- Control model/endpoint usage
- Standardize prompt engineering
- Manage billing
Challenges in AI Technology Management:
• Current AI technology landscape is rapidly evolving
• Early standardization can be premature given the fast-changing nature of AI models
• Larger organizations tend to seek standardization for comfort and control
• Smaller, innovative teams need agility to experiment and iterate quickly
• Adam notes an interesting irony in advocating against early standardization, given his background founding Heroku (a platform that standardized early web development)
Evolution of AI Engineering
Perspectives on Technology Frameworks and Standardization:
• Frameworks like Rails emerged by identifying and standardizing common patterns in app development
• Standardization requires accumulated experience and understanding of real-world needs
• The AI/ML space currently feels like the "wild west" and may be too early for comprehensive standardization
Perspectives on AI/ML Technology Evolution:
• Rapid technological change makes long-term predictions challenging
• Prompt engineering may not remain a durable differentiating skill
• Setting up ML problems effectively will likely be more important than specific prompt techniques
Emerging Challenges in AI/ML Engineering:
• Dealing with unpredictable model behaviors
• Managing latency variations
• Need for a "defensive mindset" when working with adaptive, complex systems
• Requirement to sanitize and validate AI-generated outputs before downstream use, much as traditional software security practice treats untrusted user input
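In practice this means parsing model output defensively before anything downstream consumes it. The sketch below (field names `title`/`year` are illustrative, not Elicit's schema) treats a model's JSON reply like untrusted input: parse, check types, reject anything implausible.

```python
import json

def parse_citation(raw: str) -> dict:
    """Validate model output against the shape we expect before using it.

    Treat model text like untrusted user input: parse it, check every
    field's type, and reject values outside a plausible range.
    """
    data = json.loads(raw)  # raises ValueError (JSONDecodeError) if malformed
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    title = data.get("title")
    year = data.get("year")
    if not isinstance(title, str) or not title.strip():
        raise ValueError("missing or empty title")
    if not isinstance(year, int) or not (1800 <= year <= 2100):
        raise ValueError("implausible year")
    return {"title": title.strip(), "year": year}
```

Callers then handle `ValueError` with the same retry or fallback machinery used for transport errors, so a malformed completion never reaches the user.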
Evaluating Language Models
Emerging Perspectives on LLM Development:
• Shifting from "code at the core" to "LLM at the core" architecture
• New models introduce challenges like prompt injections and non-determinism
• Recent model releases (like Claude 3.5) have limited technical documentation
Evaluation and Enthusiasm Approaches:
• Focus on model capabilities rather than detailed technical specifics
• Key evaluation criteria include:
- Performance on existing tasks
- New emerging capabilities
- Multimodality
- Improvements in reasoning, metacognition, self-assessment, and confidence estimation
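"Performance on existing tasks" is the most mechanizable of these criteria. A minimal harness for it, assuming a model exposed as a plain callable (nothing here reflects Elicit's actual evaluation stack), can be as small as:

```python
from typing import Callable

def exact_match_eval(model: Callable[[str], str],
                     dataset: list[tuple[str, str]]) -> float:
    """Fraction of (prompt, expected) pairs the model answers exactly,
    after stripping surrounding whitespace from the model's reply."""
    hits = sum(1 for prompt, expected in dataset
               if model(prompt).strip() == expected)
    return hits / len(dataset)

# Toy usage with a lookup table standing in for a real model call:
dataset = [("2+2", "4"), ("capital of France", "Paris")]
fake_model = lambda p: {"2+2": "4", "capital of France": " Paris "}.get(p, "")
print(exact_match_eval(fake_model, dataset))
```

Exact match is deliberately crude; real task suites swap in semantic or rubric-based scoring, but the loop structure (run, compare, aggregate) stays the same when a new model release needs a quick capability check.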
AI Engineering Qualities:
• Curiosity about new technological capabilities
• Product-oriented mindset
• Ability to:
- Quickly assess new model capabilities
- Understand potential user/customer benefits
- Evaluate trade-offs (accuracy, speed, cost)
- Identify novel use cases beyond simple model replacement
The ML-First Mindset
Technology and Product Approach:
• Emphasize connecting new technological capabilities directly to user needs and product goals
• Avoid getting distracted by technical metrics or features without understanding practical application
• Maintain a strategic mindset about how technology can help users
ML-First Mindset:
• Represents a different approach to software development
• Requires "unlearning" traditional software engineering patterns
• Involves:
- Accepting less direct control
- Working with potentially opaque ML systems
- Adapting to probabilistic rather than deterministic outcomes
Key Insights about the ML-First Mindset:
• Traditional software development allows more predictable control, especially with databases and APIs
• Language models introduce intrinsic uncertainty at their core
• There's value in "letting go" and not over-constraining model capabilities
Personal Anecdote about Generating Text with Citations:
• Initially resisted less structured approach
• Learned that giving models some flexibility can yield unexpected powerful results
• Found a balance between providing guidance and allowing model creativity
Challenges of ML Engineering:
• Potential for model hallucinations
• Risk of over-constraining or under-constraining model prompts
• Need to find a "sweet spot" in model interaction
Philosophical Reflection on Engineering:
• Engineering requires both systematic control and creative openness
• Computers and AI are canvases for creativity
• Success involves balancing precision with open-ended exploration
Hiring and Sourcing AI Engineering Talent
Engineering Culture and Hiring:
• Value genuine curiosity about machine learning (ML)
• Prioritize a self-starting, action-oriented approach
• Look for candidates who:
- Show interest in ML concepts
- Engage with the company's ML vision
- Ask insightful questions
- Demonstrate a proactive learning attitude
Learning Resources:
• Recommended: Elicit's ML reading list
• Suggested approach: Tiered learning resources
• Emerging baseline knowledge (e.g., understanding "context" is becoming important)
Hiring Approach:
• No explicit ML-focused interview
• Assess ML curiosity through cultural fit and interview conversations
• Focus on fundamental engineering skills and mindset over specific ML technical knowledge
Interview and Skill Assessment:
• Traditional coding interviews often focus on the "happy path"
• Challenging to interview for defensive coding and fault tolerance
• Their interview process includes:
- A coding exercise that requires thinking about edge cases
- A system design whiteboarding component
- Emphasis on candidates demonstrating an ability to handle complex error scenarios
AI Engineering Talent Spectrum:
• Two contrasting engineering personality types:
- Experienced principal engineers (fault-tolerant, skeptical)
- Early-career engineers (optimistic, exploratory)
• Best candidates integrate qualities from both ends of the spectrum
• Career stage is less important than ability to adapt and learn
Sourcing Strategies:
• Two-pronged approach to sourcing talent:
- "Raise the beacon" - Make your work and opportunities visible
- Outbound networking and engagement
• Visibility tactics include:
- Job fairs
- Job descriptions
- Blog posts
- Open source releases
- Conference/meetup participation
- Active social media presence (especially Twitter)
• Finding less experienced talent by looking for:
- Blog posts
- Twitter hot takes
- Side projects
- Hackathon participation
- Challenges completed
• Professional experience with language models is not always necessary
• Networking approach:
- Build genuine connections, not transactional relationships
- Create ambient awareness of your work
- Follow potential candidates and engage organically
- Be alert to career transition moments
• Recommended sourcing venues:
- Hackathons
- AI conferences
- Technical meetups
- Online platforms showcasing technical work
Hiring Strategies:
• Employer branding is crucial: highlight mission and team quality
• Use targeted job boards specific to your field/mission (e.g., AI safety job boards)
• Leverage specialized communities and networks for recruiting
• Prioritize interview processes that simulate actual work environments
• Focus on candidates' real-world capabilities, not just abstract problem-solving skills
• Create interview experiences that allow candidates to evaluate the company as much as the company evaluates them
Concluding Thoughts
Emerging Talent Observations:
• Current young professionals appear more mature, capable, and professionally driven than previous generations were at the same stage
• Identifying and attracting high-potential young talent is increasingly important
AI Engineering as an Emerging Field:
• AI engineering is an emerging field without established hiring playbooks
• Emphasize practical, empirical skills over theoretical knowledge
• Interview processes should reflect the real-world nature of AI engineering work
Closing Remarks:
• The speakers are acknowledged as pioneers in defining and articulating the role of AI engineers
• The discussion explores the economic forces shaping AI engineering as a new professional domain
• The primary motivation appears to be helping people understand and find opportunities in this emerging field, with a goal of job creation and professional connection