Applications for the 2025 AI Engineer Summit are up, and you can

Latent Space: The AI Engineer Podcast

Latent.Space 2024 Year in Review

Overview

Content

Podcast Context and Milestone

AI Engineering Trends and Industry Observations

Research and Production Insights

Conference Observations

Scaling and AI Development Debate

AI Model Landscape and Market Trends

- Gemini launching a price war with Gemini Flash (essentially free for personal use)
- Claude 3's market entry and aggressive growth

Small Model Developments

Open Source AI Models

Business and Economic Insights

Agents and AI Progress

- Web browsing
- Code interpreting
- Memory/planning capabilities
- Morph Labs' "time travel VM" for stateful code execution
- Ability to fork and explore different execution paths
- Need for more sophisticated code execution approaches
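To make the fork-and-explore idea above concrete, here is a minimal toy sketch in Python. It is not Morph Labs' API or implementation: the `Session` class, its `run` and `fork` methods, and the `variables` attribute are hypothetical names invented for illustration, and the sketch only snapshots plain Python variables rather than a whole virtual machine.

```python
import copy


class Session:
    """Hypothetical toy executor: variables persist across run() calls."""

    def __init__(self, variables=None):
        # State that survives between code snippets.
        self.variables = variables if variables is not None else {}

    def run(self, code: str) -> "Session":
        # Execute the snippet against the persistent variables.
        exec(code, {}, self.variables)
        return self

    def fork(self) -> "Session":
        # Deep-copy the state so each branch can diverge independently.
        return Session(copy.deepcopy(self.variables))


base = Session().run("items = [1, 2, 3]")
branch_a = base.fork().run("items.append(4)")     # explore one path
branch_b = base.fork().run("items = items[:1]")   # explore another path
```

Each branch starts from the same snapshot, and changes made in one branch do not leak into the other or back into `base`. A real system would presumably capture far more state (processes, files, memory) than this sketch does.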

AI Research and Development Highlights

Reinforcement Learning and Language Models

Scaling and Evolutionary Insights

NeurIPS Highlights and Research Trends

Synthetic Data and Reasoning

Intellectual Property and Legal Challenges

- New York Times
- Stack Overflow
- Reddit
- Getty
- Various artists and writers
- Scarlett Johansson

GPU Landscape and Trends

On-Device AI and Multimodal Developments

- Suno: Grew from $0 to $20 million ARR, runs training on Modal
- Bolt: Announced $20 million ARR

Multimodality War Observations

- ElevenLabs (unicorn status)
- Pika Labs (launched Pika 2.0)

Video and AI Research

AI Project Growth and Trends

- Devin: Took 9 months to reach general availability, but is showing improvement
- AutoGPT: High interest due to its promise of generality, but challenges in execution
- Most AI projects focus on specific, narrow use cases (code completion, PR reviews)

Memory and AI Systems

- Mostly do explicit summarization
- Lack implicit preference extraction
- Do not truly capture nuanced user interactions
- Knowledge: Information about the world (external/internal)
- Memory: Personal interaction history over time
- Requires time-based decay and review functions
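One way to picture the "time-based decay and review functions" mentioned above is the following minimal Python sketch. It is an illustration only, not a method described in the episode: the `MemoryItem` class, the `decayed_score`, `review`, and `top_memories` functions, the exponential decay formula, and the one-week half-life are all assumptions made for the example.

```python
from dataclasses import dataclass

WEEK_S = 7 * 24 * 3600  # assumed half-life: one week, in seconds


@dataclass
class MemoryItem:
    text: str
    last_reviewed: float      # timestamp in seconds
    importance: float = 1.0   # assumed base weight


def decayed_score(item: MemoryItem, now: float, half_life_s: float = WEEK_S) -> float:
    """Exponential decay: the score halves every half_life_s since last review."""
    age = max(0.0, now - item.last_reviewed)
    return item.importance * 0.5 ** (age / half_life_s)


def review(item: MemoryItem, now: float) -> None:
    """Reviewing (re-using) a memory resets its decay clock."""
    item.last_reviewed = now


def top_memories(items: list[MemoryItem], now: float, k: int = 3) -> list[MemoryItem]:
    """Return the k memories with the highest decayed score."""
    return sorted(items, key=lambda m: decayed_score(m, now), reverse=True)[:k]
```

Under these assumptions, a memory that has not been touched for one half-life counts half as much as a fresh one, and calling `review` on it restores its full weight.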

Benchmarks Evolution

- SWE-bench
- LiveBench
- MMLU-Pro
- AIME

AI Capabilities Categorization

- General knowledge
- Long context (100-200K tokens)
- Retrieval-augmented generation (RAG)
- Batch transcription
- Code generation
- Tool use
- Vision language models
- PDF parsing
- Real-time transcription
- Improving diarization capabilities
- Potential use of Gemini 2.0 Flash for transcription
- Interesting but not yet ready for broad usage
- Challenges in daily application
- Potential for long inference and real-time API voice modes
- On-device models still developing
- Base models are underrated
- Potential release of GPT-3 base model
- Interest in state space models and RWKV
- Cartesia emerging as a competitor to ElevenLabs

Inference and Pricing Trends

- o1-preview
- GPT-4o
- o1-mini
- Gemini Flash

Key AI Releases and Developments in 2023-2024

- Q*/Strawberry (September)
- o1 variants (o1 pro and full o1)
- Voice Mode
- Canvas (document editing environment in October)
- Chrome search extension
- Document editing (challenging Google Docs)
- Search capabilities
- Workflow integration

OpenAI and Personnel Changes

Future of Work and AI
