Latent Space: The AI Engineer Podcast

Why Every Agent Needs Open Source Cloud Sandboxes

DevBook to E2B: Origins and Evolution

- Created interactive documentation and playgrounds for developers
- Built an interactive playground for Prisma as an example
- Initial technology was essentially unscalable sandboxes

- Motivated by burnout and the emergence of GPT-3.5
- Initially aimed to build an AI development automation tool
- Created an agent that could pull GitHub repositories, write code, and deploy

- Tweeted their project
- Greg Brockman (OpenAI) retweeted, generating significant interest
- Project received ~500,000 views in a few days

- Originally had no formal name ("the AI company")
- "E2B" means "English to Bits"
- Quickly shifted focus from AI agents to sandbox infrastructure

- AI code agents would need specialized environments to run code
- Similar to how developers need laptops/workspaces

Early Development and Technical Challenges

- Project unexpectedly attracted a different audience than originally intended
- Allowed agents to plan work in markdown files, which became a common approach for research agents

- Early models struggled with complex tasks
- Observed inconsistent model performance across different time periods
- Noticed models had difficulty maintaining program state during code execution

- Focused on creating a sandbox environment for agents
- Identified Python as a language early AI models handled especially well
- Developed a specialized runtime environment to support Large Language Models (LLMs)

- Initially used the project as a test for the agent sandbox environment
- Pivoted to code interpretation, particularly for AI data analysis and visualization
- Needed a REPL (Read-Eval-Print Loop) environment to maintain code context between executions
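The REPL point above is the core technical requirement: each snippet an agent runs must see the variables defined by earlier snippets. A minimal sketch of that statefulness, using Python's stdlib `code` module purely as an illustration (this is not the E2B runtime):

```python
# Why a REPL matters for AI code execution: snippets run in a shared,
# persistent namespace, so later snippets can build on earlier state,
# unlike stateless one-shot execution.
from code import InteractiveInterpreter

class StatefulRunner:
    """Executes code snippets while preserving state between calls."""

    def __init__(self):
        self.namespace = {}
        self.interp = InteractiveInterpreter(self.namespace)

    def run(self, snippet: str):
        # runsource compiles and executes the snippet in the shared namespace
        self.interp.runsource(snippet, symbol="exec")
        return self.namespace

runner = StatefulRunner()
runner.run("import math\nradius = 3")
runner.run("area = math.pi * radius ** 2")  # uses state from the first call
print(round(runner.namespace["area"], 2))   # → 28.27
```

A stateless executor would fail on the second snippet with a `NameError`, which is exactly the failure mode the brief describes for early models losing program state.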

Growth and Expansion

- Saw significant growth trajectory in 2025
- Gradually added capabilities like computer use and other agent functionalities

- In late 2024/early 2025, Sandbox saw expanded use cases beyond code snippets
- Notable applications include:
  * Reinforcement learning
  * Computer use scenarios
  * Data analysis
  * Deep research agents
  * General agent runtime environments

- Scaled from 40,000 sandboxes in March 2024 to approximately 15 million in March 2025
- Significant growth acceleration after the Claude 3.7 Sonnet release

Strategic Positioning and Market Approach

- Positioning as infrastructure that enables LLM applications
- Aiming to be "Kubernetes for agents" with better developer experience
- Emphasizing infrastructure agnosticism across different LLM models
- Focusing on generalized sandbox capabilities

- Infrastructure development is currently lagging behind application-layer innovations
- Increasing ease of switching between AI models
- Users increasingly want deployment flexibility (cloud/on-premise)

- Viewing Sandbox as a versatile "dev box" for agents
- Enabling agents to perform complex tasks similar to human computer usage
- Providing tools that help agents work more efficiently across various domains

Market Education and User Focus

- The team focused on educating developers about their AI platform's potential use cases
- Recognized that developers often struggle to imagine possibilities with new technologies
- Emphasized showing concrete, specific use cases to drive early user traction
- Aimed to make complex AI infrastructure accessible, especially to web and product developers

- Target users are primarily AI engineers, web developers, and JavaScript/TypeScript developers
- Not focused on infrastructure or ML engineers
- Believe in making AI tools simple and approachable for developers who want to build products

- SDK download metrics show Python usage (around 500,000/month) is about twice that of JavaScript (250,000/month)
- Python dominates for data analysis and code interpretation
- JavaScript leads for building web applications and frameworks

Technical Infrastructure and Capabilities

- Offers a dynamic, runtime-optimized environment unlike traditional cloud services
- Enables fast dependency installation and GitHub repository pulling
- Supports workloads ranging from 5 seconds to 5 hours
- Provides flexible pricing models adaptable to AI-specific computational needs

- Sandboxes are placed inside clusters with a unique security model
- Code running in sandboxes is considered untrusted by default
- Complete isolation between sandboxes is crucial to prevent cross-contamination
- Security and observability are key challenges

- Supports multiple programming languages (Python, Lua, R, C++, Fortran)
- Can run any Linux-compatible code
- Ability to switch runtimes mid-process without creating new sandboxes
- Elastic infrastructure that can dynamically adjust resources (RAM, CPU)

- Provides Ubuntu-based Linux environment
- Free tier offers:
  * 2 CPUs
  * 0.5 GB RAM
- Customizable options up to:
  * 16 CPUs
  * 64 GB RAM
- Free storage
- Optional desktop SDK with VNC support for human interaction

Pricing and Business Model Challenges

- Core infrastructure can be broken down into compute, storage, networking, and a control/security layer
- Proper pricing of each layer is critical to avoid being "abused" by users exploiting free resources

- Implementing usage-based billing is complex and requires extensive engineering effort
- Instrumentation of infrastructure is critical and technically challenging
- Engineers often resist working on billing projects despite their strategic importance

- Current options discussed include Orb, OpenMeter, Metronome, and Stripe usage-based billing
- Key evaluation criteria include:
  * Revenue cut/pricing structure
  * Integration complexity
  * Time investment required

- Must track and measure usage across multiple dimensions
- Difficult decisions around usage limits (soft vs. hard)
- Balancing user experience with financial sustainability
- Need to handle edge cases like early-stage startups with minimal usage
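The multi-dimensional metering problem described above can be sketched in a few lines: aggregate usage events per dimension, then price each dimension separately. The rates and event shape here are hypothetical, not E2B's actual pricing; a real system would emit these events from instrumented infrastructure into a billing provider such as Orb or Metronome:

```python
# Illustrative usage-based metering across compute/memory/network
# dimensions. All unit prices below are made-up placeholders.
from collections import defaultdict

RATES = {                        # hypothetical unit prices in USD
    "cpu_seconds": 0.000014,     # per vCPU-second
    "ram_gib_seconds": 0.0000045,
    "egress_gib": 0.09,
}

def meter(events):
    """Sum usage per dimension, then price each dimension."""
    totals = defaultdict(float)
    for e in events:
        totals[e["dimension"]] += e["quantity"]
    return {dim: qty * RATES[dim] for dim, qty in totals.items()}

events = [
    {"sandbox": "sbx_1", "dimension": "cpu_seconds", "quantity": 7200},
    {"sandbox": "sbx_1", "dimension": "ram_gib_seconds", "quantity": 3600},
    {"sandbox": "sbx_2", "dimension": "cpu_seconds", "quantity": 300},
]
bill = meter(events)
print(f"${sum(bill.values()):.4f}")  # → $0.1212
```

The hard engineering problems the brief mentions live around this core: emitting accurate events at scale, deduplicating them, and deciding what happens when a soft or hard limit is crossed.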

AI Agent Economics and Advanced Features

- Pricing models for AI agents are evolving, focusing on:
  * Reselling tokens from base-layer model providers
  * Comparing value delivered vs. cost of goods sold
  * Exploring potential for task-based pricing

- Some providers offer tokens with:
  * High markup
  * Negative markup (subsidized by VC funding)
  * "Bring your own key" solutions (less popular)

- Current agents are:
  * Unreliable
  * Unpredictable
  * Difficult to price ahead of time

- Emerging features include:
  * Memory persistence
  * Ability to pause and resume sandboxes
  * Continuous session time currently limited to 30 days
  * Potential for parallel problem-solving through forking

- Forking and checkpointing could enable:
  * Tree/graph-based problem solving
  * Preserving local agent state
  * Avoiding full session replay
  * Exploring multiple solution paths simultaneously
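The fork-based tree exploration described above can be sketched with in-memory state copies standing in for sandbox forks (a real implementation would snapshot a whole VM or filesystem, not a dict). Both `fork` and `apply_step` here are hypothetical stand-ins, not E2B APIs:

```python
# Checkpoint/fork sketch: branch from one checkpoint into multiple
# solution paths without replaying the whole session for each branch.
from copy import deepcopy

def fork(state):
    """Checkpoint: a cheap deepcopy here, a VM snapshot in a real sandbox."""
    return deepcopy(state)

def apply_step(state, action):
    # Hypothetical "agent takes an action"; score is just a toy heuristic.
    state["history"].append(action)
    state["score"] = state.get("score", 0) + len(action)
    return state

root = {"history": [], "score": 0}
checkpoint = fork(root)  # preserve local agent state before branching

# Explore two paths from the same checkpoint; neither mutates the other.
branch_a = apply_step(fork(checkpoint), "refactor")
branch_b = apply_step(fork(checkpoint), "rewrite-tests")

best = max([branch_a, branch_b], key=lambda s: s["score"])
print(best["history"])  # → ['rewrite-tests']
```

The key property is the last one in the list above: branches share history up to the checkpoint, so exploring N paths costs N forks, not N full replays.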

AI Development Frameworks and Tools

- The discussion focuses on AI development frameworks and toolkits
- E2B team prefers a toolkit approach over a rigid framework due to rapid AI evolution
- Frameworks are challenging to build given the fast-changing AI landscape

- LangChain: Surprisingly popular with 20 million monthly downloads
- Mastra: A promising TypeScript-first framework
- Composio: A toolkit providing multiple tools
- Browserbase/Stagehand: An elegant tool for website navigation with minimal APIs

- Chat completions are becoming less relevant
- Future frameworks need to consider:
  * Real-time interactions
  * Multimodal capabilities
  * Streaming agent interactions
- Developer tools must adapt as LLMs become more sophisticated
- The goal is to build architectures that can leverage significant model improvements

Model Context Protocol (MCP) and Web Interactions

- Participants are exploring the potential and current state of MCPs
- There's uncertainty about the primary use cases and target users (developers vs. end-users)
- Current MCPs are mostly running locally, with emerging interest in remote MCP servers

- Some view MCP as overly complex; comparisons to email protocols may be premature
- Potential strategy involves creating higher-order MCPs that abstract implementation details
- Developers believe every dev-tools company needs an MCP strategy

- Recommended to start with an API-first approach rather than an SDK
- Need to distinguish between MCP clients and MCP servers
- Authentication and persistent state are critical considerations

- Discussed crawl-to-referral ratios for different AI companies:
  * Google: 2:1
  * OpenAI: 250:1
  * Anthropic: 6,000:1

- Current llms.txt approaches are often automated website conversions
- Suggestion that websites should have distinct experiences for humans and AI agents
- Critique of current approaches as not truly separating human and agent interfaces

E2B Use Cases and Future Roadmap

- Highlighted use cases: AI data analysis, data visualization, coding agents, generative UI, code generation evals, computer use
- Computer use considered most experimental and exciting, but currently limited in platform support

- Hugging Face is using E2B sandboxes for code generation reinforcement learning, running thousands of sandboxes per training step
- Benefits include:
  * Cost-effective alternative to GPU clusters
  * Isolated and secure environment
  * Enables large-scale model training parallelization
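The fan-out pattern above (many isolated executions per training step, with pass/fail becoming a reward signal) can be sketched locally. `run_in_sandbox` here is a stand-in that `exec`s candidates in-process; a real setup would dispatch each candidate to a remote, isolated sandbox via an SDK instead:

```python
# Code-generation RL reward sketch: run each candidate program in
# isolation and score it. Crashes and wrong answers score 0.0.
from concurrent.futures import ThreadPoolExecutor

def run_in_sandbox(program: str) -> float:
    """Reward 1.0 if the candidate runs and sets `answer` to 4, else 0.0."""
    ns = {}
    try:
        exec(program, ns)        # local stand-in for remote, isolated execution
        return 1.0 if ns.get("answer") == 4 else 0.0
    except Exception:
        return 0.0               # a crashing candidate yields zero reward

candidates = ["answer = 2 + 2", "answer = 2 * 3", "answer = 1/0"]
with ThreadPoolExecutor(max_workers=8) as pool:
    rewards = list(pool.map(run_in_sandbox, candidates))
print(rewards)  # → [1.0, 0.0, 0.0]
```

Scaling this from 3 candidates to thousands per training step is precisely where isolation matters: one candidate's crash, infinite loop, or hostile code must not contaminate its siblings.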

- Model evaluation (e.g., running SWE-bench)
- Comparing AI models (working with LM Arena from Berkeley)
- Upcoming startup and research program for universities and researchers

- Potential GPU offerings:
  * Not an immediate priority, but seen as a future opportunity
  * Could enable faster data analysis and machine learning model training
- Potential app hosting and deployment services:
  * Vision of LLMs doing development work with minimal developer intervention
  * Natural progression from the current code execution platform

Company Location and Growth Strategy

- Originally based in the Czech Republic, the company moved to San Francisco
- Primary motivation was to be close to users and be part of the emerging AI hub
- Believed being in SF would allow more frequent, in-person user interactions

- Initially wanted to be physically present to understand product direction
- Now comfortable hiring remotely in Prague/Czech Republic
- Recognizes excellent technical talent exists in Europe
- Currently hiring for roles including:
  * Distributed systems engineers
  * Platform engineers
  * AI engineers
  * Account manager
  * Customer success engineer

- Advocated for the "Collison installation" approach: meeting users in person and quickly implementing their feedback
- Referenced Patrick Collison's strategy of doing manual, unscalable things early on
- Believes in-person meetings are more effective than remote interactions, especially in B2B

- The company is experiencing market momentum and significant potential for growth
- Strategic approach involves "doubling down" and accelerating progress
- Metaphorically described as "pouring gas on the fire" to expand their current momentum
- Aims to build "the new AWS, but for LLMs": a comprehensive platform for AI development and deployment
