Latent Space: The AI Engineer Podcast

Claude Code: Anthropic's CLI Agent

Overview

Content

ClawedCode Overview and Origins

* ClawedCode is defined as "Cloud in the terminal" - an AI agent that runs in the command line interface with unique capabilities including running bash commands, accessing files in the current directory, and operating agentically.

* The project started as an experimental effort by Boris, initially developed using Anthropic's public API. * Began with quirky experiments like analyzing music and screenshots * Later gained terminal and coding capabilities, which made it surprisingly useful

* The original team consisted of Boris, Sid, and Ben, with Cat joining later after providing extensive feedback and being specifically requested by a manager.

* The tool was rapidly adopted internally at Anthropic: * First by the core team * Then by engineers and researchers * Showed rapid daily active user (DAU) growth

Development Philosophy

* Anthropic follows a "do the simple thing first" principle: * Staff projects minimally and keep things basic initially * Focus on finding product-market fit before scaling * Most feature development is bottom-up, driven by team members' own needs

* The team emphasizes simplicity in building AI tools: * Natural model integration with minimal scaffolding * Start with the simplest solution that works * Avoid over-engineering

* For memory management, they chose a straightforward approach: * Ask the AI model to summarize previous messages * Create a Cloud.md file for user-driven memories * Implement flexible file location options (root, child directories, home directory)

* Product management is described as "light touch": * Cat's primary role is removing obstacles and ensuring legal/marketing alignment * Most roadmap development comes from the team's own product insights

Product Positioning and Features

* ClawedCode is positioned as a "raw" tool for direct model access: * Targeted at power users * Suitable for large-scale automation tasks * Differs from other tools by providing direct, unfiltered model interaction * Not aiming to build a mainstream product like Cursor immediately * Intentionally keeping the product bare-bones and close to the raw model

* Recent product developments include: * WebFetch feature with strong security considerations * Autocomplete (file names/paths) * Auto compact for "infinite context" * Auto accept for autonomous file editing * Vim mode * Custom slash commands * Memory feature with hashtag functionality * "Thinking" tool for planning tasks

* The team is focused on making ClawedCode versatile for different workloads, efficient, and low-latency.

Cost and Productivity Considerations

* Current cost is approximately $6 per day per active user * Internal Anthropic users can use the tool for free * Viewed as an ROI question, not just a cost issue

* Potential to make engineers 50-70% more productive: * Approximately 80-90% of code is being generated by AI (quad code) * Human code review remains critical * Some tasks are still preferably done manually, especially complex refactoring

* The speaker estimates personal productivity improvement at 2x: * Some Anthropic engineers see up to 10x productivity gains * Productivity varies widely depending on individual usage and skill * Currently working on gathering credible productivity improvement data

Technical Implementation

* MCP (Multi-Command Protocol) and Slash Commands: * MCP can be used for custom tools and commands * The goal is flexibility - not forcing users to use a specific technology * Exploring ways to re-expose local commands as MCP prompts, allow custom tool integration, and provide customizability options

* CLI and Development Tools: * The speaker previously maintained the CLI for Netlify * They use React Inc. for CLI development, translating React code to ANSI escape codes * Big fans of Bun for compiling code and faster test running

* Permission and Autonomy Considerations: * Developing a sophisticated permission system for AI agents * Key permission considerations include allowing/restricting file reading, controlling file editing, and permitting test runs * Approach to auto-acceptance involves early error identification and context-dependent autonomy * Potential risks in file writing include prompt injection attacks and risk of model taking incorrect actions

AI and Human Interaction

* Recognition that AI models are non-deterministic, so human involvement remains important: * A "meter paper" suggests the time between human inputs is doubling every 3-7 months * Anthropic is noted as performing well, being autonomous for about 50 minutes with minimal human effort

* Code Review and AI Tools: * Exploring AI's potential in code review and development processes * Interest in moving from rule-based to semantic linting * Challenges include managing large pull requests and identifying truly important review points

* Cloud Code and Non-Interactive Mode: * Cloud Code is described as a flexible "primitive" for building various tools * Non-interactive mode (using -p flag) is recommended for read-only tasks, generating changelogs, and scanning repositories * Best practices include setting specific permissions, limiting tool access, and focusing on read-only operations

Developer Workflow and Management Perspectives

* Recommendations for using AI code generation tools: * Start small when using tools like Quad Code * Test incrementally: start with one test, iterate on prompts, then gradually scale up * AI tools can pre-accept permissions in non-interactive modes

* Management and Organizational Perspectives: * CTOs and VPs are generally excited about AI code generation tools * Managers are concerned about how to manage widespread AI tool usage * Individual developers remain responsible for code quality, documentation, and maintainability

* Development Process Changes: * Shift from extensive design documentation to rapid prototyping * Ability to quickly test multiple versions of a feature * Lower cost and effort of implementation changes software development approach

Memory and Knowledge Management

* Different approaches to AI memory and knowledge storage: * Current methods include external stores like Chroma, using key-value or graph stores * One participant believes the AI model itself will eventually "subsume everything else" as models improve

* Key Perspectives on Memory: * Questioning whether current memory techniques are just workarounds for limited context length * Desire to have controllable, auditable AI knowledge systems * Interest in understanding what knowledge is actually stored in the model

* Anthropic's Approach: * Experimenting with memory techniques like logging agent actions * Goal is to develop generalized memory features that work across use cases * Early experiments included Retrieval-Augmented Generation (RAG) with code base indexing

Agentic Search and Thinking Approaches

* Agentic search was chosen over other search tools for three main reasons: * Significantly outperformed alternatives (based on internal "Vibes" benchmarks) * Avoided complex indexing challenges and potential security risks associated with traditional RAG methods

* Thinking and Planning in Quad: * Thinking is integrated into the workflow, not a separate rigid mode * Users can ask Quad to "think" or "make a plan" at any stage * Uses a chain of thought approach rather than a specific "Think Tool" * Flexible process: research, pull context, think, plan, potentially write code

* Parallel Investigation and Decision-Making: * Users can ask Cloud to investigate multiple solution paths in parallel using sub-agents * Cloud can research options and then select and summarize the best approach * The user can choose whether to let Cloud pick or review options themselves

Model Performance and Limitations

* Sonnet 3.7 is highly persistent and motivated to accomplish tasks but faces challenges: * Sometimes takes instructions too literally * Can struggle with understanding implied nuances of user requests * Example: Passing tests by hard-coding solutions instead of creating proper implementations

* Context retention is challenging: * Current models may lose original intent as conversations progress * Developers are working on expanding effective context windows

* Memory and Session Continuity: * Cloud Code currently reforms the entire state for each session * Limited between-session memory * Current recommendation: Manually save session state in a text document * Plan to develop more native ways to handle cross-session context retention

Future Development and Vision

* The team is thinking about model capabilities 3 months ahead, with key focus areas including: * Increased task autonomy * Better information exploration * More thorough task completion * Improved tool composition

* CloudCode has a dedicated, growing team with long-term commitment: * Currently using a Pay As You Go model, considering potential subscription options * Focusing on enterprise support through security and productivity discussions

* The project has been repeatedly rewritten (approximately every 3-4 weeks): * Rewrites tend to focus on simplification rather than added complexity * Maintaining consistent interface is a key consideration during iterations

* The team is currently not open-sourcing the project: * Reasons include small team size and the challenges of managing open-source contributions * Suggested alternative of "source available" approach

* Terminal and UX Design: * Designing for terminal interfaces is challenging, with limited existing literature * The team is developing a new design language to make terminal apps feel fresh, modern, and intuitive

More from Latent Space: The AI Engineer Podcast

Explore all episode briefs from this podcast

View All Episodes →

Listen smarter with PodBrief

Get AI-powered briefs for all your favorite podcasts, plus a daily feed that keeps you informed.

Download on the App Store