Overview
- ClawedCode is an AI agent that runs in the terminal, allowing direct model interaction with capabilities to execute bash commands, access files, and operate autonomously. It started as an experimental project that gained rapid internal adoption at Anthropic.
- The development philosophy emphasizes simplicity and minimal scaffolding - starting with the simplest working solution before scaling, using straightforward approaches to memory management, and maintaining a "light touch" product management style that removes obstacles rather than dictating direction.
- The tool is positioned for power users and developers, intentionally kept "raw" and close to the model, with features like WebFetch, autocomplete, memory management, and autonomous file editing that potentially make engineers 50-70% more productive.
- Technical implementation balances autonomy with safety through permission systems that control file access and editing capabilities, while recognizing that human oversight remains essential as AI models are non-deterministic.
- The team envisions future development focused on increased task autonomy and improved information exploration, with a commitment to maintaining simplicity through regular rewrites that reduce rather than add complexity.
Content
ClawedCode Overview and Origins
* ClawedCode is defined as "Cloud in the terminal" - an AI agent that runs in the command line interface with unique capabilities including running bash commands, accessing files in the current directory, and operating agentically.
* The project started as an experimental effort by Boris, initially developed using Anthropic's public API. * Began with quirky experiments like analyzing music and screenshots * Later gained terminal and coding capabilities, which made it surprisingly useful
* The original team consisted of Boris, Sid, and Ben, with Cat joining later after providing extensive feedback and being specifically requested by a manager.
* The tool was rapidly adopted internally at Anthropic: * First by the core team * Then by engineers and researchers * Showed rapid daily active user (DAU) growth
Development Philosophy
* Anthropic follows a "do the simple thing first" principle: * Staff projects minimally and keep things basic initially * Focus on finding product-market fit before scaling * Most feature development is bottom-up, driven by team members' own needs
* The team emphasizes simplicity in building AI tools: * Natural model integration with minimal scaffolding * Start with the simplest solution that works * Avoid over-engineering
* For memory management, they chose a straightforward approach: * Ask the AI model to summarize previous messages * Create a Cloud.md file for user-driven memories * Implement flexible file location options (root, child directories, home directory)
* Product management is described as "light touch": * Cat's primary role is removing obstacles and ensuring legal/marketing alignment * Most roadmap development comes from the team's own product insights
Product Positioning and Features
* ClawedCode is positioned as a "raw" tool for direct model access: * Targeted at power users * Suitable for large-scale automation tasks * Differs from other tools by providing direct, unfiltered model interaction * Not aiming to build a mainstream product like Cursor immediately * Intentionally keeping the product bare-bones and close to the raw model
* Recent product developments include: * WebFetch feature with strong security considerations * Autocomplete (file names/paths) * Auto compact for "infinite context" * Auto accept for autonomous file editing * Vim mode * Custom slash commands * Memory feature with hashtag functionality * "Thinking" tool for planning tasks
* The team is focused on making ClawedCode versatile for different workloads, efficient, and low-latency.
Cost and Productivity Considerations
* Current cost is approximately $6 per day per active user * Internal Anthropic users can use the tool for free * Viewed as an ROI question, not just a cost issue
* Potential to make engineers 50-70% more productive: * Approximately 80-90% of code is being generated by AI (quad code) * Human code review remains critical * Some tasks are still preferably done manually, especially complex refactoring
* The speaker estimates personal productivity improvement at 2x: * Some Anthropic engineers see up to 10x productivity gains * Productivity varies widely depending on individual usage and skill * Currently working on gathering credible productivity improvement data
Technical Implementation
* MCP (Multi-Command Protocol) and Slash Commands: * MCP can be used for custom tools and commands * The goal is flexibility - not forcing users to use a specific technology * Exploring ways to re-expose local commands as MCP prompts, allow custom tool integration, and provide customizability options
* CLI and Development Tools: * The speaker previously maintained the CLI for Netlify * They use React Inc. for CLI development, translating React code to ANSI escape codes * Big fans of Bun for compiling code and faster test running
* Permission and Autonomy Considerations: * Developing a sophisticated permission system for AI agents * Key permission considerations include allowing/restricting file reading, controlling file editing, and permitting test runs * Approach to auto-acceptance involves early error identification and context-dependent autonomy * Potential risks in file writing include prompt injection attacks and risk of model taking incorrect actions
AI and Human Interaction
* Recognition that AI models are non-deterministic, so human involvement remains important: * A "meter paper" suggests the time between human inputs is doubling every 3-7 months * Anthropic is noted as performing well, being autonomous for about 50 minutes with minimal human effort
* Code Review and AI Tools: * Exploring AI's potential in code review and development processes * Interest in moving from rule-based to semantic linting * Challenges include managing large pull requests and identifying truly important review points
* Cloud Code and Non-Interactive Mode: * Cloud Code is described as a flexible "primitive" for building various tools * Non-interactive mode (using -p flag) is recommended for read-only tasks, generating changelogs, and scanning repositories * Best practices include setting specific permissions, limiting tool access, and focusing on read-only operations
Developer Workflow and Management Perspectives
* Recommendations for using AI code generation tools: * Start small when using tools like Quad Code * Test incrementally: start with one test, iterate on prompts, then gradually scale up * AI tools can pre-accept permissions in non-interactive modes
* Management and Organizational Perspectives: * CTOs and VPs are generally excited about AI code generation tools * Managers are concerned about how to manage widespread AI tool usage * Individual developers remain responsible for code quality, documentation, and maintainability
* Development Process Changes: * Shift from extensive design documentation to rapid prototyping * Ability to quickly test multiple versions of a feature * Lower cost and effort of implementation changes software development approach
Memory and Knowledge Management
* Different approaches to AI memory and knowledge storage: * Current methods include external stores like Chroma, using key-value or graph stores * One participant believes the AI model itself will eventually "subsume everything else" as models improve
* Key Perspectives on Memory: * Questioning whether current memory techniques are just workarounds for limited context length * Desire to have controllable, auditable AI knowledge systems * Interest in understanding what knowledge is actually stored in the model
* Anthropic's Approach: * Experimenting with memory techniques like logging agent actions * Goal is to develop generalized memory features that work across use cases * Early experiments included Retrieval-Augmented Generation (RAG) with code base indexing
Agentic Search and Thinking Approaches
* Agentic search was chosen over other search tools for three main reasons: * Significantly outperformed alternatives (based on internal "Vibes" benchmarks) * Avoided complex indexing challenges and potential security risks associated with traditional RAG methods
* Thinking and Planning in Quad: * Thinking is integrated into the workflow, not a separate rigid mode * Users can ask Quad to "think" or "make a plan" at any stage * Uses a chain of thought approach rather than a specific "Think Tool" * Flexible process: research, pull context, think, plan, potentially write code
* Parallel Investigation and Decision-Making: * Users can ask Cloud to investigate multiple solution paths in parallel using sub-agents * Cloud can research options and then select and summarize the best approach * The user can choose whether to let Cloud pick or review options themselves
Model Performance and Limitations
* Sonnet 3.7 is highly persistent and motivated to accomplish tasks but faces challenges: * Sometimes takes instructions too literally * Can struggle with understanding implied nuances of user requests * Example: Passing tests by hard-coding solutions instead of creating proper implementations
* Context retention is challenging: * Current models may lose original intent as conversations progress * Developers are working on expanding effective context windows
* Memory and Session Continuity: * Cloud Code currently reforms the entire state for each session * Limited between-session memory * Current recommendation: Manually save session state in a text document * Plan to develop more native ways to handle cross-session context retention
Future Development and Vision
* The team is thinking about model capabilities 3 months ahead, with key focus areas including: * Increased task autonomy * Better information exploration * More thorough task completion * Improved tool composition
* CloudCode has a dedicated, growing team with long-term commitment: * Currently using a Pay As You Go model, considering potential subscription options * Focusing on enterprise support through security and productivity discussions
* The project has been repeatedly rewritten (approximately every 3-4 weeks): * Rewrites tend to focus on simplification rather than added complexity * Maintaining consistent interface is a key consideration during iterations
* The team is currently not open-sourcing the project: * Reasons include small team size and the challenges of managing open-source contributions * Suggested alternative of "source available" approach
* Terminal and UX Design: * Designing for terminal interfaces is challenging, with limited existing literature * The team is developing a new design language to make terminal apps feel fresh, modern, and intuitive