Latent Space: The AI Engineer Podcast

Beating Google at Search with Neural PageRank and $5M of H200s — with Will Bryk of Exa.ai

Overview

  • Exa AI (formerly Metaphor Systems) is building a fundamentally new search engine that uses neural networks to predict relevant documents rather than relying on traditional keyword matching, with the goal of creating "perfect search" that truly understands user queries.
  • The company's technology combines document prediction (training models to predict links based on surrounding text) with comprehensive infrastructure including URL discovery, web crawling, and AI processing—all designed to deliver more precise, high-quality results than conventional search engines.
  • Exa's approach allows for variable compute allocation to searches, enabling more complex semantic queries and comprehensive results gathering that would be impossible with traditional search methods, though this sometimes requires longer processing times.
  • The future of search likely involves LLM-based interfaces combined with powerful backend search capabilities, creating more collaborative "agentic" search experiences that balance automation with user involvement rather than full autonomy.
  • Exa faces significant challenges including web scraping difficulties, training data quality control, and managing computational costs, but believes rapidly decreasing LLM costs (potentially 200x reduction in coming years) will create new opportunities for innovative search technologies.

Content

Background and Origins

  • Will Bryk is the CEO and co-founder of Exa AI (previously Metaphor Systems)
  • Has been interested in search and high-quality information since childhood
  • Built a mini search engine in college
  • Founded Exa with the goal of creating a fundamentally better search engine
  • Professional background includes:
    - Interned at SpaceX (was a big Elon Musk fan)
    - Also worked at Zoox in robotics
    - Believed in self-driving technology; didn't learn to drive until recently

Early Company Vision and Evolution

  • Entered Y Combinator in summer 2021 with the pitch of "Google 2.0"
  • Inspired by GPT-3's language understanding capabilities
  • Aimed to create a search engine that truly understands user queries
  • Believed Google hadn't meaningfully improved in a decade
  • Always focused on building a better search algorithm
  • Started as a research endeavor with the potential to become a significant business
  • Initially released an early search engine as a "research preview"
  • Transitioned from a research to a product-focused company
  • Core mission: Create "perfect search" over the web, with downstream use cases to be determined later

Exa's Technology and Approach

  • Describes itself as the "OpenAI of search" - a research startup focused on fundamental AI research for search
  • Originated with a unique approach to link/document prediction using a transformer-inspired model
  • Initial training involved predicting links/documents based on surrounding text
  • Process involves:
    - Finding links on the web
    - Taking the text surrounding the link
    - Hiding the link
    - Training the model to predict the hidden link/document
  • More accurately described as "document prediction" rather than link prediction
  • Similar to language model training, with potential for further refinement through synthetic data and supervised fine-tuning
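The masked-link objective described above can be sketched as a data-preparation step: for each link, keep its surrounding text, hide the link itself, and record the target as the prediction label. A toy illustration (the regex, the window size, and the `make_link_prediction_pairs` helper are invented for this sketch, not Exa's actual pipeline):

```python
import re

def make_link_prediction_pairs(text, window=80):
    """Turn text containing links into (context, target URL) training pairs.

    For each anchor tag, keep the surrounding text, hide the link itself,
    and record the target URL as the label -- mirroring the "predict the
    hidden document from its context" objective.
    """
    pairs = []
    for m in re.finditer(r'<a href="([^"]+)">[^<]*</a>', text):
        url = m.group(1)
        before = text[max(0, m.start() - window):m.start()]
        after = text[m.end():m.end() + window]
        context = (before + "[HIDDEN_LINK]" + after).strip()
        pairs.append((context, url))
    return pairs

sample = ('For a great introduction to transformers, see '
          '<a href="https://example.com/attention">this paper</a> '
          'from the original authors.')
print(make_link_prediction_pairs(sample)[0][1])  # https://example.com/attention
```

A real pipeline would do this at web scale and feed the pairs to a transformer; the point here is only the shape of the training data.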

Exa's Current Components and Infrastructure

  • Comprehensive search engine built from scratch
  • Major subsystems include:
    - URL discovery
    - Web crawling
    - AI document processing
    - High-throughput, low-latency serving system (vector database)
  • Relatively small teams compared to large tech companies like Google
  • The name "Exa" refers to the SI prefix for 10 to the 18th power (in contrast to Google, named after "googol," 10 to the 100th)
  • Core philosophy: Smaller, more precise results (10^18) are better than massive, unfocused result sets (10^100)

Key Features and Capabilities

  • Ability to create complex, semantic queries
  • Find comprehensive lists of results (e.g., "startups working on hardware in SF")
  • Handle semantic variations (robotics, wearables, hardware)
  • Search can take varying amounts of time (from milliseconds to hours)
  • Some search platforms offer previews and allow scaling of compute resources
  • Complex searches might require longer processing times
  • Fully neural-based search engine not relying on traditional keyword algorithms
  • End-to-end neural search methodology
  • "Link prediction objective" serves as a neural equivalent to PageRank
  • Can capture content references in multiple ways, making it more powerful than traditional search methods
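The difference from keyword matching can be pictured as ranking by embedding similarity. A minimal sketch, where the hash-based `embed` function is a deterministic stand-in for a learned encoder (everything here is illustrative, not Exa's model):

```python
import hashlib
import numpy as np

def embed(text, dim=64):
    """Stand-in for a learned text encoder: sum a stable pseudo-random
    vector per token, then L2-normalize. A real system would use a
    trained transformer here."""
    vec = np.zeros(dim)
    for tok in text.lower().split():
        seed = int(hashlib.md5(tok.encode()).hexdigest(), 16) % (2**32)
        vec += np.random.default_rng(seed).standard_normal(dim)
    return vec / (np.linalg.norm(vec) + 1e-9)

def neural_rank(query, docs):
    """Rank documents by cosine similarity to the query embedding --
    the retrieval side of an embedding-based (rather than keyword) index."""
    q = embed(query)
    return sorted(docs, key=lambda d: float(q @ embed(d)), reverse=True)

docs = [
    "pasta recipes for weeknight cooking",
    "a list of hardware startups in SF building robotics",
    "history of the roman empire",
]
print(neural_rank("hardware startups in SF", docs)[0])
```

With a trained encoder, semantically related documents score highly even without exact keyword overlap; this toy version only captures the shape of the ranking step.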

Exa's Product Offerings

  • Search API
  • Excel search
  • List builder
  • Web scraping capabilities
  • Ability to retrieve full content for URLs, not just links
  • Can retrieve multiple URLs simultaneously
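A sketch of what calling such an API might look like. The endpoint path and field names below are modeled from memory on Exa's public REST API and may not be exact; treat them as placeholders and check the official documentation:

```python
import json

def build_search_request(query, num_results=10, include_text=False):
    """Assemble a hypothetical search request that can also ask for full
    page contents (not just result links) in a single call."""
    payload = {"query": query, "numResults": num_results}
    if include_text:
        # ask the API to return full document text alongside each URL
        payload["contents"] = {"text": True}
    return "https://api.exa.ai/search", payload

url, body = build_search_request("startups working on hardware in SF",
                                 num_results=25, include_text=True)
print(json.dumps(body, indent=2))
```

The request would then be sent with an authenticated HTTP POST; that part is omitted here since the field names are the illustrative point.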

Challenges in Search Technology

  • Subjectivity of search results
  • Comprehensiveness of results
  • Semantic understanding
  • Training data quality is crucial - "if you train on a bunch of crap, your prediction will be crappy"
  • Aim to control training data to ensure high-quality content
  • Goal is to avoid "SEO slop" and low-quality search results that plague traditional search engines
  • Increasing difficulty accessing content due to sites blocking bots and scrapers
  • Potential solutions include data partnerships and leveraging long-tail open sites
  • Scraping is a challenging technical problem
  • Difficult to create a "perfect" scraper

Compute-based Approach to Search

  • More compute can be applied to increase result comprehensiveness
  • Analogous to OpenAI o1's approach of applying variable computational resources to solve problems
  • Potential to use large language models like GPT-4 to scan and classify web content
  • Future considerations:
    - Developing user-controllable compute budgets
    - Creating feedback loops for refining search results
    - Exploring the paradigm of "variable compute products"
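One way to picture a "variable compute product": let the user set a budget, spend one unit per candidate checked by an expensive relevance judge, and stop when the budget runs out. The budget loop and the keyword stand-in for an LLM classifier below are a hypothetical sketch, not Exa's implementation:

```python
def budgeted_search(candidates, is_relevant, budget):
    """Check candidates in order, spending one compute unit per check;
    stop once the user's budget is exhausted."""
    hits, spent = [], 0
    for doc in candidates:
        if spent >= budget:
            break
        spent += 1
        if is_relevant(doc):  # stand-in for an expensive LLM call
            hits.append(doc)
    return hits, spent

candidates = [f"doc-{i}" for i in range(100)]
relevant = lambda d: d.endswith(("0", "5"))  # toy classifier
hits, spent = budgeted_search(candidates, relevant, budget=20)
print(hits, spent)
```

Raising the budget surfaces more results at higher cost, which is exactly the trade-off a user-facing compute dial would expose.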

Business and Knowledge Implications

  • Traditional information arbitrage is being disrupted by accessible search tools
  • Search infrastructure enables building applications and direct user interfaces
  • Potential to democratize access to information across industries
  • Distinction between "super knowledge" and "super intelligence"
  • Future AI systems may require robust search capabilities to overcome knowledge limitations
  • Even advanced AI (like potential AGI) will need search tools to access information

Potential Use Cases for Exa

1. Dating
    - Finding potential partners based on specific criteria
    - Matching intellectual compatibility
    - Searching profiles across the web

2. Academic/Research
    - Writing assistants for students
    - Searching and summarizing research papers
    - Helping with research paper preparation

3. Business/Investment
    - Venture capital research
    - Finding lists of companies in specific industries/sectors
    - Competitor analysis
    - Identifying potential sales targets

4. Recruiting
    - Searching for potential candidates
    - Finding professionals who have written about relevant topics
    - Discovering candidates through blogs, LinkedIn, Twitter, etc.

5. Enterprise/Company Document Search (future expansion)

Search Engine Evolution and Future Vision

  • Google has dominated search for 30 years, conditioning people to think of search in limited ways
  • ChatGPT has expanded people's understanding of what search can be
  • Future search interfaces will likely involve Large Language Models (LLMs)
  • Making oneself "discoverable" online is increasingly important
  • Search engines fundamentally shape what content gets created
  • Exa aims to optimize for high-quality, contextually relevant content, unlike keyword-based search engines
  • LLMs will likely become the primary search interface
  • Search engines should be designed to handle complex LLM-generated queries
  • The goal is to create more intelligent, context-aware search experiences

Agentic Search Concept

  • The search approach being developed is considered "agentic" - capable of taking actions and making decisions
  • Combines algorithmic and agent-based approaches
  • The goal is to create a search tool that feels collaborative, not completely autonomous
  • Full autonomy (Level 5) tends to fail because users want to be involved in the process
  • Users prefer "drive assist" models where they can influence and understand the search/research process
  • Current AI agents are not yet advanced enough to completely replace human involvement
  • As AI agents improve, the term "agentic" may become less meaningful because agent-like capabilities will become standard

AI Search and Interface Challenges

  • Exploring new search interfaces that are iterative and allow for refinement
  • Identifying potential failure modes in AI agents:
    - Lack of intelligence
    - Incomplete understanding of user context
    - Communication gaps between humans and AI
  • Current system prompts often feel performative (e.g., "you are a helpful assistant")
  • Users need to be part of the process, not just give high-level commands
  • There's uncertainty about how to effectively guide AI behavior
  • Prompting techniques currently feel more like "cargo culting" than a scientific approach

AI Training and Model Ecosystem

  • Discussion of creating self-training AI systems using reward signals
  • Possibility of AI generating its own training tasks and learning from performance
  • Belief that future AI models will be trained using this self-improvement paradigm
  • OpenAI won't likely dominate all language model use cases
  • Expect multiple models from different companies of varying sizes
  • Some use cases will prioritize inference speed over complex reasoning
  • Human labeling for search can be challenging and often keyword-based
  • Large Language Models (LLMs) may be more effective at data labeling
  • LLMs like GPT-4 can potentially improve search result relevance
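The labeling idea can be sketched as running each (query, document) pair through a judge that stands in for an LLM grader, producing relevance labels for training or evaluation. The `overlap_judge` here is a trivial keyword-overlap stub, purely illustrative of where an LLM prompt would slot in:

```python
def label_relevance(pairs, judge):
    """Attach a 0/1 relevance label to each (query, document) pair using
    the supplied judge -- in practice an LLM call, here a stub."""
    return [(q, d, 1 if judge(q, d) else 0) for q, d in pairs]

def overlap_judge(query, doc):
    # toy stand-in: call it relevant if the texts share any word
    return bool(set(query.lower().split()) & set(doc.lower().split()))

pairs = [
    ("hardware startups", "a directory of hardware startups"),
    ("hardware startups", "medieval poetry anthology"),
]
print(label_relevance(pairs, overlap_judge))
```

Swapping the stub for an LLM judge is what makes this labeling semantic rather than keyword-based.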

Company Culture (Exa)

  • Founded by long-time friends with a "counter consensus" approach
  • Described as having an unconventional, meme-friendly culture
  • Culture of fun, laughter, and unconventional problem-solving
  • Implemented nap pods to address employee fatigue and promote creativity
  • Purchased nap pods from China, with a humorous story about the heavy delivery
  • Emphasis on employee well-being and providing flexible work environments
  • Rejection of "hustle culture" in favor of enjoying work and building meaningful things
  • Belief that building something from scratch with friends is a deeply satisfying experience
  • Currently hiring and growing rapidly

Technical Infrastructure and Economics

  • Purchased a $5 million H200 compute cluster
  • Use a mix of their own cluster and AWS for inference and training
  • Considering the economic constraints of AI search and inference costs
  • Managing computational costs is a key challenge
  • Strategically allocating compute resources
  • Pre-processing and indexing to reduce real-time computational expenses
  • Technical approach involves:
    - Front-loading computation through periodic indexing
    - Limiting full LLM processing to smaller, manageable subsets of data
    - Using re-ranking techniques with different transformer model sizes
  • LLM costs are rapidly decreasing (potentially 200x reduction in a few years)
  • This cost reduction creates new opportunities for rethinking search algorithms
  • Suggests potential for more innovative and cost-effective search technologies
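The re-ranking strategy described above can be sketched as a two-stage cascade: score every candidate with a cheap model, then re-rank only the top-k with a larger, costlier one, so the big model runs on a small subset. Both scorers below are stubs standing in for transformer models of different sizes:

```python
def cascade_rerank(query, docs, cheap_score, costly_score, k=3):
    """Stage 1: rank all docs with the cheap scorer.
    Stage 2: re-rank only the top-k with the expensive scorer,
    keeping full-model compute bounded regardless of corpus size."""
    prelim = sorted(docs, key=lambda d: cheap_score(query, d), reverse=True)
    head = sorted(prelim[:k], key=lambda d: costly_score(query, d), reverse=True)
    return head + prelim[k:]

# toy scorers: raw word overlap vs. overlap normalized by document length
cheap = lambda q, d: len(set(q.split()) & set(d.split()))
costly = lambda q, d: len(set(q.split()) & set(d.split())) / (len(d.split()) or 1)

docs = ["exa search engine", "search engine history and trivia", "cooking blog"]
print(cascade_rerank("exa search engine", docs, cheap, costly, k=2))
```

The design choice is the cost asymmetry: the cheap pass touches everything, the expensive pass touches only k documents.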
