Overview
- The vector database category is evolving rapidly, with many traditional databases now incorporating vector search capabilities, suggesting a market consolidation similar to what happened with NoSQL databases after MongoDB's emergence.
- Effective search and RAG systems require hybrid approaches rather than relying solely on embeddings—combining keyword matching (BM25), semantic search, and re-ranking layers produces superior results for most applications.
- When building search infrastructure, developers should start simple (clean data, use BM25) and progressively add complexity, while maintaining modular, separately scalable components rather than tightly coupling ML models with databases.
- Despite expanded context windows in LLMs, RAG remains essential for most applications, though implementation approaches vary based on scale—small datasets might fit in a single database while large-scale systems benefit from specialized vector search solutions.
- The future of embeddings points toward domain-specific models (legal, finance, health) and multimodal capabilities, with growing interest in visual language models as embedding backbones.
Content
Background and Context
- Jo Kristian Bergum is from Trondheim, Norway, with 20 years of experience in search infrastructure
- His background includes work at Yahoo and Fast Search & Transfer
- He was an early investor in and contributor to Chroma, writing their example in the OpenAI Cookbook
Vector Databases and Market Evolution
- Post-ChatGPT (November 2022), developers saw vector embeddings as the primary way to do search and RAG
- Pinecone was an early pioneer in positioning vector databases as a new infrastructure category
- The vector database "category" is now declining, though not necessarily the companies themselves
- Significant fundraising has occurred in vector databases (around $230 million)
- The market is likely unsustainable with too many players
- This pattern resembles MongoDB's NoSQL database emergence and subsequent market convergence
Search Technology Integration
- Vector search capabilities are now integrated into many database technologies
- Traditional databases like Postgres (pgvector), Elasticsearch, and Solr now offer vector search
- Bergum argues these technologies should be viewed as "search engines" rather than as a separate database category
- Companies like Turbopuffer are emerging with developer-focused approaches
- Search can involve multiple approaches: grep, semantic search, keyword search, web search
- Embeddings are important but not the only solution for search
- Embeddings went mainstream after OpenAI's APIs, previously limited to big tech companies
Embedding Insights and Applications
- Embeddings are valuable for representing multimodal data, not just for similarity search
- Effective search requires additional signals like freshness and authority
- Hybrid query approaches are recommended for most applications
- For smaller scale operations, using a single database system might be practical
- Large-scale applications may require specialized vector search solutions
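One common way to implement a hybrid query is to run keyword and vector retrieval separately and merge the ranked result lists. Below is a minimal sketch using reciprocal rank fusion (RRF), one popular fusion method; the choice of RRF, the document IDs, and the result lists are illustrative assumptions, not something the talk prescribes:

```python
def rrf_fuse(ranked_lists, k=60):
    """Reciprocal Rank Fusion: each document's fused score is the sum
    over result lists of 1 / (k + rank), with rank 1-based."""
    scores = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical ranked results from a BM25 index and a vector index:
bm25_hits = ["d1", "d2", "d3"]
vector_hits = ["d2", "d3", "d4"]
fused = rrf_fuse([bm25_hits, vector_hits])
# d2 ends up first: it appears near the top of both lists.
```

RRF needs only ranks, not comparable scores, which is why it is a convenient first fusion step when BM25 scores and cosine similarities live on different scales.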
Recommender Systems and Search Convergence
- Embedding-based retrieval is common in large-scale recommender systems like TikTok
- Typical large-scale systems use a cascade approach: a cheap retrieval stage narrows millions of candidates to a shortlist, which progressively more expensive ranking stages then reorder
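A retrieval cascade can be sketched as a pipeline of stage functions, each cheaper per document than the next and each narrowing the candidate set. The stage functions and toy corpus below are hypothetical placeholders (in a real system they might be an ANN index, a lexical scorer, and a cross-encoder):

```python
def cascade_rank(query, corpus, retrieve, cheap_score, costly_score,
                 n_retrieve=1000, n_rerank=100, n_final=10):
    """Retrieval cascade: retrieve broadly, rank cheaply, then re-rank
    a smaller set with a more expensive scorer."""
    candidates = retrieve(query, corpus)[:n_retrieve]
    candidates = sorted(candidates, key=lambda d: cheap_score(query, d),
                        reverse=True)[:n_rerank]
    return sorted(candidates, key=lambda d: costly_score(query, d),
                  reverse=True)[:n_final]

# Toy corpus and placeholder stage functions:
corpus = ["cats purr", "dogs bark", "cats and dogs", "fish swim"]
retrieve = lambda q, docs: [d for d in docs if set(q.split()) & set(d.split())]
overlap = lambda q, d: len(set(q.split()) & set(d.split()))
dense = lambda q, d: overlap(q, d) / len(d.split())  # length-normalized
top = cascade_rank("cats dogs", corpus, retrieve, overlap, dense, n_final=2)
```

The point of the shape is economics: the per-document cost rises at each stage while the candidate count falls, so the expensive model only ever sees a short list.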
Recommended Approach for Building RAG Applications
- Start by cleaning and preparing data
- Use BM25 (keyword matching) as a strong baseline algorithm
- Progressively add complexity: layer in semantic search, hybrid retrieval, and re-ranking as the application warrants
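As a concrete baseline, Okapi BM25 fits in a few lines of plain Python. This is a minimal sketch over pre-tokenized documents; a real system would normally rely on a search engine's built-in implementation rather than this:

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of each pre-tokenized document against the query."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N    # average document length
    df = Counter()                           # document frequency per term
    for d in docs:
        df.update(set(d))
    scores = []
    for d in docs:
        tf = Counter(d)                      # term frequency in this doc
        score = 0.0
        for t in query_terms:
            if tf[t] == 0:
                continue
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            score += idf * tf[t] * (k1 + 1) / (
                tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(score)
    return scores

docs = [["vector", "search"], ["keyword", "search", "baseline"],
        ["cooking", "recipes"]]
scores = bm25_scores(["keyword", "search"], docs)
best = scores.index(max(scores))  # doc 1 matches both query terms
```

Because BM25 is cheap, deterministic, and needs no model, it makes a strong baseline to measure any embedding-based addition against.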
Technology Stack Considerations
- Potential progression: start with a single general-purpose database and move to specialized search infrastructure as scale demands
- Key recommendations:
- Skeptical of tightly coupling machine learning models with databases
- Prefers keeping infrastructure components modular and separately scalable
Clarifications on RAG and Vector Databases
- RAG is NOT dead; augmenting AI with retrieval/search will remain relevant for a long time
- Some people are incorrectly claiming long context models eliminate the need for RAG
- Context windows have expanded dramatically (from 4K to 10 million tokens)
- Not all use cases require vector databases
- Some scenarios (like 300 articles) can now fit within a single model's context window
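A rough way to decide between stuffing the whole corpus into the prompt and building retrieval is a token estimate. The 4-characters-per-token ratio, the context limits, and the article sizes below are crude illustrative assumptions, not properties of any particular model:

```python
def fits_in_context(docs, context_limit=128_000, chars_per_token=4,
                    reserve=4_000):
    """Rough estimate of whether a corpus fits in a single prompt.

    chars_per_token ~= 4 is a crude heuristic for English text;
    `reserve` leaves headroom for the question and the model's answer.
    """
    est_tokens = sum(len(d) for d in docs) // chars_per_token
    return est_tokens <= context_limit - reserve

articles = ["x" * 6_000] * 300      # ~300 medium-length articles
small_window = fits_in_context(articles)                          # ~450k tokens
large_window = fits_in_context(articles, context_limit=1_000_000)
```

Even when the corpus technically fits, cost and latency per request may still favor retrieval, so a check like this is a starting point rather than the whole decision.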
Knowledge Graphs and Graph Databases
- Building a knowledge graph is more challenging than using a graph database
- People often mistakenly conflate the concept (a knowledge graph) with a specific technology (a graph database)
- Graph exploration can be done through various methods, not just graph databases
- Graph retrieval might be better than vector retrieval in some cases
- LLMs now make generating entity triplets easier, creating potential for improved knowledge graph development
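If an LLM is prompted to emit `(subject, relation, object)` lines, turning them into graph triples is a small parsing step. The output format and example lines below are assumptions for illustration, not the interface of any particular model:

```python
import re

# One "(subject, relation, object)" tuple per match:
TRIPLE_RE = re.compile(r"\(([^,]+),\s*([^,]+),\s*([^)]+)\)")

def parse_triplets(llm_output):
    """Parse '(subject, relation, object)' lines into tuples suitable
    for loading into a knowledge graph."""
    return [tuple(part.strip() for part in m.groups())
            for m in TRIPLE_RE.finditer(llm_output)]

# Example of what an extraction prompt might return:
raw = """(Paris, capital_of, France)
(France, located_in, Europe)"""
triples = parse_triplets(raw)
```

In practice the hard part is upstream (entity normalization and deduplication across documents), but the LLM removes most of the manual triplet-authoring effort.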
Future Directions and Innovation
- Growing interest in domain-specific embedding models (legal, finance, health)
- Desire for visual language models as embedding model backbones
- Potential to use screenshot-based embeddings to avoid complex processing
- Embedding model business challenges:
- Notable companies/developments:
Personal Networking
- Speaker recommends connecting on X (formerly Twitter)
- Highlights X as a high-quality AI community platform
- Mentions also growing a presence on LinkedIn and YouTube