Overview
- The vector database category is evolving rapidly, with many traditional databases now incorporating vector search capabilities, suggesting a market consolidation similar to what happened with NoSQL databases after MongoDB's emergence.
- Effective search and RAG systems require hybrid approaches rather than relying solely on embeddings—combining keyword matching (BM25), semantic search, and re-ranking layers produces superior results for most applications.
- When building search infrastructure, developers should start simple (clean data, use BM25) and progressively add complexity, while maintaining modular, separately scalable components rather than tightly coupling ML models with databases.
- Despite expanded context windows in LLMs, RAG remains essential for most applications, though implementation approaches vary based on scale—small datasets might fit in a single database while large-scale systems benefit from specialized vector search solutions.
- The future of embeddings points toward domain-specific models (legal, finance, health) and multimodal capabilities, with growing interest in visual language models as embedding backbones.
Content
Background and Context
- Jo Kristian Bergum is from Trondheim, Norway, with 20 years of experience in search infrastructure
- His background includes work at Yahoo and Fast Search & Transfer
- He was an early investor in and contributor to Chroma, writing their example in the OpenAI Cookbook
Vector Databases and Market Evolution
- Post-ChatGPT (November 2022), developers saw vector embeddings as the primary way to do search and RAG
- Pinecone was an early pioneer in positioning vector databases as a new infrastructure category
- The vector database "category" is now declining, though not necessarily the companies themselves
- Significant fundraising has occurred in vector databases (around $230 million)
- The market is likely unsustainable with too many players
- This pattern resembles MongoDB's NoSQL database emergence and subsequent market convergence
Search Technology Integration
- Vector search capabilities are now integrated into many database technologies
- Traditional databases like Postgres (pgvector), Elasticsearch, and Solr now offer vector search
- Bergum argues these technologies should be viewed as "search engines" rather than as a separate database category
- Companies like Turbopuffer are emerging with developer-focused approaches
- Search can involve multiple approaches: grep, semantic search, keyword search, web search
- Embeddings are important but not the only solution for search
- Embeddings went mainstream after OpenAI's APIs, previously limited to big tech companies
Embedding Insights and Applications
- Embeddings are valuable for representing multimodal data, not just for similarity search
- Effective search requires additional signals like freshness and authority
- Hybrid query approaches are recommended for most applications
- For smaller scale operations, using a single database system might be practical
- Large-scale applications may require specialized vector search solutions
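One common way to implement a hybrid query is to run keyword and vector retrieval separately and merge the ranked result lists. Below is a minimal sketch using reciprocal rank fusion (RRF), one popular fusion method; the choice of RRF, the document IDs, and the result lists are illustrative assumptions, not something the talk prescribes:

```python
def rrf_fuse(ranked_lists, k=60):
    """Reciprocal Rank Fusion: each document's fused score is the sum
    over result lists of 1 / (k + rank), with rank 1-based."""
    scores = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical ranked results from a BM25 index and a vector index:
bm25_hits = ["d1", "d2", "d3"]
vector_hits = ["d2", "d3", "d4"]
fused = rrf_fuse([bm25_hits, vector_hits])
# d2 ends up first: it appears near the top of both lists.
```

RRF needs only ranks, not comparable scores, which is why it is a convenient first fusion step when BM25 scores and cosine similarities live on different scales.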
Recommender Systems and Search Convergence
- Embedding-based retrieval is common in large-scale recommender systems like TikTok
- Typical large-scale systems use a cascade approach: a cheap retrieval stage narrows millions of candidates to a shortlist, which progressively more expensive ranking stages then reorder
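A retrieval cascade can be sketched as a pipeline of stage functions, each cheaper per document than the next and each narrowing the candidate set. The stage functions and toy corpus below are hypothetical placeholders (in a real system they might be an ANN index, a lexical scorer, and a cross-encoder):

```python
def cascade_rank(query, corpus, retrieve, cheap_score, costly_score,
                 n_retrieve=1000, n_rerank=100, n_final=10):
    """Retrieval cascade: retrieve broadly, rank cheaply, then re-rank
    a smaller set with a more expensive scorer."""
    candidates = retrieve(query, corpus)[:n_retrieve]
    candidates = sorted(candidates, key=lambda d: cheap_score(query, d),
                        reverse=True)[:n_rerank]
    return sorted(candidates, key=lambda d: costly_score(query, d),
                  reverse=True)[:n_final]

# Toy corpus and placeholder stage functions:
corpus = ["cats purr", "dogs bark", "cats and dogs", "fish swim"]
retrieve = lambda q, docs: [d for d in docs if set(q.split()) & set(d.split())]
overlap = lambda q, d: len(set(q.split()) & set(d.split()))
dense = lambda q, d: overlap(q, d) / len(d.split())  # length-normalized
top = cascade_rank("cats dogs", corpus, retrieve, overlap, dense, n_final=2)
```

The point of the shape is economics: the per-document cost rises at each stage while the candidate count falls, so the expensive model only ever sees a short list.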
Recommended Approach for Building RAG Applications
- Start by cleaning and preparing data
- Use BM25 (keyword matching) as a strong baseline algorithm
- Progressively add complexity: layer in semantic search, hybrid retrieval, and re-ranking as the application warrants
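As a concrete baseline, Okapi BM25 fits in a few lines of plain Python. This is a minimal sketch over pre-tokenized documents; a real system would normally rely on a search engine's built-in implementation rather than this:

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of each pre-tokenized document against the query."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N    # average document length
    df = Counter()                           # document frequency per term
    for d in docs:
        df.update(set(d))
    scores = []
    for d in docs:
        tf = Counter(d)                      # term frequency in this doc
        score = 0.0
        for t in query_terms:
            if tf[t] == 0:
                continue
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            score += idf * tf[t] * (k1 + 1) / (
                tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(score)
    return scores

docs = [["vector", "search"], ["keyword", "search", "baseline"],
        ["cooking", "recipes"]]
scores = bm25_scores(["keyword", "search"], docs)
best = scores.index(max(scores))  # doc 1 matches both query terms
```

Because BM25 is cheap, deterministic, and needs no model, it makes a strong baseline to measure any embedding-based addition against.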
Technology Stack Considerations
- Potential progression: start with a single general-purpose database and move to specialized search infrastructure as scale demands
- Key recommendations:
- Skeptical of tightly coupling machine learning models with databases
- Prefers keeping infrastructure components modular and separately scalable
Clarifications on RAG and Vector Databases
- RAG is NOT dead; augmenting AI with retrieval/search will remain relevant for a long time
- Some people are incorrectly claiming long context models eliminate the need for RAG
- Context windows have expanded dramatically (from 4K to 10 million tokens)
- Not all use cases require vector databases
- Some scenarios (like 300 articles) can now fit within a single model's context window
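A rough way to decide between stuffing the whole corpus into the prompt and building retrieval is a token estimate. The 4-characters-per-token ratio, the context limits, and the article sizes below are crude illustrative assumptions, not properties of any particular model:

```python
def fits_in_context(docs, context_limit=128_000, chars_per_token=4,
                    reserve=4_000):
    """Rough estimate of whether a corpus fits in a single prompt.

    chars_per_token ~= 4 is a crude heuristic for English text;
    `reserve` leaves headroom for the question and the model's answer.
    """
    est_tokens = sum(len(d) for d in docs) // chars_per_token
    return est_tokens <= context_limit - reserve

articles = ["x" * 6_000] * 300      # ~300 medium-length articles
small_window = fits_in_context(articles)                          # ~450k tokens
large_window = fits_in_context(articles, context_limit=1_000_000)
```

Even when the corpus technically fits, cost and latency per request may still favor retrieval, so a check like this is a starting point rather than the whole decision.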
Knowledge Graphs and Graph Databases
- Building a knowledge graph is more challenging than using a graph database
- People often mistakenly conflate the concept (a knowledge graph) with a specific technology (a graph database)
- Graph exploration can be done through various methods, not just graph databases
- Graph retrieval might be better than vector retrieval in some cases
- LLMs now make generating entity triplets easier, creating potential for improved knowledge graph development
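If an LLM is prompted to emit `(subject, relation, object)` lines, turning them into graph triples is a small parsing step. The output format and example lines below are assumptions for illustration, not the interface of any particular model:

```python
import re

# One "(subject, relation, object)" tuple per match:
TRIPLE_RE = re.compile(r"\(([^,]+),\s*([^,]+),\s*([^)]+)\)")

def parse_triplets(llm_output):
    """Parse '(subject, relation, object)' lines into tuples suitable
    for loading into a knowledge graph."""
    return [tuple(part.strip() for part in m.groups())
            for m in TRIPLE_RE.finditer(llm_output)]

# Example of what an extraction prompt might return:
raw = """(Paris, capital_of, France)
(France, located_in, Europe)"""
triples = parse_triplets(raw)
```

In practice the hard part is upstream (entity normalization and deduplication across documents), but the LLM removes most of the manual triplet-authoring effort.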
Future Directions and Innovation
- Growing interest in domain-specific embedding models (legal, finance, health)
- Desire for visual language models as embedding model backbones
- Potential to use screenshot-based embeddings to avoid complex processing
- Embedding model business challenges:
- Notable companies/developments:
Personal Networking
- Speaker recommends connecting on X (formerly Twitter)
- Highlights X as a high-quality AI community platform
- Mentions also growing a presence on LinkedIn and YouTube