Vector Search vs Keyword Search: Why You Need Both
The differences between semantic vector search and traditional keyword search, why neither is enough alone, and how hybrid search with RRF delivers better results.
If you're building search for an AI system, you'll face this question early: vector search or keyword search? The tutorials show you one or the other. In production, you need both.
Here's why, with real examples of what each misses.
Keyword search: fast, exact, and blind to meaning
Traditional keyword search (BM25, full-text search, ts_rank_cd in PostgreSQL) finds documents that contain the words you searched for. It's fast, well-understood, and deployed everywhere.
What it's good at:
- Exact terms: product names, policy numbers, acronyms
- "Find documents containing HIPAA compliance"
- "Search for order #12345"
- Rare words that uniquely identify a document
What it misses:
- "What's the vacation policy?" won't match a document titled "PTO Guidelines and Time-Off Procedures" because the words don't overlap
- "How do I get reimbursed?" won't match "Expense Report Submission Process"
- Synonyms, paraphrases, and different phrasings of the same concept
Keyword search is literal. It matches characters, not meaning.
Vector search: understands meaning, but loses specifics
Vector search (semantic search) converts text into numerical vectors that capture meaning. Similar concepts produce similar vectors, regardless of wording.
What it's good at:
- "vacation policy" matches "PTO guidelines" because they mean the same thing
- "How do I get reimbursed?" matches "Expense Report Submission Process"
- Natural language questions that use different words than the source document
- Multilingual matching (if the embedding model supports it)
What it misses:
- Exact terms: "HIPAA" might match broadly with health-related content instead of the specific compliance document
- Acronyms and proper nouns: "Q3 2025 OKRs" might not match well because the model doesn't understand your company's acronym conventions
- Negation: "not eligible for overtime" and "eligible for overtime" produce very similar vectors, because embeddings capture the topic (overtime eligibility) far more strongly than the polarity (eligible vs. not)
- Short, precise queries: "Form W-9" might retrieve tax documents generally instead of the specific form
Vector search understands concepts but loses precision on specifics.
What happens when you use only one
Vector search only
A user asks: "What is policy HR-2024-007?"
Vector search returns documents about HR policies in general, ranked by semantic similarity to "HR policy." The specific policy number HR-2024-007 is just noise to the embedding model. The right document might be ranked 15th because another document about HR policies scored higher semantically.
Keyword search only
A user asks: "How do I request time off when I'm feeling burned out?"
Keyword search looks for "request", "time", "off", "feeling", "burned", "out." It might match a document about fire safety ("burned") or timeout settings ("time out") instead of the PTO request procedure. The phrase "burned out" has no keyword overlap with "mental health days" or "wellness leave."
Hybrid search: use both, combine with RRF
The answer is to run both searches in parallel and combine the results. The question is how.
The naive approach (score mixing):
combined = 0.7 * vector_score + 0.3 * keyword_score
This requires tuning the weights (0.7/0.3), and it breaks when the score distributions differ. Cosine similarity typically falls in 0-1; keyword scores (BM25, ts_rank_cd) are unbounded and corpus-dependent. Add them and whichever scale happens to be larger dominates, distorting both signals.
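A quick sketch of the scale problem. The scores below are hypothetical, but the shapes are realistic: cosine similarity lives in roughly 0-1, while a BM25-style keyword score can easily reach 8 or more.

```python
def mixed_score(vec_score, kw_score, w_vec=0.7, w_kw=0.3):
    """Naive weighted sum of two scores that live on different scales."""
    return w_vec * vec_score + w_kw * kw_score

# Doc A: strong semantic match, weak keyword match.
doc_a = mixed_score(vec_score=0.92, kw_score=0.4)   # 0.764
# Doc B: mediocre semantic match, but an unbounded keyword score
# swamps the vector signal entirely, regardless of the 0.7/0.3 split.
doc_b = mixed_score(vec_score=0.35, kw_score=8.2)   # 2.705

print(doc_a, doc_b)  # doc_b "wins" on keyword scale alone
```

Normalizing both scores first helps, but that just introduces another knob to tune per corpus.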
The better approach: Reciprocal Rank Fusion (RRF)
RRF combines ranked lists based on position, not score:
rrf_score(doc) = 1/(k + rank_vector) + 1/(k + rank_keyword)
Where k = 60, the constant recommended in the original RRF paper (Cormack, Clarke, and Buettcher, 2009).
A document ranked #1 in vector search contributes 1/(60+1) = 0.0164. If it's also ranked #3 in keyword search, it gets an additional 1/(60+3) = 0.0159, for a total of 0.0323. A document appearing in only one list gets a lower combined score.
Why RRF works:
- No weights to tune
- No score normalization needed
- Works across any two (or more) ranking signals
- Documents appearing in both lists are naturally promoted
- Computation time: less than 1 millisecond
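The whole algorithm fits in a few lines. A minimal sketch in Python (the doc ids and the helper name rrf_fuse are illustrative, not from any particular library):

```python
def rrf_fuse(ranked_lists, k=60):
    """Fuse ranked lists of doc ids with Reciprocal Rank Fusion.

    Each input list is ordered best-first; ranks are 1-based.
    Returns doc ids sorted by descending RRF score.
    """
    scores = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["doc_b", "doc_a", "doc_c"]
keyword_hits = ["doc_a", "doc_d", "doc_b"]
fused = rrf_fuse([vector_hits, keyword_hits])
# → ["doc_a", "doc_b", "doc_d", "doc_c"]
```

doc_a (ranked 2nd and 1st, score ≈ 0.0325) edges out doc_b (1st and 3rd, ≈ 0.0325 vs ≈ 0.0323), and both beat the single-list documents. Note the function takes any number of ranked lists, so a third signal (e.g. recency) slots in without changing the code.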
Real example
A user asks: "What is the HIPAA compliance requirement for visitor data?"
Vector search results (ranked by semantic similarity):
- "Data Privacy and Visitor Information Handling" (0.42)
- "General Compliance Overview" (0.39)
- "HIPAA Compliance Procedures" (0.37)
- "Employee Data Protection Policy" (0.35)
Keyword search results (ranked by term frequency):
- "HIPAA Compliance Procedures" (matches "HIPAA", "compliance")
- "Visitor Registration Requirements" (matches "visitor")
- "Data Privacy and Visitor Information Handling" (matches "visitor", "data")
After RRF fusion (k = 60):
- "HIPAA Compliance Procedures" (both lists: 1/(60+3) + 1/(60+1) ≈ 0.0323)
- "Data Privacy and Visitor Information Handling" (both lists: 1/(60+1) + 1/(60+3) ≈ 0.0323)
- "General Compliance Overview" (vector only: 1/(60+2) ≈ 0.0161)
- "Visitor Registration Requirements" (keyword only: 1/(60+2) ≈ 0.0161)
Note that the two documents appearing in both lists tie exactly (ranks 1 and 3, in either order) and outrank every single-list document. That is the property that matters: vector search alone ranked "HIPAA Compliance Procedures" 3rd, and keyword search alone missed the visitor-data connection, but any document confirmed by both signals rises to the top. Ties like this get broken downstream, typically by the cross-encoder reranker.
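Running the numbers for this example (the dict keys are shorthand for the titles above; absent means the document didn't appear in that list):

```python
K = 60  # standard RRF constant

# 1-based ranks from each retriever, best-first.
vector_rank = {"data_privacy": 1, "general_compliance": 2,
               "hipaa_procedures": 3, "employee_data": 4}
keyword_rank = {"hipaa_procedures": 1, "visitor_registration": 2,
                "data_privacy": 3}

def rrf(doc):
    """Sum 1/(K + rank) over every list the document appears in."""
    return sum(1.0 / (K + ranks[doc])
               for ranks in (vector_rank, keyword_rank)
               if doc in ranks)

docs = set(vector_rank) | set(keyword_rank)
for doc in sorted(docs, key=rrf, reverse=True):
    print(f"{doc}: {rrf(doc):.4f}")
```

The two dual-list documents land at ≈ 0.0323 each, roughly double the best single-list score (≈ 0.0161), so appearing in both lists is worth more than a top rank in one.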
Implementation
In PostgreSQL with pgvector, you can run both searches in parallel:
Vector search:
SELECT id, content, embedding <=> $query_vector AS distance
FROM rag_chunk
WHERE project_id = $project_id
ORDER BY embedding <=> $query_vector
LIMIT 12;
Keyword search:
SELECT id, content, ts_rank_cd(to_tsvector('english', content), query) AS rank
FROM rag_chunk, plainto_tsquery('english', $query_text) query
WHERE project_id = $project_id
AND to_tsvector('english', content) @@ query
ORDER BY rank DESC
LIMIT 12;
Fuse the results with RRF in application code, then pass the top candidates to a cross-encoder reranker for final scoring.
The full pipeline
In production, hybrid search is just the first stage:
- Vector search (12 candidates by semantic similarity)
- Keyword search (12 candidates by term matching, parallel)
- RRF fusion (merge into one ranked list)
- Cross-encoder reranking (neural model re-scores top candidates)
- Neighbor expansion (grab surrounding chunks for context)
- LLM generation (answer from the selected passages)
Each stage adds precision. Hybrid search gets the right documents in the candidate set. Reranking sorts them by actual relevance to the question. Neighbor expansion gives the LLM enough context to answer completely.
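The stages above can be sketched as a skeleton. Every callable here is a stub standing in for a real component (database query, cross-encoder, LLM call); only the fusion logic is concrete:

```python
def hybrid_pipeline(query, vector_search, keyword_search,
                    rerank, expand_neighbors, generate, k=60):
    """Skeleton of the hybrid retrieval pipeline described above."""
    vec_hits = vector_search(query)    # candidates by semantic similarity
    kw_hits = keyword_search(query)    # candidates by term matching

    # RRF fusion: score by rank position, not by raw retriever score.
    scores = {}
    for ranking in (vec_hits, kw_hits):
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    fused = sorted(scores, key=scores.get, reverse=True)

    top = rerank(query, fused)          # cross-encoder re-scoring
    context = expand_neighbors(top)     # surrounding chunks for context
    return generate(query, context)     # LLM answers from the passages
```

Wiring in stubs shows the flow end to end: a document found by both retrievers ("b" below) is promoted ahead of either retriever's own #1.

```python
answer = hybrid_pipeline(
    "q",
    vector_search=lambda q: ["a", "b"],
    keyword_search=lambda q: ["b", "c"],
    rerank=lambda q, docs: docs[:2],
    expand_neighbors=lambda docs: docs,
    generate=lambda q, ctx: ctx,
)
# → ["b", "a"]
```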
Summary
| | Vector search | Keyword search | Hybrid (RRF) |
|---|---|---|---|
| Semantic understanding | Yes | No | Yes |
| Exact term matching | No | Yes | Yes |
| Acronyms/proper nouns | Weak | Strong | Strong |
| Paraphrased questions | Strong | Weak | Strong |
| Weight tuning needed | N/A | N/A | No |
| Fusion overhead | N/A | N/A | <1ms |
Neither search is enough alone. Use both, fuse with RRF, and let the cross-encoder sort out the final ranking.
See it in action
Ask360 uses hybrid search with RRF fusion, cross-encoder reranking, and multi-signal confidence scoring. Try the live demos or build your own with the free tier.