Overview
TokenSmith is a retrieval-augmented QA pipeline for technical textbooks, designed to reduce “semantic miss” failures by combining sparse keyword matching with dense embedding retrieval, then re-ranking candidates with an ensemble ranker.
Key Achievements
- Hybrid retrieval (dense + sparse) to improve coverage on exact terminology and equation-heavy text.
- EnsembleRanker supports Reciprocal Rank Fusion (RRF) and weighted score fusion for configurable ranking behavior.
- Stable evaluation harness with structured logging to debug retrieval failures and measure end-to-end answer quality.
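The two fusion modes named above can be illustrated with a minimal sketch. This is not TokenSmith's actual EnsembleRanker code: the function names, the RRF constant k=60 (a commonly used default), and the min-max normalization step are all assumptions made for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of chunk ids (best first) with RRF.

    Each retriever contributes 1 / (k + rank) for every chunk it returns.
    k=60 is an assumed default, not a value taken from TokenSmith.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Sort by fused score (descending), breaking ties by chunk id.
    return sorted(scores, key=lambda c: (-scores[c], c))


def min_max(scores):
    """Rescale one retriever's raw scores to [0, 1] so they are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {c: 0.0 for c in scores}
    return {c: (s - lo) / (hi - lo) for c, s in scores.items()}


def weighted_score_fusion(score_maps, weights):
    """Fuse per-retriever {chunk_id: score} maps with a weighted sum.

    Scores are min-max normalized first (an assumption), because BM25 and
    cosine similarity live on different scales.
    """
    fused = defaultdict(float)
    for scores, weight in zip(score_maps, weights):
        for chunk_id, s in min_max(scores).items():
            fused[chunk_id] += weight * s
    return sorted(fused, key=lambda c: (-fused[c], c))


# Hypothetical retriever outputs for a single query.
dense_ranking = ["c3", "c1", "c2"]
sparse_ranking = ["c1", "c4", "c3"]
rrf_order = reciprocal_rank_fusion([dense_ranking, sparse_ranking])

dense_scores = {"c3": 0.9, "c1": 0.8, "c2": 0.1}
sparse_scores = {"c1": 12.0, "c4": 7.0, "c3": 2.0}
weighted_order = weighted_score_fusion([dense_scores, sparse_scores], [0.5, 0.5])
```

RRF uses only rank positions, so it needs no score calibration across retrievers. Weighted fusion keeps score magnitudes but depends on how the scores are normalized and on the chosen weights.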
Technical Implementation
- Retrieval: FAISS for dense similarity + BM25 for keyword recall, merged via fusion strategies.
- Ranking: Config-driven EnsembleRanker stage to combine multiple retrievers and optionally add lightweight feature-based scoring.
- Inference: Pluggable LLM backend (local via llama.cpp, or hosted), with document chunking and provenance attached to retrieved passages.
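As a rough sketch of what chunking with provenance can look like, the example below splits a document into overlapping character windows and records where each chunk came from. The `Chunk` fields, the `chunk_text` helper, and the size and overlap values are hypothetical; the source does not specify TokenSmith's chunking strategy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    text: str    # chunk contents passed to the retriever and the LLM
    source: str  # provenance: identifier of the originating document
    start: int   # provenance: character offset where the chunk begins
    end: int     # provenance: character offset where the chunk ends (exclusive)


def chunk_text(text, source, size=200, overlap=50):
    """Split text into fixed-size character windows that overlap.

    size and overlap are illustrative defaults, not TokenSmith settings.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    chunks = []
    step = size - overlap
    for start in range(0, len(text), step):
        end = min(start + size, len(text))
        chunks.append(Chunk(text[start:end], source, start, end))
        if end == len(text):
            break  # the last window already reaches the end of the text
    return chunks


# Hypothetical usage: a 450-character document yields three chunks.
sample = "".join(str(i % 10) for i in range(450))
sample_chunks = chunk_text(sample, source="textbook.pdf")
```

Keeping the source and offsets on every chunk means an answer can cite the exact passage it was grounded in, and a retrieval failure can be traced back to a specific span of the original text.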
Impact
This RAG pipeline lets readers query large collections of technical textbooks in natural language, making educational content easier to search and access.