Overview

TokenSmith is a retrieval-augmented QA pipeline for technical textbooks, designed to reduce “semantic miss” failures by combining sparse keyword matching with dense embedding retrieval, then re-ranking candidates with an ensemble ranker.

Key Achievements

  • Hybrid retrieval (dense + sparse) to improve coverage on exact terminology and equation-heavy text.
  • EnsembleRanker supports Reciprocal Rank Fusion (RRF) and weighted score fusion for configurable ranking behavior.
  • Stable evaluation harness with structured logging to debug retrieval failures and measure end-to-end answer quality.
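
The two fusion modes named above can be illustrated with a minimal sketch. This is not TokenSmith's actual EnsembleRanker code: the function names `rrf_fuse` and `weighted_fuse`, the constant `k=60`, and the min-max normalization step are assumptions chosen for illustration.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion: score(d) = sum over rankers of 1 / (k + rank).

    `rankings` is a list of ranked ID lists, best first. `k` is a smoothing
    constant; 60 is a commonly used default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score, breaking ties by ID so the output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


def _min_max(scores):
    """Rescale raw scores to [0, 1] so different retrievers are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {d: 1.0 for d in scores}
    return {d: (s - lo) / (hi - lo) for d, s in scores.items()}


def weighted_fuse(score_maps, weights):
    """Weighted score fusion over per-retriever {doc_id: raw_score} maps."""
    fused = defaultdict(float)
    for scores, weight in zip(score_maps, weights):
        if not scores:
            continue
        for doc_id, s in _min_max(scores).items():
            fused[doc_id] += weight * s
    return sorted(fused, key=lambda d: (-fused[d], d))


# Hypothetical chunk IDs from a dense and a sparse retriever.
dense_ranking = ["c1", "c2", "c3"]
sparse_ranking = ["c3", "c1", "c4"]
rrf_order = rrf_fuse([dense_ranking, sparse_ranking])

dense_scores = {"c1": 0.9, "c2": 0.5, "c3": 0.1}   # e.g. cosine similarity
sparse_scores = {"c3": 12.0, "c1": 6.0, "c4": 0.0}  # e.g. BM25 scores
sparse_heavy_order = weighted_fuse([dense_scores, sparse_scores], [0.2, 0.8])
```

RRF uses only rank positions, so it needs no score calibration between retrievers. Weighted fusion keeps score magnitudes but depends on a normalization choice and on tuned weights.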

Technical Implementation

  • Retrieval: FAISS for dense similarity search plus BM25 for keyword recall, with the two result lists merged by RRF or weighted score fusion.
  • Ranking: Config-driven EnsembleRanker stage to combine multiple retrievers and optionally add lightweight feature-based scoring.
  • Inference: Pluggable LLM backend (local via llama.cpp, or hosted), with document chunking and provenance attached to retrieved passages.
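
One way to attach provenance to chunks is to carry the source identifier and character offsets alongside each passage. The sketch below assumes fixed-size character windows with overlap; the `Chunk` fields, the `chunk_text` name, and the default sizes are hypothetical, and TokenSmith's own chunker may split on different boundaries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    text: str
    source: str  # hypothetical provenance field: document identifier
    start: int   # character offset of the chunk start in the source text
    end: int     # character offset one past the chunk end


def chunk_text(text, source, size=200, overlap=50):
    """Split `text` into overlapping character windows with provenance."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    chunks = []
    step = size - overlap
    for start in range(0, len(text), step):
        end = min(start + size, len(text))
        chunks.append(Chunk(text[start:end], source, start, end))
        if end == len(text):
            break  # the last window already reaches the end of the text
    return chunks


# Tiny example: 10 characters, windows of 4 with 1 character of overlap.
sample = "abcdefghij"
chunks = chunk_text(sample, source="example.txt", size=4, overlap=1)
```

Because each chunk records `source`, `start`, and `end`, a generated answer can cite the exact span it was grounded in, and a retrieval failure can be traced back to the passage boundaries.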

Impact

TokenSmith lets readers query large collections of technical textbooks in natural language, making educational content easier to search and access.

View on GitHub →