Pinecone

Tech Stack for a RAG App That Searches Historical Documents with AI

Introduction

I built a RAG (Retrieval-Augmented Generation) application that lets users ask natural-language questions about a research project’s published reports (10 volumes) and receive answers with citations to the relevant source materials. This article covers the tech stack and key design decisions behind the app.

Architecture Overview

User
  ↓ Question
Next.js (App Router)
  ↓ API Route
Query Rewrite (LLM)
  ↓ Refined search query
Embedding Generation (text-embedding-3-small)
  ↓ Vector
Pinecone (Vector Search, topK=8)
  ↓ Relevant chunks
LLM (Claude Sonnet) ← System prompt + Context
  ↓ SSE Streaming
Display answer to user

Frontend

Next.js 16 + React 19 + TypeScript

The app uses the App Router with a simple 3-page structure. ...
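The flow in the architecture overview can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: the `Chunk` and `RagDeps` types, the helper names (`buildContext`, `toSseEvent`, `answerQuestion`), and the system prompt wording are all assumptions made for this example. The external services (query-rewrite LLM, embedding model, Pinecone, answer LLM) are passed in as plain functions so the sketch does not depend on any particular SDK signature. Only `topK = 8` and the order of the steps come from the article.

```typescript
// Hypothetical shape of a retrieved chunk; field names are illustrative.
interface Chunk {
  text: string;
  source: string; // e.g. a volume and page label used for citations
  score: number;
}

// External services injected as functions (all names are assumptions).
interface RagDeps {
  rewriteQuery: (question: string) => Promise<string>;
  embed: (text: string) => Promise<number[]>;
  search: (vector: number[], topK: number) => Promise<Chunk[]>;
  generate: (systemPrompt: string, question: string) => AsyncIterable<string>;
}

const TOP_K = 8; // value stated in the architecture overview

// Number each chunk so the model can cite it as [1], [2], ...
function buildContext(chunks: Chunk[]): string {
  return chunks
    .map((c, i) => `[${i + 1}] (${c.source})\n${c.text}`)
    .join("\n\n");
}

// Wrap one streamed token as a Server-Sent Events message.
// JSON-encoding keeps newlines inside a token from breaking the SSE framing.
function toSseEvent(token: string): string {
  return `data: ${JSON.stringify(token)}\n\n`;
}

// Rewrite -> embed -> vector search -> generate, yielding SSE-framed tokens.
async function* answerQuestion(
  question: string,
  deps: RagDeps
): AsyncGenerator<string> {
  const query = await deps.rewriteQuery(question);
  const vector = await deps.embed(query);
  const chunks = await deps.search(vector, TOP_K);
  const systemPrompt =
    "Answer using only the numbered context below and cite sources as [n].\n\n" +
    buildContext(chunks);
  for await (const token of deps.generate(systemPrompt, question)) {
    yield toSseEvent(token);
  }
}
```

In a Next.js API route, the strings yielded by `answerQuestion` could be written to a streaming response with the `text/event-stream` content type. Injecting the services also makes it straightforward to test the pipeline with stubs instead of live API calls.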

March 7, 2026 · 6 min · Nakamura