Building an Automated DH Tool Awareness System with Playwright, RSS, and AI

Why track DH tools

In the Digital Humanities (DH) field, new tools are continuously developed and released. OCR engines for historical documents, IIIF viewers, text transcription platforms, and kuzushiji (classical Japanese cursive) recognition systems are just a few examples.

In Japan, several organizations actively develop and publish such tools:

- NDL (National Diet Library of Japan) develops OCR tools for digitized materials.
- CODH (Center for Open Data in the Humanities, ROIS-DS) maintains kuzushiji recognition models and the IIIF Curation Platform.
- National Museum of Japanese History develops Minna de Honkoku (a crowdsourced transcription platform) and related IIIF tools.

Keeping up with these releases manually is time-consuming. The goal was to build a system that systematically collects new DH tool releases and generates weekly summary articles, similar to a “current awareness” service. ...

Tech Stack for a RAG App That Searches Historical Documents with AI

Introduction

I built a RAG (Retrieval-Augmented Generation) application that lets users ask natural language questions about a research project’s published reports (10 volumes) and receive answers with citations to the relevant source materials. This article covers the tech stack and key design decisions behind the app.

Architecture Overview

User
  ↓ Question
Next.js (App Router)
  ↓ API Route
Query Rewrite (LLM)
  ↓ Refined search query
Embedding Generation (text-embedding-3-small)
  ↓ Vector
Pinecone (Vector Search, topK=8)
  ↓ Relevant chunks
LLM (Claude Sonnet) ← System prompt + Context
  ↓ SSE Streaming
Display answer to user

Frontend

Next.js 16 + React 19 + TypeScript

The app uses the App Router with a simple 3-page structure. ...
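
To make the retrieval and streaming steps in the Architecture Overview more concrete, here is a minimal sketch of three of them as pure functions: top-K selection by cosine similarity, context assembly with numbered citations, and Server-Sent Events framing. It is not the app's actual code. The `Chunk` type and the functions `cosine`, `topKChunks`, `buildContext`, and `sseFrame` are hypothetical names, and the in-memory search only stands in for the Pinecone query; the embedding and LLM calls are left out entirely.

```typescript
// Hypothetical types and helpers; none of these names come from the original app.
interface Chunk {
  id: string;
  source: string; // e.g. a report volume, used when citing
  text: string;
  vector: number[]; // embedding of the chunk text
}

// Cosine similarity between two vectors of equal length.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// In-memory stand-in for the vector search step (assumed default topK = 8,
// matching the value shown in the architecture overview).
function topKChunks(query: number[], chunks: Chunk[], topK = 8): Chunk[] {
  return [...chunks]
    .sort((x, y) => cosine(query, y.vector) - cosine(query, x.vector))
    .slice(0, topK);
}

// Number each chunk so the model can cite sources as [1], [2], ...
function buildContext(chunks: Chunk[]): string {
  return chunks.map((c, i) => `[${i + 1}] (${c.source}) ${c.text}`).join("\n");
}

// Format one Server-Sent Events frame: a "data:" line followed by a blank line.
function sseFrame(payload: unknown): string {
  return `data: ${JSON.stringify(payload)}\n\n`;
}
```

In a real route handler, the string from `buildContext` would be placed alongside the system prompt, and each streamed token from the model would be wrapped with `sseFrame` before being written to the response.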