Probing the Public APIs of the Tohoku University Digital Archives — Exporting per-setSpec Excel via OAI-PMH

Tue, 05 May 2026 08:00:00 +0900

This article is co-authored with generative AI. While I have cross-checked facts against official documentation where possible, errors may remain. Please verify primary sources before making important decisions.

While poking around the Derge Tibetan Tripitaka database hosted on the Tohoku University Digital Archives (touda.tohoku.ac.jp/collection), I wondered whether any endpoint returned JSON, and ended up checking the available public APIs one by one. OAI-PMH turned out to be the workable route, so this post records the procedure for harvesting records one setSpec at a time and exporting each set to its own Excel file. The whole approach avoids HTML scraping.
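As a rough orientation, the per-setSpec harvest can be sketched as below. This is a minimal sketch under stated assumptions, not the exact procedure: `BASE_URL` is a hypothetical placeholder (substitute the archive's real OAI-PMH endpoint), the function names are made up for illustration, and only the identifier and `dc:title` are extracted from `oai_dc` records. The `ListRecords` verb, the `set` argument, and `resumptionToken` paging come from the OAI-PMH 2.0 specification.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Hypothetical placeholder: replace with the archive's actual OAI-PMH endpoint.
BASE_URL = "https://example.org/oai"

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def parse_list_records(xml_text):
    """Return (rows, resumption_token) for one ListRecords response."""
    root = ET.fromstring(xml_text)
    rows = []
    for record in root.iterfind(".//oai:record", NS):
        header = record.find("oai:header", NS)
        # Deleted records carry status="deleted" and have no metadata.
        if header is None or header.get("status") == "deleted":
            continue
        identifier = header.findtext("oai:identifier", default="", namespaces=NS)
        titles = [t.text or "" for t in record.iterfind(".//dc:title", NS)]
        rows.append((identifier, " | ".join(titles)))
    token = root.findtext(".//oai:resumptionToken", default="", namespaces=NS)
    return rows, (token or "").strip()


def fetch(params):
    """Issue one GET request against BASE_URL and return the body as text."""
    url = BASE_URL + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8")


def harvest_set(set_spec, fetch_fn=fetch):
    """Collect all rows of one set, following resumptionToken pages."""
    params = {"verb": "ListRecords", "metadataPrefix": "oai_dc", "set": set_spec}
    rows = []
    while True:
        page, token = parse_list_records(fetch_fn(params))
        rows.extend(page)
        if not token:
            return rows
        # resumptionToken is an exclusive argument: send it with the verb only.
        params = {"verb": "ListRecords", "resumptionToken": token}


def save_xlsx(rows, path):
    """Write rows to an Excel file (requires the third-party openpyxl package)."""
    from openpyxl import Workbook

    wb = Workbook()
    ws = wb.active
    ws.append(["identifier", "title"])
    for row in rows:
        ws.append(list(row))
    wb.save(path)
```

With a real endpoint in place, something like `save_xlsx(harvest_set("some_set"), "some_set.xlsx")` would produce one workbook per set; the setSpec values themselves can be listed with the `ListSets` verb. Passing `fetch_fn` as a parameter keeps the paging logic testable without network access.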
