Digital Archive Systems Tech Blog

Articles on Digital Archives, Digital Humanities, and Software Development

Latest Articles

Aligning IIIF 3D Viewer with Presentation API 4 — converting legacy manifests at runtime

I aligned IIIF 3D Viewer with the IIIF 3D Technical Specification Group's Presentation API 4 draft (Scene / PointSelector / WKTSelector / PerspectiveCamera). Existing manifests authored against IIIF Presentation 3 with a project-local 3DSelector / camPos extension are funnelled through a runtime converter so the rest of the pipeline only deals with v4. This post records the diff, the conversion rules, and the implementation outline.

iiif, 3d, presentation-api, web-annotation

Auto-Updating a Statistics Page from TEI/XML Transcription Data via CI/CD — A Case Study from the Kouigenjimonogatari Text DB

How a statistics page that aggregates pages, lines, characters, and waka counts per chapter is generated from TEI/XML transcription data, and how the rebuild and redeployment are automated with GitHub Actions.

tei, xml, dh, github-actions

Probing the Public APIs of the Tohoku University Digital Archives — Exporting per-setSpec Excel via OAI-PMH

I surveyed which public APIs are available on the Tohoku University Digital Archives (touda.tohoku.ac.jp/collection) and worked out a procedure to harvest metadata per setSpec via OAI-PMH and write each set into its own Excel file.

oaipmh, iiif, python, openpyxl

researchmap Achievement Registration: Options for Individual Researchers and a Playwright Implementation

researchmap provides an official write API and CSV/JSON/JSONL import for individual users, each with its own constraints. This article reviews the available options and presents a Playwright implementation that adds full automation and PDF attachment support.

researchmap, playwright, python, automation

A YAML-Driven Next.js Admin Console — Multiple Sites, Multiple Actions

Notes on extending an admin console so that adding sites and actions requires editing one YAML file rather than touching code, with a tabbed UI for multiple actions per site.

nextjs, cloudflare, github-app, yaml

Building an Org-Wide Admin Console with GitHub App + Cloudflare Access

Notes on building an admin console where non-engineers can trigger deploys and data updates for multiple database sites without GitHub or Vercel accounts. Combines GitHub App authentication with Cloudflare Access (Zero Trust), and walks through how PAT, OAuth App, and GitHub App differ.

github, github-app, cloudflare, cloudflare-access

Why Drupal's Automatic Updates Wasn't Running: `Unattended background updates` Is Disabled by Default

I assumed having Drupal's Automatic Updates module installed meant security updates would just land. They weren't. The cron-time policy `Unattended background updates` ships disabled by default, so the module was effectively idle. This post records the diagnosis, the configuration that finally let 10.6.3 → 10.6.7 apply automatically, and the 'not officially supported' warning that surfaces once you turn it on.

drupal, automatic-updates, security, cron

ElevenLabs v2 vs v3 for Japanese Tech Narration — A/B Comparison Using a Voice-Cloned Synthetic Voice

I ran an experiment narrating Japanese tech-blog articles with a voice-cloned synthetic voice trained on my own speech, using ElevenLabs Voice Cloning + the eleven_v3 model. This post records an A/B comparison of v2 and v3 on identical narration material, plus operational notes.

elevenlabs, voice-cloning, podcast, youtube

Mirador 4.0.0 hides supplementing annotations from the Annotations panel — a `filteredMotivations` gotcha

I delivered IIIF Presentation 3 OCR text annotations with `motivation: "supplementing"`, and they showed up in Annona and other viewers but not in Mirador 4.0.0's Annotations side panel. Reverse-engineering the deployed Mirador bundle revealed that the released default for `config.annotations.filteredMotivations` is `['oa:commenting', 'oa:tagging', 'sc:painting', 'commenting', 'tagging']` — `supplementing` isn't in the allowlist. This post walks through how I found that, the `['commenting', 'supplementing']` array workaround, and the relevant spec / Cookbook references.

iiif, mirador, annotation, presentation-api

Building an Access-Controlled IIIF Digital Archive — Cantaloupe + S3 + Elasticsearch + Next.js, Gated by Cloudflare Access

An implementation log for a digital archive that delivers historical photographs which cannot be made fully public, while still preserving the benefits of IIIF (spec-compliant high-resolution viewer, manifest delivery) for an authorized membership. The stack is Cantaloupe (IIIF server) + S3-compatible storage + Elasticsearch (search) + Next.js (UI) + Cloudflare Tunnel + Access. We also lay out where IIIF Auth API 2.0 would fit in for cross-host interoperability.

iiif, cantaloupe, elasticsearch, nextjs

Auto-filling Chouseisan attendance with Playwright, deciding answers via Claude Code's Google Calendar MCP

A small CLI that automates attendance responses on Chouseisan (chouseisan.com) using Playwright. The decision part — whether each candidate slot should be ◯/△/× given the user's Google Calendar — is delegated to Claude Code via the claude.ai Google Calendar MCP. The workflow is split into three independent stages (fetch / fill / submit), and decision rules live in CLAUDE.md.

playwright, claude-code, mcp, google-calendar

Comparing NDL Koten OCR-Lite and Cloud Vision API on a Jiaxing Tripitaka 'Mahāprajñāpāramitā Sūtra' — Observations across 105 Images

We applied two OCR engines — Japan's National Diet Library NDL Koten OCR-Lite and Cloud Vision API DOCUMENT_TEXT_DETECTION — to 105 IIIF images of fascicles 571–575 of the Mahāprajñāpāramitā Sūtra in the Jiaxing Tripitaka held by Yūrenja (formerly the Hōonzō of Zōjōji), and compared the patterns of error in their outputs. NDL produced phantom kana lines on 12 pages; Vision picked up color charts, rulers, and shelf labels as if they were body text on all 105.

ocr, ndl-koten-ocr, google-vision-api, iiif