Home Articles Books Search About
日本語
Creating Project-Specific RNG Files Using Generative AI

Creating Project-Specific RNG Files Using Generative AI

Overview When editing TEI/XML files, changing the RNG file used for validation allows you to limit the tags and attributes available. This offers benefits such as preventing workers from being confused by tag choices and reducing inconsistencies in the created TEI/XML. As a method for editing RNG files, using Roma is common, as introduced in the following article. This is a top-down approach to limiting available tags and attributes, but this time we try creating an RNG file bottom-up from existing TEI/XML using generative AI. ...

Trying Out the Viewer from the "Pre-modern Japan-Asia Relations Digital Archive"

Trying Out the Viewer from the "Pre-modern Japan-Asia Relations Digital Archive"

Overview The “Pre-modern Japan-Asia Relations Digital Archive” was released on July 25, 2025. https://asia-da.lit.kyushu-u.ac.jp/ The viewer is also available at: https://github.com/localmedialabs/tei_comparative_viewer In this article, I share my experience trying out this viewer. As a result, I was able to self-host it as shown below: https://tei-comparative-viewer.aws.ldas.jp/ It loads the following XML file of “Kaitoshokokki” (Record of Countries and Peoples in the Eastern Sea): https://asia-da.lit.kyushu-u.ac.jp/viewer/300 Running Locally Detailed instructions are provided at the following link, which I followed to get it running: ...

Trying Odeuropa-Related Tools

Trying Odeuropa-Related Tools

Overview I had the opportunity to try tools related to Odeuropa, so this is a memo about that experience. What is Odeuropa? An explanation is available on the following page. https://odeuropa.eu/ Below is a machine-translated description. Odeuropa is an innovative EU-funded project that studies Europe’s “olfactory cultural heritage.” Project objectives: To investigate and document the role that smells have played in European culture from 1600 to 1920. Using cutting-edge AI technology, it extracts smell-related information from approximately 43,000 images and 167,000 historical texts (in English, Italian, French, Dutch, German, and Slovenian). ...

Trying DToC: Dynamic Table of Contexts

Trying DToC: Dynamic Table of Contexts

Overview I had an opportunity to try DToC: Dynamic Table of Contexts, so this is a memorandum. https://www.leaf-vre.org/docs/features/dtoc The machine-translated description is as follows: It brings innovation to electronic reading by combining the power of semantic markup with book navigation features. The traditional overview functions of printed books – the table of contents and keyword index – are dynamically integrated with full-text search and tag-based indexing features, creating a new reading experience. ...

Fixing the 'ref' Bug in DHConvalidator

Fixing the 'ref' Bug in DHConvalidator

This article was partially written by AI. Overview DHConvalidator is a tool for converting Digital Humanities (DH) conference abstracts into a consistent TEI (Text Encoding Initiative) text base. https://github.com/ADHO/dhconvalidator When using this tool, the following error occurred during the conversion process from Microsoft Word format (DOCX) to TEI XML format: ERROR: nu.xom.ParsingException: cvc-complex-type.2.4.a: Invalid content was found starting with element 'ref' This article shares the cause and solution for this issue. ...

Building an API Server for Searching the Koui Genji Monogatari Text DB

Building an API Server for Searching the Koui Genji Monogatari Text DB

Overview I built an API server for searching the Koui Genji Monogatari (Collated Tale of Genji) Text DB, so here are my notes. https://genji-api.aws.ldas.jp/ Background The following page publishes the text data of “Koui Genji Monogatari” in a TEI/XML-compliant format. https://kouigenjimonogatari.github.io/ This text data is registered in Elasticsearch to create an API that enables searching by text segments. Usage The usage documentation page using OpenAPI and Swagger is accessible at the following URL: ...

Creating TEI/XML Files from IIIF Manifest Files Using NDL Kotenseki OCR-Lite

Creating TEI/XML Files from IIIF Manifest Files Using NDL Kotenseki OCR-Lite

Overview This article introduces a Gradio app that creates TEI/XML files from IIIF manifest files using NDL Kotenseki OCR-Lite. It can be accessed at the following URL: https://nakamura196-ndlkotenocr-lite-iiif.hf.space/ Background This is a continuation of the following articles: Previously, two separate apps were needed, but with this update, the entire conversion process can be completed within a single Gradio app. Additionally, issues such as difficulty tracking progress when processing manifest files with many image pages, and the inability to copy processing results, have been fixed. ...

Part 2: Creating Annotated IIIF Manifest Files and TEI/XML Files Using NDL Classical Book OCR-Lite

Part 2: Creating Annotated IIIF Manifest Files and TEI/XML Files Using NDL Classical Book OCR-Lite

Overview In the following article, I introduced how to create annotated IIIF manifest files and TEI/XML files using NDL Classical Book OCR-Lite. Since the explanation above was insufficient in many areas, I will re-introduce how to use it. Supplement Along with writing this article, the following improvements were made. Process 1: Creating IIIF Manifest Files Added support for IIIF Presentation API v3. Process 2: Creating TEI/XML Files Added a form that accepts string input, considering the connection with Process 1. Usage Process 1: Creating IIIF Manifest Files Access the following. ...

DTS Viewer Update: Pagination Support

DTS Viewer Update: Pagination Support

Overview This is a note about adding pagination support to the DTS (Distributed Text Services) viewer. https://dts-viewer.vercel.app/ja/ Background When providing a large number of resources through DTS, it appears that the view property is used to present pagination information, as shown below. https://distributed-text-services.github.io/specifications/versions/unstable/#collection-endpoint { "@context": "https://distributed-text-services.github.io/specifications/context/1-alpha1.json", "dtsVersion": "1-alpha", "@id" : "lettres_de_poilus", "@type" : "Collection", "collection": "/api/dts/collection/{?id,page,nav}", "totalParents": 1, "totalChildren": 10000, "title": "Lettres de Poilus", "dublinCore": { "publisher": ["École Nationale des Chartes", "https://viaf.org/viaf/167874585"], "title": [ {"lang": "fr", "value" : "Lettres de Poilus"} ] }, "member": [ "..." ], "view": { "@id": "/api/dts/collection/?id=lettres_de_poilus&page=19", "@type": "Pagination", "first": "/api/dts/collection/?id=lettres_de_poilus&page=1", "previous": "/api/dts/collection/?id=lettres_de_poilus&page=18", "next": "/api/dts/collection/?id=lettres_de_poilus&page=20", "last": "/api/dts/collection/?id=lettres_de_poilus&page=500" } } I updated the DTS Viewer to support the view property described above. ...

Creating Annotated IIIF Manifest Files and TEI/XML Files Using NDL Klasseki OCR-Lite

Creating Annotated IIIF Manifest Files and TEI/XML Files Using NDL Klasseki OCR-Lite

Notice I have created a more accessible article explaining the workflow introduced in this article. Please also refer to the following. Overview I would like to introduce a prototype tool for creating annotated IIIF manifest files and TEI/XML files using NDL Klasseki OCR-Lite. Creating Annotated IIIF Manifest Files First, I created a Gradio app that takes an IIIF manifest file as input and outputs an annotated IIIF manifest file using NDL Klasseki OCR-Lite. It is published using Hugging Face Spaces. ...

Hosting TEI/XML Files on S3-Compatible Object Storage

Hosting TEI/XML Files on S3-Compatible Object Storage

Overview This is a memo about hosting TEI/XML files on S3-compatible object storage. Specifically, we target the mdx I object storage. https://mdx.jp/mdx1/p/about/system Background We are building a web application (Next.js) that loads TEI/XML files and visualizes their content. When the number and size of files were small, they were stored in the public folder, but as these grew larger, we considered hosting them elsewhere. There are many options for storage locations, but this time we target mdx I’s S3-compatible object storage. ...

Updating the DTS (Distributed Text Services) API for the Koui Genji Monogatari Text DB

Updating the DTS (Distributed Text Services) API for the Koui Genji Monogatari Text DB

Overview This is a memo about updating the DTS (Distributed Text Services) API for the Koui Genji Monogatari Text DB. Background The DTS (Distributed Text Services) API is described at the following link. https://distributed-text-services.github.io/specifications/ The following article introduced the creation of the DTS API. However, the following was noted as a remaining issue. Please note that the DTS API developed this time may have areas that do not comply with the above guidelines. ...

Improvements to the Polygon Annotation Support Tool for IIIF Images

Improvements to the Polygon Annotation Support Tool for IIIF Images

Overview I made improvements to “IIIF Annotator,” a polygon annotation support tool for IIIF images. Specifically, I worked on the following three points: Support for manifest files that do not use an Image Server Export function for IIIF manifest files with annotations Export function for TEI/XML files The following sections explain these improvements. Background The following article explained the reasons for creating a new annotation tool. The features added this time are also available in other tools, but were implemented for improved convenience. ...

Developing a DTS (Distributed Text Services) Viewer

Developing a DTS (Distributed Text Services) Viewer

Overview I developed a viewer for DTS (Distributed Text Services), so this is a memo of the process. You can try it at the following URL. https://dts-viewer.vercel.app/ja/ Background The official page for DTS (Distributed Text Services) is below. https://distributed-text-services.github.io/specifications/ I also covered this in the following article. This time, I developed a viewer that partially conforms to this DTS specification. Usage The following is the top page. Enter the DTS URL in the form. Examples are provided at the bottom of the page. Technically, it uses the Entry point. ...

Creating Polylines Using the Polygon Tool in Annotorious v2

Creating Polylines Using the Polygon Tool in Annotorious v2

Overview This is a memo on how to create polylines using the polygon tool in Annotorious v2. Background The Annotorious v2 website is available at the following link. https://annotorious.github.io/getting-started/ As shown below, polygons can be drawn. However, a tool for drawing polylines in a similar manner did not appear to be provided, including the following plugin. https://github.com/annotorious/annotorious-v2-selector-pack Customization When a polygon like the following is created: The following JSON file is generated: ...

Handling CORS for Express Deployed on Vercel Using vercel.json

Handling CORS for Express Deployed on Vercel Using vercel.json

Overview This is a memo on how to handle CORS for Express deployed on Vercel using vercel.json. Background I implemented CORS handling for the program introduced in the following article. The following was used as a reference. https://vercel.com/guides/how-to-enable-cors Method The solution is as follows. While there may be other approaches, adding headers resolved the issue. https://github.com/nakamura196/dts-typescript/commit/4c28f66b2af68950656dcb812f3e941d1b9b5feb { "version": 2, "builds": [ { "src": "src/index.ts", "use": "@vercel/node" } ], "rewrites": [ { "source": "/api/dts(.*)", "destination": "/src/index.ts" } ], "redirects": [ { "source": "/", "destination": "/api/dts", "permanent": true } ], "headers": [ { "source": "/api/(.*)", "headers": [ { "key": "Access-Control-Allow-Credentials", "value": "true" }, { "key": "Access-Control-Allow-Origin", "value": "*" }, { "key": "Access-Control-Allow-Methods", "value": "GET,OPTIONS,PATCH,DELETE,POST,PUT" }, { "key": "Access-Control-Allow-Headers", "value": "X-CSRF-Token, X-Requested-With, Accept, Accept-Version, Content-Length, Content-MD5, Content-Type, Date, X-Api-Version" } ] } ] } Summary I hope this serves as a helpful reference. ...

Prototyping a TEI/XML File Creation App Using Google Cloud Vision API and GakuNin RDM

Prototyping a TEI/XML File Creation App Using Google Cloud Vision API and GakuNin RDM

Overview I prototyped a TEI/XML file creation app using Google Cloud Vision API and GakuNin RDM, so this is a memo of that work. Background I needed an environment for creating TEI/XML files that reflect OCR results using Google Cloud Vision API. So I prototyped an environment that uses GakuNin RDM as the backend to manage files per user and execute OCR. How to Use Creating a Folder Access the following. ...

An Example of Representing IIIF Polygon Annotations in TEI/XML

An Example of Representing IIIF Polygon Annotations in TEI/XML

Overview This article introduces an example of representing IIIF polygon annotations in TEI/XML. Method In TEI/XML, you can represent polygon annotations using the zone tag and the points attribute. https://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-teidata.point.html Example For verification purposes, I added a TEI/XML export feature to the annotation tool introduced in the following article. Specifically, the following download option was added. An example of the TEI/XML obtained as a download result is shown below. Rectangles are described using ulx, uly, lrx, lry, while polygon information is described using points. ...

Scrolling to a Specific Element Using CETEIcean and XPath

Scrolling to a Specific Element Using CETEIcean and XPath

Overview This is a memo on how to scroll to a specific element using CETEIcean and XPath. Demo You can try it at the following URL. https://next-ceteicean-router.vercel.app/xpath/ After accessing the page and scrolling, it is displayed as follows. Obtaining the XPath The above example targets the XML file from the “Koui Genji Monogatari Text DB.” https://kouigenjimonogatari.github.io/tei/01.xml The following XPath is specified. /TEI/text[1]/body[1]/p[1]/seg[267] To obtain this XPath, you can right-click on the target element in Oxygen XML Editor and select “Copy XPath.” ...

Prototyping a TEI/XML File Editing Environment Using LEAF Writer and GakuNin RDM

Prototyping a TEI/XML File Editing Environment Using LEAF Writer and GakuNin RDM

Overview This is a memo about prototyping a TEI/XML file editing environment using LEAF Writer and GakuNin RDM. References The following article introduced how to use LEAF Writer from Next.js. In particular, the following npm package is used. https://www.npmjs.com/package/@cwrc/leafwriter For the input/output of TEI/XML files to be edited above, GakuNin RDM is used. The following article may also be helpful regarding how to use the GakuNin RDM API from JavaScript. ...