Trying Out TEIGarage

Overview TEIGarage is described as follows. https://github.com/TEIC/TEIGarage/ TEIGarage is a webservice and RESTful service to transform, convert and validate various formats, focussing on the TEI format. TEIGarage is based on the proven OxGarage. Trying It Out You can try it out on the following page. https://teigarage.tei-c.org/ We will use the “TEI Minimal” ODD file published at the following URL. This file is also used as one of the presets in Roma. ...

(Machine Translation) The TEI Archive

The following is a machine translation of “The TEI Archive” page. https://tei-c.org/Vault/ Text Encoding Initiative (TEI) The TEI Archive Table of Contents Poughkeepsie Principles Sponsoring Organizations 1. TEI Committee Documents 1987-1998 TEI Advisory Committee Analysis and Interpretation Committee Edited Papers Metalanguage and Syntax Issues Committee Steering Committee Technical Review Committee Text Documentation Committee Text Representation Committee 2. Previous Versions of the Guidelines 3. Unnumbered Reports, Articles, Presentations, etc. 4. Songs, Photos, and Other Ephemera TEI Tite Documents Workgroups That Have Completed Their Work Preliminary Drafts of Electronic Text Editing (MLA, 2006) All Available P5 Releases This page contains archival materials from the Text Encoding Initiative. Spanning the first ten years from the Poughkeepsie Conference of 1988 to the beginning of the process of establishing the TEI Consortium in 1999, these materials were collected from fragments across various servers and personal collections, though much of it derives from the excellent Listserv archive maintained by Wendy Plotkin in Chicago. ...

TEI/XML Visualization Example: Map Display Using Leaflet

Overview For visualizing TEI/XML files, I created a repository that publishes visualization examples and source code. https://github.com/nakamura196/tei_visualize_demo You can see the visualization examples on the following page. https://nakamura196.github.io/tei_visualize_demo/ This time, I added an example of marker display using MarkerCluster, which I’ll introduce here. Prerequisites This assumes that you can already display markers using Leaflet (without using MarkerCluster). If you haven’t done so yet, please refer to the following visualization example and source code. ...

Created a Simple TEI/XML File Viewer Using Next.js

Overview I created a simple viewer that displays the contents of TEI/XML files. https://github.com/utda/tei-viewer Here is a display example targeting TEI/XML of the Koui Genji Monogatari: https://utda.github.io/tei-viewer/?u=https://kouigenjimonogatari.github.io/tei/01.xml&v=true Usage As a minimum feature, when a IIIF manifest file is associated, the Mirador viewer is displayed. The association method is based on the following format: https://github.com/TEI-EAJ/jp_guidelines/wiki/IIIF画像とのリンク Additionally, when the n attribute is given to the pb tag, a page number display feature is provided. Furthermore, for Japanese language support, when v=true is given as a query parameter, vertical text is displayed. ...

Aligning the Collated Tale of Genji with Modern Japanese Translations in Digital Genji Monogatari

Overview “Digital Genji Monogatari” is a site that aims to propose an environment to support research on The Tale of Genji as well as education and research activities using classical texts, by collecting and creating various related data about The Tale of Genji and linking them together. https://genji.dl.itc.u-tokyo.ac.jp/ One of the features provided by this site is the “alignment of the Collated Tale of Genji with modern Japanese translations.” As shown below, the corresponding sections between the “Collated Tale of Genji” and Yosano Akiko’s translation published on Aozora Bunko are highlighted. ...

Usage Example of the Image Map Editor in Oxygen XML Editor

Overview This is an explanation of how to use the Image Map Editor in Oxygen XML Editor. Video https://youtu.be/9dZQ1v0Rky0?si=8EhAZdVsLqgPz2Rf Usage Prepare a TEI/XML file like the following. The url value of <graphic> can specify a relative path from the file, an absolute path on your PC, or a URL published on the internet. In the following example, the file digidepo_3437686_pn_null_9c48d89b-e2ec-4593-8d00-6fbc1d29d1bd.jpg stored in the same folder as the TEI/XML file is referenced. ...

TEI Publisher: Visualization Examples from the TEI Publisher Demo Collection (Part 1)

Overview The following page on TEI Publisher showcases various visualization examples. https://teipublisher.com/exist/apps/tei-publisher/index.html?query=&collection=test&sort=title&field=text&start=1 In this and subsequent articles, I will introduce the above visualization examples. Letter #6 from Robert Graves to William Graves (at Oundle School) November 15, 1957 Overview https://teipublisher.com/exist/apps/tei-publisher/test/graves6.xml As shown below, the text is displayed alongside a list of place names and person names, as well as a map. It is described as follows: A 20th century manuscript letter from Robert Graves where emphasis has been put on visualizing rich encoding of semantic information in the letter, in particular geographic and prosopographical data. The map is displayed with a pb-leaflet component. ...

Formatting and Syntax Highlighting XML in Nuxt3

Overview
As shown in the following image, I needed to display XML text data in Nuxt3, so these are my notes.

Installation
I used the following two libraries.

    npm i xml-formatter
    npm i highlight.js

Usage
I created the following file as a Nuxt3 component. It formats XML strings with xml-formatter and then applies syntax highlighting with highlight.js.

    <script setup lang="ts">
    import hljs from "highlight.js";
    import "highlight.js/styles/xcode.css";
    import formatter from "xml-formatter";

    interface PropType {
      xml: string;
    }

    const props = withDefaults(defineProps<PropType>(), {
      xml: "",
    });

    const formattedXML = ref<string>("");

    onMounted(() => {
      // If `highlightAuto` is not asynchronous, `formattedXML` can be updated directly.
      // Otherwise, handle the result with appropriate asynchronous processing.
      formattedXML.value = hljs.highlightAuto(formatXML(props.xml)).value;
    });

    const formatXML = (xmlstring: string) => {
      return formatter(xmlstring, {
        indentation: " ",
        filter: (node) => node.type !== "Comment",
      });
    };
    </script>

    <template>
      <pre class="pa-4" v-html="formattedXML"></pre>
    </template>

    <style>
    pre {
      /* These styles control how text wraps inside the pre element. */
      white-space: pre-wrap; /* CSS 3 */
      white-space: -moz-pre-wrap; /* Mozilla, supported from 1999 to 2002 */
      white-space: -pre-wrap; /* Opera 4-6 */
      white-space: -o-pre-wrap; /* Opera 7 */
      word-wrap: break-word; /* Internet Explorer 5.5+ */
    }
    </style>

Summary
I hope this is helpful for visualizing TEI/XML data. ...

Schemas Convertible from TEI ODD: RNG, XSD, DTD, and More

Overview In the following article, I tried creating an ODD. The above uses a tool called Roma, and you can see that the created ODD has the following output formats available. Specifically, the available formats are “RELAX NG Schema,” “RELAX NG Compact,” “W3C Schema,” “Document Type Definition,” and “ISO Schematron Constraints.” I asked GPT-4 about the differences between these formats and am sharing the results here. There may be some inaccuracies, but I hope this serves as a useful reference. ...

Using Roma to Limit Tags for Your Project and Generate Documentation

Overview I previously explained how to use Roma in the following article. This time, I will explain the workflow for creating TEI ODD (One Document Does-it-all) and documentation (HTML and PDF) targeting TEI/XML files at hand. Note that at the end of this article, I have included GPT-4’s response regarding the differences between ODD (One Document Does it all) and RNG (RelaxNG). Please refer to that as well. Obtaining a List of Tags Used First, obtain a list of tags used in your project. ...
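The excerpt does not show how the tag list is obtained, so the following is only a minimal sketch of one possible approach, not the method used in the article. It counts element names with Python's standard `xml.etree.ElementTree`; the `SAMPLE_TEI` string and the `count_tags` helper are hypothetical names introduced for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample; in practice, parse your own files with ET.parse(path).
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader><fileDesc><titleStmt><title>Sample</title></titleStmt></fileDesc></teiHeader>
  <text><body><p>Text with <persName>a name</persName> and <persName>another</persName>.</p></body></text>
</TEI>"""


def count_tags(root):
    """Count element names in a tree, dropping the '{namespace}' prefix."""
    counts = Counter()
    for element in root.iter():
        # ElementTree writes namespaced tags as '{uri}local'; keep only 'local'.
        local_name = element.tag.rsplit("}", 1)[-1]
        counts[local_name] += 1
    return counts


tag_counts = count_tags(ET.fromstring(SAMPLE_TEI))
for name in sorted(tag_counts):
    print(name, tag_counts[name])
```

To cover a whole project, the same function could be applied to each file and the resulting counters added together.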

Using Versioning Machine (VM5.0) with Visual Studio Code (VSCode)

Overview
Versioning Machine (VM5.0) is an application for visualizing textual variant information.
http://v-machine.org/
This article explains how to use Visual Studio Code (VSCode) to display your own TEI/XML files in this application. The target TEI/XML files contain variant information described using the <listWit> tag, as shown below:

    <TEI xmlns="http://www.tei-c.org/ns/1.0">
      <teiHeader>
        <fileDesc>
          <titleStmt> ... </titleStmt>
          <publicationStmt> ... </publicationStmt>
          <sourceDesc>
            <listWit>
              <witness xml:id="WA">
                <title xml:lang="ja">ヴァイマル版ゲーテ全集(略称WA)</title>
                <title xml:lang="de">Goethes Werke. herausgegeben im Auftrage der Großherzogin Sophie von Sachsen</title>
              </witness>
              <witness xml:id="UTL">
                <title xml:lang="ja">東京大学総合図書館所蔵のゲーテ自署付書簡</title>
                <title xml:lang="de">Der Brief von Goethe an Ludwig Wilhelm Cramer vom 29. Dezember 1822 im Besitz der Universitätsbibliothek Tokio</title>
              </witness>
            </listWit>
            <msDesc sameAs="#UTL">
            ...

As described later, this article uses text data from a letter with Goethe’s autograph held in the University of Tokyo General Library, which is publicly available at the following link: ...

I Created a Sample Repository Using CETEIcean and Nuxt 3

Overview I created a sample repository using CETEIcean and Nuxt 3. https://github.com/TEIC/CETEIcean I referenced the following issue. https://github.com/TEIC/CETEIcean/issues/27 The script introduced there did not work with CETEIcean v1.8.0, so I created a minimal repository that works with CETEIcean v1.8.0 and Nuxt 3. Demo Page https://nakamura196.github.io/ceteicean-nuxt3 Source Code https://github.com/nakamura196/ceteicean-nuxt3 Main File https://github.com/nakamura196/ceteicean-nuxt3/blob/main/app.vue Summary I hope this serves as a useful reference. I would also like to express my gratitude to those who developed CETEIcean. ...

Converting TEI XML to LaTeX Using TEI Critical Apparatus Toolbox

Overview TEI Critical Apparatus Toolbox is “a tool for people preparing a natively digital TEI critical edition.” http://teicat.huma-num.fr/index.php In addition to providing functionality for visualizing critical apparatus information, it offers several other useful features. Among these, I learned that it has a “TEI to LaTeX and PDF conversion” feature, so I decided to try it out. Print an edition Access the following URL. http://teicat.huma-num.fr/print.php Click the link with the text this dummy edition file to download the following sample data. ...

How to Extract respStmt name Values from TEI/XML Files (Explained by GPT-4)

How to Extract respStmt name Values from TEI/XML Files: Approaches Using BeautifulSoup and ElementTree in Python
This article introduces how to extract respStmt name values from TEI/XML files using Python’s BeautifulSoup and ElementTree.

Method 1: Using ElementTree
First, we extract the respStmt name value using Python’s standard library xml.etree.ElementTree.

    import xml.etree.ElementTree as ET

    # Load the XML file
    tree = ET.parse('your_file.xml')
    root = tree.getroot()

    # Define the namespace
    ns = {'tei': 'http://www.tei-c.org/ns/1.0'}

    # Extract the respStmt name value
    name = root.find('.//tei:respStmt/tei:name', ns)

    # Display the name text
    if name is not None:
        print(name.text)
    else:
        print("The name tag was not found.")

Method 2: Using BeautifulSoup
Next, we extract the respStmt name value using BeautifulSoup. First, make sure the beautifulsoup4 and lxml libraries are installed. If they are not installed, you can install them with the following command. ...
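The ElementTree example in Method 1 uses `find`, which returns only the first matching element. When a header contains several respStmt entries, `findall` may be more useful. The following is a minimal sketch under that assumption; the inline `SAMPLE` document and the editor names are hypothetical and stand in for a real file.

```python
import xml.etree.ElementTree as ET

# Hypothetical TEI header with two respStmt entries.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader><fileDesc><titleStmt>
    <title>Sample</title>
    <respStmt><resp>Transcription</resp><name>Editor A</name></respStmt>
    <respStmt><resp>Encoding</resp><name>Editor B</name></respStmt>
  </titleStmt></fileDesc></teiHeader>
</TEI>"""

ns = {"tei": "http://www.tei-c.org/ns/1.0"}
root = ET.fromstring(SAMPLE)

# findall returns every match in document order; guard against empty elements.
names = [
    (element.text or "").strip()
    for element in root.findall(".//tei:respStmt/tei:name", ns)
]
print(names)
```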

Created a Program to Calculate Edit Distance for TEI/XML Files Containing app Elements

Overview
I created a program to calculate edit distance for TEI/XML files containing app elements. You can use it from the following Google Colab notebook:
https://colab.research.google.com/github/nakamura196/ndl_ocr/blob/main/編集距離を算出するプログラム.ipynb
Upload an XML file and the program will calculate the similarity between witnesses.

Example
Let’s upload the following XML file:
https://tei-eaj.github.io/koui/data/nakamura.xml
The result is an Excel file like the following, which provides an overview of the similarity between witnesses.

    index  name1  name2  distance  ratio
    0  中村式五十音  中村式五十音又様  10  0.85
    1  中村式五十音  中村式五十音欠損本  7  0.8947368421052632
    2  中村式五十音又様  中村式五十音欠損本  8  0.868421052631579

The following library is used for calculating similarity: ...
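The library used in the notebook is not named in this excerpt, so the following is only a sketch of the underlying idea rather than the notebook's implementation. It computes the Levenshtein edit distance in plain Python for each pair of witnesses; the witness names and readings are hypothetical, and the `ratio` column from the table is not reproduced because its exact definition is not given here.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # delete char_a
                    current[j - 1] + 1,      # insert char_b
                    previous[j - 1] + cost,  # substitute, or keep if equal
                )
            )
        previous = current
    return previous[-1]


# Hypothetical witness texts; in practice these would be rebuilt from app/rdg elements.
witnesses = {
    "witness_A": "the quick brown fox",
    "witness_B": "the quick brown fax",
    "witness_C": "a quick brown fox",
}

for name1, name2 in combinations(witnesses, 2):
    print(name1, name2, levenshtein(witnesses[name1], witnesses[name2]))
```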

Collaborative Editing of TEI/XML Files Using Visual Studio Live Share (Not Limited to XML)

Overview Visual Studio Live Share is a VSCode extension that enables real-time collaborative development. https://visualstudio.microsoft.com/ja/services/live-share/ This time, we will try real-time collaborative editing of TEI/XML files using this extension. Demo Video A video of the collaborative editing was recorded. https://youtu.be/DzyuJAtzl90 The right side of the screen shows a user (nakamura196) using VSCode in a local environment, while the left side shows a user (Guest User) invited via Visual Studio Live Share editing using the online VSCode (vscode.dev). ...

Trying the jingtrang Library for RELAX NG Schema: Validation

Overview
I had an opportunity to create an XML file conforming to a specific schema, and needed to verify that the XML file matched the schema. To meet this requirement, I tried the jingtrang library for working with RELAX NG schemas, so here are my notes:
https://pypi.org/project/jingtrang/
I also prepared a Google Colab notebook:
https://colab.research.google.com/github/nakamura196/ndl_ocr/blob/main/jingtrangを試す.ipynb

Trying Validation

    # Install the library
    pip install jingtrang
    # Download the RNG file (tei_all is used)
    wget https://raw.githubusercontent.com/nakamura196/test2021/main/tei_all.rng
    # Prepare the XML file to validate (download the Koui Genji Monogatari text)
    wget https://kouigenjimonogatari.github.io/tei/01.xml

Passing Example
Running the following produced no output: ...

Converting Word to TEI/XML

Overview I had an opportunity to convert Word files to TEI/XML files. Upon investigation, in addition to official TEI tools such as TEIGarage Conversion, I found a conversion example in TEI Publisher: https://teipublisher.com/exist/apps/tei-publisher/test/test.docx.xml The above example appeared to convert Word style information into TEI tags, so I tried this approach. For this project, I used the python-docx library with the goal of using it independently of TEI Publisher. Word File I created a prototype Word file like the one below. All styles are provisional, but I created styles such as “tei:persName” and “tei:warichu” and changed their visual styling such as color. The mechanism works by applying styles to perform simple structuring. ...
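The idea of turning styled text runs into TEI tags can be sketched as follows. This is not the article's code: it assumes the runs have already been read from the Word file (for example with python-docx) into `(style_name, text)` pairs, and that reading step is omitted. The `STYLE_TO_TAG` mapping, the `runs_to_paragraph` helper, and the sample runs are hypothetical, and the TEI namespace is left out for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from Word character-style names to TEI element names.
STYLE_TO_TAG = {"tei:persName": "persName"}


def runs_to_paragraph(runs):
    """Build a <p> element from (style_name, text) pairs."""
    paragraph = ET.Element("p")
    last_child = None
    for style_name, text in runs:
        tag = STYLE_TO_TAG.get(style_name)
        if tag is not None:
            # A mapped style becomes its own child element.
            last_child = ET.SubElement(paragraph, tag)
            last_child.text = text
        elif last_child is None:
            # Plain text before any child element goes into p.text.
            paragraph.text = (paragraph.text or "") + text
        else:
            # Plain text after a child element goes into that child's tail.
            last_child.tail = (last_child.tail or "") + text
    return paragraph


runs = [
    ("Normal", "Written by "),
    ("tei:persName", "Murasaki Shikibu"),
    ("Normal", " in the Heian period."),
]
paragraph = runs_to_paragraph(runs)
print(ET.tostring(paragraph, encoding="unicode"))
```

Styles without a direct TEI counterpart, such as the provisional “tei:warichu” style, would need their own mapping rule, which this sketch does not attempt.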

Creating a Customized RNG File Using Roma: Restricting Available TEI Tags

Overview In this article, I will attempt to customize TEI ODD (One Document Does-it-all) using a web application called Roma. https://romabeta.tei-c.org/ For more about TEI ODD, please refer to the official site below. I must admit that I do not fully understand it myself due to limited study. https://wiki.tei-c.org/index.php/ODD However, one use case is that in TEI-based projects, you can restrict the tags used (specifically, those that receive assistance and validation). ...

An Example Workflow for Creating TEI/XML from Excel

Overview
I created an example workflow for generating TEI/XML from data prepared in Excel. The following TEI/XML file is output. It supports page breaks using the pb tag, line IDs using the lb tag, multiple representations using choice/orig/reg tags, annotations using the note tag, and linking with IIIF images.

    <?xml version="1.0" encoding="utf-8"?>
    <TEI xmlns="http://www.tei-c.org/ns/1.0">
      <teiHeader>
        <fileDesc>
          <titleStmt>
            <title/>
          </titleStmt>
          <publicationStmt>
            <ab/>
          </publicationStmt>
          <sourceDesc>
            <ab/>
          </sourceDesc>
        </fileDesc>
      </teiHeader>
      <text>
        <body>
          <pb corresp="#page_22"/>
          <ab>
            <lb xml:id="page_22-b-1"/>
            <seg>
              いつれの御時にか女御更衣あまたさふらひ
              <choice>
                <orig>
                  給ける
                  <note corresp="#page_22-b-1-20" type="校異">
                    給けるーたまふ河
                  </note>
                </orig>
                <reg>
                  たまふ
                </reg>
              </choice>
              なかにいとやむことなきゝは
            </seg>
          </ab>
        </body>
      </text>
      <facsimile source="https://dl.ndl.go.jp/api/iiif/3437686/manifest.json">
        <surface source="https://dl.ndl.go.jp/api/iiif/3437686/canvas/22" xml:id="page_22">
          <label>
            [22]
          </label>
          <zone lrx="1126" lry="1319" ulx="1044" uly="895" xml:id="page_22-b-1-20"/>
        </surface>
        <surface source="https://dl.ndl.go.jp/api/iiif/3437686/canvas/23" xml:id="page_23">
          <label>
            [23]
          </label>
        </surface>
      </facsimile>
    </TEI>

An example of visualizing the above TEI/XML data is shown below. The image, text (original), text (regularized), and annotations are displayed on the same screen. ...
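As a rough illustration of how spreadsheet rows could become the pb, lb, and seg elements in the output above, here is a minimal sketch. It is not the article's workflow: it assumes the Excel rows have already been loaded into a list of dictionaries, and the column names (`page`, `line`, `text`), the sample rows, and the `build_body` helper are hypothetical. The ID pattern `page_N-b-M` is copied from the output shown above; the meaning of `b` is not stated there. The choice, note, and facsimile parts are omitted.

```python
import xml.etree.ElementTree as ET

# ElementTree serializes this qualified name as the xml:id attribute.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical rows, standing in for data read from an Excel sheet.
rows = [
    {"page": 22, "line": 1, "text": "first line of page 22"},
    {"page": 22, "line": 2, "text": "second line of page 22"},
    {"page": 23, "line": 1, "text": "first line of page 23"},
]


def build_body(rows):
    """Build a <body> with one <pb>/<ab> pair per page and one <lb>/<seg> pair per row."""
    body = ET.Element("body")
    current_page = None
    block = None
    for row in rows:
        if row["page"] != current_page:
            # A new page starts: emit a page break and open a new block.
            current_page = row["page"]
            ET.SubElement(body, "pb", {"corresp": f"#page_{current_page}"})
            block = ET.SubElement(body, "ab")
        ET.SubElement(block, "lb", {XML_ID: f"page_{row['page']}-b-{row['line']}"})
        segment = ET.SubElement(block, "seg")
        segment.text = row["text"]
    return body


body = build_body(rows)
print(ET.tostring(body, encoding="unicode"))
```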