Overview

I had the opportunity to try rico-converter, so here are my notes.

https://github.com/ArchivesNationalesFR/rico-converter

It is described as follows.

A tool to convert EAC-CPF and EAD 2002 XML files to RDF datasets conforming to Records in Contexts Ontology (RiC-O)

Conversion

Instructions are available at the following link.

https://archivesnationalesfr.github.io/rico-converter/en/GettingStarted.html

First, download the latest zip file from the following link and extract it.

https://github.com/ArchivesNationalesFR/rico-converter/releases/latest

The extracted folder includes sample data in input-eac and input-ead, which we will convert to RDF.

input-eac

The following is a ChatGPT-generated explanation of input-eac/FRAN_NP_051151.xml in this folder.

This XML file is written in EAC-CPF (Encoded Archival Context for Corporate Bodies, Persons, and Families) format and systematically organizes information about organizations such as the French Ministry of Culture. The main elements are as follows:

  • Control section: Contains metadata about the record, including the record ID, language declarations, update history, and sources of materials used.
  • Identity section: Contains basic information about the corporate body (here, the French Ministry of Culture). Multiple names showing how the ministry’s name has changed over time, along with the periods during which each name was used, are described in detail.
  • Description section: Contains detailed explanations of the ministry’s period of existence, legal status, main functions, missions, and historical changes. For example, it describes how the ministry was established in 1959 and how it has been operated based on decrees, as well as how the organizational structure has changed.
  • Relations section: Shows relationships with other organizations and individuals. This includes detailed descriptions of relationships with related institutions in France, educational institutions, and successive Ministers of Culture, along with links to external resources (archives and online information).

This XML is intended to describe information about archives and organizations in a standardized format for integration with other databases and systems. It is particularly useful for tracking the history and relationships of organizations.
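
As a rough way to check these sections yourself, the sketch below lists the section elements of an EAC-CPF document using only the Python standard library. The inline sample is a hypothetical, heavily shortened stand-in for a real file, and the helper names (local_name, list_sections) are my own, not part of rico-converter.

```python
import xml.etree.ElementTree as ET

# Namespace used by EAC-CPF documents.
EAC_NS = "urn:isbn:1-931666-33-4"

# Hypothetical, heavily shortened stand-in for an EAC-CPF file.
SAMPLE = f"""<eac-cpf xmlns="{EAC_NS}">
  <control><recordId>EXAMPLE_001</recordId></control>
  <cpfDescription>
    <identity><entityType>corporateBody</entityType></identity>
    <description/>
    <relations/>
  </cpfDescription>
</eac-cpf>"""


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def list_sections(xml_text: str) -> list[str]:
    """Return 'control' plus the names of the children of cpfDescription."""
    root = ET.fromstring(xml_text)
    sections = []
    for child in root:
        if local_name(child.tag) == "cpfDescription":
            sections.extend(local_name(c.tag) for c in child)
        else:
            sections.append(local_name(child.tag))
    return sections


print(list_sections(SAMPLE))
# → ['control', 'identity', 'description', 'relations']
```

To try it on the sample data, the root element could be loaded with ET.parse("input-eac/FRAN_NP_051151.xml").getroot() instead of ET.fromstring.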

Execute the following.

sh ricoconverter.sh

Running with default settings creates a folder like output-eac-20241005 containing the output RDF files.

input-ead

The following is a ChatGPT-generated explanation of input-ead/FRAN_IR_003500.xml in this folder.

This XML file is written in EAD (Encoded Archival Description) format, which describes archival materials. This specific document describes an archival collection called “Fonds Vitet” (19th to 20th century) held at the French National Archives. The content includes an overview of the documents, creators, publication information, and a detailed structure of the archival materials.

Main elements:

  1. eadheader: Contains metadata about the materials (language, dates, repository information). For example, the material ID (FRAN_IR_003500) and an overview of the archive are documented.
  2. filedesc: Describes the title, creator, publisher, etc. of the materials. In this example, the title is “Fonds Vitet,” the creator is C. Sibille, and it was published by the French National Archives (Archives Nationales).
  3. archdesc: Describes the archive contents in detail. The unit IDs, titles, dates of creation, and provenance of each item in “Fonds Vitet” are structured and described.
  4. dsc: A detailed inventory of the archive. Individual documents, letters, photo albums, and other items are organized hierarchically with detailed contents listed.

This document is a detailed archival record, organized hierarchically so that researchers and archive users can easily reference the materials.
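
The hierarchy inside dsc can be sketched as a simple recursive walk. This is a minimal example under assumptions: the inline sample is hypothetical, it uses unnumbered c elements without a namespace, and the function name outline is my own. Real EAD files may use numbered components (c01, c02, ...) or a namespace, in which case the lookups would need adjusting.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily shortened stand-in for an EAD 2002 finding aid
# (no XML namespace and unnumbered <c> components are assumed).
SAMPLE = """<ead>
  <eadheader><eadid>EXAMPLE_IR</eadid></eadheader>
  <archdesc level="fonds">
    <did><unittitle>Example fonds</unittitle></did>
    <dsc>
      <c><did><unittitle>Series A</unittitle></did>
        <c><did><unittitle>Letter 1</unittitle></did></c>
      </c>
      <c><did><unittitle>Series B</unittitle></did></c>
    </dsc>
  </archdesc>
</ead>"""


def outline(node, depth=0):
    """Yield (depth, title) for each <c> component below node, in document order."""
    for comp in node.findall("c"):
        title = comp.findtext("did/unittitle", default="(untitled)")
        yield depth, title
        yield from outline(comp, depth + 1)


root = ET.fromstring(SAMPLE)
for depth, title in outline(root.find("archdesc/dsc")):
    # Indent each title by its nesting level.
    print("  " * depth + title)
```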

Similarly, execute the following.

sh ricoconverter.sh

This time, when the "Enter command to execute" prompt appears, enter convert_ead.

An example execution is shown below.

sh ricoconverter.sh
:: Welcome to Ric-O Converter 2.0.2 ::
Enter command to execute (convert_eac, convert_eac_raw, convert_ead, test_eac, test_ead, version, help) [press Enter for 'convert_eac'] :convert_ead
Enter parameter file location [press Enter for 'parameters/convert_ead.properties']:
java -Xmx1200M -Xms1200M -jar ricoconverter-cli-2.0.2-onejar.jar convert_ead @parameters/convert_ead.properties
RiC-O Converter 100% [===============================================] 17/17 (0:00:01 / 0:00:00)
07:29:49.862 INFO  f.g.c.a.r.e.c.Ead2RicoConverterReportListener -

--- EAD Conversion Report ---
- Number of files to process: 17
- Number of files in ERROR  : 0
- Number of files in success: 17

List of files in errors:
  None !

Process took 0:00:01 (started at 2024-10-05 07:29:48, ended at 2024-10-05 07:29:49)

As a result, a folder like output-ead-20241005 is created containing the output RDF files.
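
The log above shows the java command that the script runs after the prompts. As a sketch, the same command could be rebuilt and launched from Python to skip the interactive prompts. The function names are my own, and I am assuming that the parameter file for each command follows the parameters/<command>.properties pattern seen for convert_ead.

```python
import subprocess


def build_command(command: str, version: str = "2.0.2") -> list[str]:
    """Rebuild the java invocation echoed by ricoconverter.sh for a given command."""
    return [
        "java", "-Xmx1200M", "-Xms1200M",
        "-jar", f"ricoconverter-cli-{version}-onejar.jar",
        # Assumption: parameter files are named after the command.
        command, f"@parameters/{command}.properties",
    ]


def run_converter(command: str, workdir: str) -> None:
    """Run the converter inside the extracted folder; raises if java exits non-zero."""
    subprocess.run(build_command(command), cwd=workdir, check=True)


print(" ".join(build_command("convert_ead")))
```

Calling run_converter("convert_ead", "./ricoconverter-2.0.2") would then start the conversion, provided java is on the PATH.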

Registering in a SPARQL Endpoint

I batch-registered the output RDF files in Dydra, following the instructions in the article below.

An example registration script is as follows.

from glob import glob
from dydra_py.api import DydraClient
from tqdm import tqdm

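# Collect every .rdf file under the EAC output folder, including subfolders.
files = glob("./ricoconverter-2.0.2/output-eac-20241005/**/*.rdf", recursive=True)

# Read the Dydra endpoint and API key from a local .env file.
endpoint, api_key = DydraClient.load_env("./.env")
client = DydraClient(endpoint, api_key)

# Left disabled: clears the repository before importing.
# client.clear()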

for file in tqdm(files):
    client.import_by_file(file, "xml")
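
The script above only targets the output-eac folder. Since the record resources queried later come from the EAD conversion, the output-ead folder presumably needs to be registered as well. Here is a small sketch for gathering files from both folders with the standard library only. The function name collect_rdf_files is my own, and I am assuming the EAD output also uses the .rdf extension.

```python
from pathlib import Path


def collect_rdf_files(base: str, folders: list[str]) -> list[str]:
    """Return the sorted paths of all .rdf files found under the given folders."""
    found = []
    for folder in folders:
        # rglob searches the folder and all of its subfolders.
        found.extend(str(p) for p in Path(base, folder).rglob("*.rdf"))
    return sorted(found)


files = collect_rdf_files(
    "./ricoconverter-2.0.2",
    ["output-eac-20241005", "output-ead-20241005"],
)
print(len(files), "files to register")
```

The resulting list can be passed to the same import loop as before.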

Additionally, since RiC-O uses the following namespaces, I registered them in Snorql.

Let’s try various queries below.

https://nakamura196.github.io/snorql_examples/rico/?describe=http%3A%2F%2Fdata.archives-nationales.culture.gouv.fr%2FrecordResource%2Ftop-003500

This is a description of the “Fonds Vitet” archival collection stored at the French National Archives. This resource includes the following information:

  • rdf:type: This resource corresponds to “RecordResource” and “RecordSet.”
  • rdfs:label: The title of this collection is “Fonds Vitet.”
  • beginningDate: The collection’s start date is January 1, 1801.
  • date: The collection spans from the 19th to the 20th century.
  • endDate: The collection’s end date is December 31, 2000.
  • hasInstantiation: This collection has a concrete instantiation (actual materials) with the URI http://data.archives-nationales.culture.gouv.fr/instantiation/top-003500-i1.
  • hasOrHadHolder: The holder of this collection is the French National Archives institution (agent/005061).
  • hasProvenance: The provenance of this collection is related to two agents (050218 and 052986).
  • hasRecordSetType: The type of this collection is “Fonds.”
  • includesOrIncluded: This collection contains multiple record resources (003500-d_1, 003500-d_2, 003500-d_3, 003500-d_4).
  • isOrWasDescribedBy: This collection is described by a record (record/003500).
  • title: The official title of the collection is “Fonds Vitet.”

This data describes the historical record group “Fonds Vitet” stored at the French National Archives, containing materials from the 19th to the 20th century.
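
The Snorql links in this post all follow the same pattern: the resource URI is percent-encoded and passed as the describe parameter. A minimal sketch for building such links follows. The names SNORQL_BASE and describe_url are my own.

```python
from urllib.parse import quote

# Snorql page used in this post.
SNORQL_BASE = "https://nakamura196.github.io/snorql_examples/rico/"


def describe_url(resource_uri: str) -> str:
    """Build a Snorql link whose describe parameter is the percent-encoded URI."""
    # safe="" makes quote() encode ':' and '/' as well.
    return f"{SNORQL_BASE}?describe={quote(resource_uri, safe='')}"


uri = "http://data.archives-nationales.culture.gouv.fr/recordResource/top-003500"
print(describe_url(uri))
```

For the URI above, this reproduces the link to the “Fonds Vitet” record resource.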

Multiple resources are linked via rico:includesOrIncluded.

https://nakamura196.github.io/snorql_examples/rico/?describe=http%3A%2F%2Fdata.archives-nationales.culture.gouv.fr%2FrecordResource%2F003500-d_1

This data is a description of “LUDOVIC VITET (1802-1873)” stored at the French National Archives. This resource represents a record resource related to Ludovic Vitet’s materials and includes the following information:

This information provides details about the collection of materials related to Ludovic Vitet, a 19th-century French historian and politician.

This is linked to an agent via rico:hasProvenance.

https://nakamura196.github.io/snorql_examples/rico/?describe=http%3A%2F%2Fdata.archives-nationales.culture.gouv.fr%2Fagent%2F051234

This data is a record of the agent (person) “Ludovic Vitet (1802-1873)” stored at the French National Archives. Ludovic Vitet was known as a French historian, archaeologist, and politician, and this record includes the following information:

  • rdf:type: This agent is classified as “Person.”
  • rdfs:label: The label is “Vitet, Ludovic (1802-1873).”
  • owl:sameAs: Links to other data sources about Ludovic Vitet are provided (e.g., DBpedia, ISNI).
  • agentIsConnectedToAgentRelation: Links showing relationships with other agents.
  • birthDate: Born on October 18, 1802.
  • deathDate: Died on June 5, 1873.
  • descriptiveNote: Details about Ludovic Vitet’s family, historical background, and achievements are described. For example, he was interested in archaeology and history, was active in journalism and literature, and was particularly involved in the preservation of French monuments, contributing to the establishment of the Historic Monuments Committee.

This record provides rich information about Ludovic Vitet’s life and achievements, detailing his lineage and his efforts in French cultural heritage preservation.

This is linked to the following Record via rico:isOrWasDescribedBy.

https://nakamura196.github.io/snorql_examples/rico/?describe=http%3A%2F%2Fdata.archives-nationales.culture.gouv.fr%2Frecord%2F051234

This data represents a record about “Ludovic Vitet (1802-1873)” at the French National Archives. This record includes the following information:

  • rdf:type: Classified as “Record.”
  • rdfs:seeAlso: Includes links to other reference materials about Ludovic Vitet (e.g., Wikipedia, ISNI).
  • creationDate: The record was created on December 3, 2015.
  • describesOrDescribed: This record describes the agent (person) related to Ludovic Vitet.
  • hasCreator: This record was created by the “French National Archives (agent/005061).”
  • hasDocumentaryFormType: The record format is “Authority Record.”
  • hasInstantiation: Includes a link to the instance associated with this record.
  • hasOrHadLanguage: The record language is French.
  • isOrWasRegulatedBy: Includes a link to rules associated with this record.
  • lastModificationDate: The record was last modified on October 25, 2019.
  • source: Includes links to various information sources about Ludovic Vitet (e.g., BnF, Académie française, Wikipedia).

This record provides detailed information about Ludovic Vitet’s life and achievements, indicating reliable information sources about him.

This is linked to the following agent via rico:hasCreator.

https://nakamura196.github.io/snorql_examples/rico/?describe=http%3A%2F%2Fdata.archives-nationales.culture.gouv.fr%2Fagent%2F005061

This data provides an overview of the French National Archives (Archives nationales). The following details are included:

  • rdf:type: Classified as “rico:Agent” and “rico:CorporateBody.”
  • rdfs:label: Named as “Archives nationales (France ; 1790-….).”
  • owl:sameAs: Links to other databases related to the French National Archives (e.g., DBpedia, BnF, ISNI).
  • rico:agentIsConnectedToAgentRelation: Contains URIs indicating relationships with other agents and activity periods.
  • rico:beginningDate: The founding date is indicated as “1790-01-01.”
  • rico:descriptiveNote: An explanation of the archives’ internal organization and history. It notes that the archives consist of three main sites:
    • Paris site: Archives from the Ancien Régime, Parisian notarial records
    • Pierrefitte-sur-Seine site: Post-revolutionary public and private archives
    • Fontainebleau site: Specific public archives (e.g., naturalization applications, Legion of Honor records)
    It also explains that the archives are run through four main departments (Public Department, Collections Department, Scientific Support Department, Administrative Department).
  • rico:groupIsTargetOfGroupSubdivisionRelation: Indicates group subdivision relationships within the archives.
  • rico:history: Provides a detailed historical background of the archives, established in 1790, describing the principles of centralized storage of historical public archives from across France and public accessibility. It also details the archives’ evolution and expansion of functions through various eras.

This record provides detailed organizational structure and historical background of the French National Archives, also explaining the evolution of functions and roles across different periods.

This is linked to the following agent via rico:isOrWasSubdivisionOf.

https://nakamura196.github.io/snorql_examples/rico/?describe=http%3A%2F%2Fdata.archives-nationales.culture.gouv.fr%2Fagent%2F000005

This data contains information about the French “Ministère de la Culture et de la Communication” (Ministry of Culture and Communication). Here is an overview:

  • rdf:type: Classified as “rico:Agent” and “rico:CorporateBody.”
  • rdfs:label: Named as “France. Ministère de la Culture et de la Communication (1959-….).”
  • owl:sameAs: This agent is linked to other databases including DBpedia, BnF (Bibliothèque nationale de France), and ISNI.
  • rico:agentIsConnectedToAgentRelation: Indicates relationships with other institutions and periods.
  • rico:beginningDate: The founding date of the Ministry of Culture and Communication is recorded as “1959-01-08.”
  • rico:descriptiveNote: Contains detailed explanations of the ministry’s organizational structure and history. Established in 1959, the Ministry of Culture and Communication consisted of various departments including architecture, French national archives, and arts and literature directorates, with structural changes and independence progressing over time. For example, the Arts and Literature Directorate was abolished in 1969, the music, dance, and opera department became independent in 1970, and the 2009 public policy reform reorganized the ministry into three main departments (Cultural Heritage, Artistic Creation, Media and Cultural Industries).
  • rico:history: Explains the historical development of the ministry. It was established in 1959 by Charles de Gaulle, with André Malraux appointed as the first Minister of Culture. Subsequently, multiple ministers shaped cultural policy directions, and from the 2000s, new areas focusing on digital technology and the internet were also addressed.

This record provides a detailed account of the Ministry of Culture and Communication’s development from its establishment and its connections with related institutions.

As shown above, we can confirm that various resources are linked through RDF.
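
The chain of links followed in this section can be written down as a tiny in-memory lookup. This is only a sketch: the pairs below are transcribed from the resources described above, each keeps a single target per property (the real data has several targets for some properties, such as rico:includesOrIncluded), and the names LINKS and follow are my own.

```python
BASE = "http://data.archives-nationales.culture.gouv.fr/"

# (subject, predicate) -> object, one target per property for simplicity.
LINKS = {
    ("recordResource/top-003500", "rico:includesOrIncluded"): "recordResource/003500-d_1",
    ("recordResource/003500-d_1", "rico:hasProvenance"): "agent/051234",
    ("agent/051234", "rico:isOrWasDescribedBy"): "record/051234",
    ("record/051234", "rico:hasCreator"): "agent/005061",
    ("agent/005061", "rico:isOrWasSubdivisionOf"): "agent/000005",
}


def follow(start: str, predicates: list[str]) -> list[str]:
    """Follow one predicate after another from start and return every node visited."""
    path = [start]
    for predicate in predicates:
        path.append(LINKS[(path[-1], predicate)])
    return path


path = follow(
    "recordResource/top-003500",
    [
        "rico:includesOrIncluded",
        "rico:hasProvenance",
        "rico:isOrWasDescribedBy",
        "rico:hasCreator",
        "rico:isOrWasSubdivisionOf",
    ],
)
for node in path:
    print(BASE + node)
```

Starting from the “Fonds Vitet” record resource, five hops lead to the agent for the Ministry of Culture and Communication.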

Summary

This experience made me want to study Records in Contexts Ontology (RiC-O) more thoroughly.

I hope this serves as a useful reference.