In Archivematica, metadata schemas other than Dublin Core (DC) can be embedded in the AIP’s METS.xml. This guide uses source-metadata.csv to include non-DC metadata such as EAD and MODS in a Transfer and verifies via the API whether they are correctly stored in the AIP.
Table of Contents
- Background and Purpose
- How source-metadata.csv Works
- XML Validation Feature
- Test 1: MODS-Only Metadata Registration
- Test 2: Simultaneous EAD + MODS Registration
- Storage Format of Non-DC Metadata in METS.xml
- Test 3: Metadata Update via Reingest
- Summary
Background and Purpose
In a standard Archivematica Transfer, Dublin Core metadata described in metadata/metadata.csv is stored in METS.xml as <dmdSec>. However, in actual digital archive operations, there are use cases requiring metadata schemas other than DC:
- EAD (Encoded Archival Description): A widely used standard for hierarchical archival description
- MODS (Metadata Object Description Schema): A schema used for detailed description of library materials
- LIDO: A description standard for museum and art gallery materials
- MARC21: A catalog data format for libraries
Archivematica can associate arbitrary XML metadata with a Transfer through a CSV file called source-metadata.csv and store each XML document in the AIP’s METS.xml as a <dmdSec>. This guide verifies this feature via the API.
How source-metadata.csv Works
CSV Format
source-metadata.csv is a CSV file placed in the Transfer’s metadata/ directory, consisting of three columns.
filename,metadata,type
objects,ead.xml,EAD
objects,mods.xml,MODS
objects/dir/file.pdf,file_metadata.xml,CustomType
| Column | Description |
|---|---|
| filename | Relative path to the file or directory targeted by the metadata (starting with objects/) |
| metadata | Path to the XML metadata file (relative to the metadata/ directory) |
| type | Metadata type identifier. Used in the OTHERMDTYPE attribute of METS.xml |
Transfer Directory Structure
my-transfer/
├── objects/
│   └── test-document.txt      <- Digital object to be preserved
└── metadata/
    ├── source-metadata.csv    <- Mapping definition between metadata and objects
    ├── ead.xml                <- EAD metadata
    └── mods.xml               <- MODS metadata
Associating Multiple Metadata with a Single File
In source-metadata.csv, metadata of different types can be associated with the same filename by using one row per metadata file.
filename,metadata,type
objects,ead.xml,EAD
objects,mods.xml,MODS
In this example, both EAD and MODS are associated with the objects directory (all files underneath), and each is stored as an independent <dmdSec> in METS.xml.
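As a minimal sketch of preparing such a file, the following Python helper writes source-metadata.csv into a Transfer's metadata/ directory. The function name `write_source_metadata` and its two sanity checks are illustrative assumptions and are not part of Archivematica.

```python
import csv
from pathlib import Path


def write_source_metadata(transfer_dir, rows):
    """Write metadata/source-metadata.csv for a Transfer (hypothetical helper).

    rows is a list of (filename, metadata, type) tuples.
    """
    metadata_dir = Path(transfer_dir) / "metadata"
    metadata_dir.mkdir(parents=True, exist_ok=True)
    for filename, metadata, _type in rows:
        # Targets are relative to the Transfer root and start with objects
        if filename != "objects" and not filename.startswith("objects/"):
            raise ValueError(f"filename must start with objects/: {filename}")
        # The XML file is resolved relative to the metadata/ directory
        if not (metadata_dir / metadata).is_file():
            raise FileNotFoundError(f"missing metadata file: {metadata}")
    csv_path = metadata_dir / "source-metadata.csv"
    with csv_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["filename", "metadata", "type"])
        writer.writerows(rows)
    return csv_path
```

Calling it with `[("objects", "ead.xml", "EAD"), ("objects", "mods.xml", "MODS")]` produces the two-row mapping shown above, provided ead.xml and mods.xml already exist in metadata/.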
Processing Flow
- source-metadata.csv is read during Transfer
- The XML files specified in the metadata column are parsed
- If XML Validation is enabled, validation against the schema is performed
- XML that passes validation is embedded in METS.xml as <dmdSec>
XML Validation Feature
Overview
Archivematica has a feature that validates XML metadata specified in source-metadata.csv against schemas. This feature is disabled by default and is positioned as an experimental feature.
To enable it, set the MCP Client environment variables.
ARCHIVEMATICA_MCPCLIENT_MCPCLIENT_METADATA_XML_VALIDATION_ENABLED=true
METADATA_XML_VALIDATION_SETTINGS_FILE=/path/to/xml_validation.py
Validation Configuration File
The validation configuration is written as a Python file. Below is a configuration example used in Archivematica’s test environment.
from pathlib import Path

__DIR = Path(__file__).parents[0] / "schemas"

XML_VALIDATION = {
    "http://www.openarchives.org/OAI/2.0/oai_dc/": (__DIR / "oai_dc.xsd").as_posix(),
    "http://www.lido-schema.org": (__DIR / "lido-v1.1.xsd").as_posix(),
    "http://www.loc.gov/MARC21/slim": (__DIR / "MARC21slim.xsd").as_posix(),
    "http://www.loc.gov/mods/v3": (__DIR / "mods.xsd").as_posix(),
    "http://slubarchiv.slub-dresden.de/rights1": (__DIR / "rights1.xsd").as_posix(),
    "alto": (__DIR / "alto-v2.0.xsd").as_posix(),
    "metadata": None,
    "bag-info": None,
}

XML_VALIDATION_FAIL_ON_ERROR = False
The keys of the XML_VALIDATION dictionary are matched against XML documents in the following order:
- Value of the xsi:noNamespaceSchemaLocation attribute
- Last value of the xsi:schemaLocation attribute
- Namespace URI of the root element
- Local name of the root element
If the value for a key is None, validation is skipped but storage in <dmdSec> still occurs. If the key does not exist in the dictionary, that metadata is silently skipped and is not stored in <dmdSec> either.
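The lookup order and the three possible outcomes can be approximated as follows. This is a sketch based only on the rules described above, not Archivematica's actual implementation; the names `candidate_keys` and `resolve` and the outcome strings are hypothetical.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"


def candidate_keys(xml_text):
    """Return lookup keys for an XML document in the order described above."""
    root = ET.fromstring(xml_text)
    keys = []
    no_ns = root.get(f"{{{XSI}}}noNamespaceSchemaLocation")
    if no_ns and no_ns.strip():
        keys.append(no_ns.strip())
    schema_location = (root.get(f"{{{XSI}}}schemaLocation") or "").split()
    if schema_location:
        keys.append(schema_location[-1])  # last value of xsi:schemaLocation
    if root.tag.startswith("{"):
        # ElementTree encodes a namespaced tag as "{namespace}localname"
        namespace, local_name = root.tag[1:].split("}", 1)
        keys.append(namespace)
    else:
        local_name = root.tag
    keys.append(local_name)
    return keys


def resolve(xml_text, xml_validation):
    """Classify a document against an XML_VALIDATION-style dictionary."""
    for key in candidate_keys(xml_text):
        if key in xml_validation:
            # A schema path means validate; None means store without validation
            return "validate" if xml_validation[key] else "store-unvalidated"
    return "skipped"  # no key registered: not stored in dmdSec


ead = '<ead xmlns="urn:isbn:1-931666-22-9"><eadheader/></ead>'
print(candidate_keys(ead))  # ['urn:isbn:1-931666-22-9', 'ead']
print(resolve(ead, {"urn:isbn:1-931666-22-9": None}))  # store-unvalidated
```

An EAD document without any xsi attributes yields only the namespace URI and the root local name, so it falls into the "skipped" case unless one of those two keys is registered.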
Supported Schemas in the Test Environment
| Namespace / Key | Metadata Type | Validation |
|---|---|---|
| http://www.openarchives.org/OAI/2.0/oai_dc/ | Dublin Core (OAI-PMH) | XSD |
| http://www.lido-schema.org | LIDO | XSD |
| http://www.loc.gov/MARC21/slim | MARC21 | XSD |
| http://www.loc.gov/mods/v3 | MODS | XSD |
| http://slubarchiv.slub-dresden.de/rights1 | SLUB Rights | XSD |
| alto | ALTO (OCR) | XSD |
| metadata | General metadata | Skipped |
| bag-info | BagIt information | Skipped |
Note: EAD (urn:isbn:1-931666-22-9) is not included in the test environment’s default configuration. If using EAD, you must add an entry to the configuration file.
Test 1: MODS-Only Metadata Registration

Administration > Processing configuration screen – Manages processing profiles (default / automated / backlog) used when submitting Transfers
Test Environment
- Archivematica 1.19 (Docker environment)
- Dashboard: http://127.0.0.1:62080
- Storage Service: http://127.0.0.1:62081
- XML Validation: Enabled (test environment default settings)
Creating the Transfer Package
Create a Transfer package with the following structure.
metadata-test/
├── objects/
│   └── test-document.txt
└── metadata/
    ├── source-metadata.csv
    ├── ead.xml
    └── mods.xml
source-metadata.csv:
filename,metadata,type
objects,ead.xml,EAD
objects,mods.xml,MODS
mods.xml:
<?xml version="1.0" encoding="UTF-8"?>
<mods xmlns="http://www.loc.gov/mods/v3" version="3.6">
  <titleInfo>
    <title>Test Document for Metadata Validation</title>
  </titleInfo>
  <name type="corporate">
    <namePart>Nakamura Test Organization</namePart>
    <role>
      <roleTerm type="text">creator</roleTerm>
    </role>
  </name>
  <typeOfResource>text</typeOfResource>
  <language>
    <languageTerm type="code" authority="iso639-2b">jpn</languageTerm>
  </language>
  <abstract>Test document for verifying MODS metadata registration.</abstract>
</mods>
Running the Transfer via API
curl -X POST http://127.0.0.1:62080/api/v2beta/package/ \
-H "Authorization: ApiKey test:test" \
-H "Content-Type: application/json" \
-d '{
"name": "metadata-test",
"type": "standard",
"path": "",
"processing_config": "automated",
"auto_approve": true
}'
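The same request body can be assembled in Python. The sketch below assumes that the path field carries a base64-encoded string of the form `<transfer source location UUID>:<relative path>`, which is how the Archivematica API documentation describes it; the value is elided in the request above, so the all-zero UUID and the helper name `build_package_payload` here are placeholders, not values from this test.

```python
import base64
import json


def build_package_payload(name, location_uuid, relative_path,
                          processing_config="automated"):
    """Build the JSON body for POST /api/v2beta/package/ (hypothetical helper)."""
    # Assumption: path is base64 of "<location UUID>:<relative path>"
    raw_path = f"{location_uuid}:{relative_path}".encode("utf-8")
    return {
        "name": name,
        "type": "standard",
        "path": base64.b64encode(raw_path).decode("ascii"),
        "processing_config": processing_config,
        "auto_approve": True,
    }


payload = build_package_payload(
    "metadata-test",
    "00000000-0000-0000-0000-000000000000",  # placeholder location UUID
    "metadata-test",
)
print(json.dumps(payload, indent=2))
```

The resulting dictionary can be sent with any HTTP client using the same Authorization and Content-Type headers as the curl command.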
Results

Transfer tab – Transfer processing for metadata-test and metadata-test2 is complete, with each Microservice status displayed
Transfer -> Ingest completed and the AIP was created successfully. Checking the <dmdSec> in METS.xml, only MODS was stored as a dmdSec.
<mets:dmdSec ID="dmdSec_2" CREATED="2026-02-17T01:51:41" STATUS="original">
  <mets:mdWrap MDTYPE="OTHER" OTHERMDTYPE="MODS">
    <mets:xmlData>
      <mods xmlns="http://www.loc.gov/mods/v3" version="3.6">
      </mods>
    </mets:xmlData>
  </mets:mdWrap>
</mets:dmdSec>
Why EAD was not stored: The following error was recorded in the MCP Client log.
XML validation schema not found for keys: ['urn:isbn:1-931666-22-9', 'ead']
Because the EAD namespace (urn:isbn:1-931666-22-9) was not registered in the XML Validation configuration file, it was skipped during validation and not stored in <dmdSec>.
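A quick way to check which metadata actually reached the AIP is to list the dmdSec entries of a downloaded METS.xml. The following is a minimal sketch using only the Python standard library; the function name `list_dmdsecs` is an assumption, and it expects the METS document as a string.

```python
import xml.etree.ElementTree as ET

METS_NS = {"mets": "http://www.loc.gov/METS/"}


def list_dmdsecs(mets_xml):
    """Return ID, STATUS, MDTYPE and OTHERMDTYPE for each top-level dmdSec."""
    root = ET.fromstring(mets_xml)
    result = []
    for dmdsec in root.findall("mets:dmdSec", METS_NS):
        md_wrap = dmdsec.find("mets:mdWrap", METS_NS)
        result.append({
            "id": dmdsec.get("ID"),
            "status": dmdsec.get("STATUS"),
            "mdtype": md_wrap.get("MDTYPE") if md_wrap is not None else None,
            "othermdtype": md_wrap.get("OTHERMDTYPE") if md_wrap is not None else None,
        })
    return result
```

If an expected type such as EAD is missing from the returned list, the MCP Client log is the place to look for the corresponding "schema not found" message.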
Test 2: Simultaneous EAD + MODS Registration
Modifying the XML Validation Configuration
To handle EAD, the EAD namespace was added to the configuration file.
XML_VALIDATION = {
    # ... existing settings ...
    "urn:isbn:1-931666-22-9": None,  # EAD: Store in dmdSec without validation
}
By specifying None, XSD schema validation is skipped, and only storage in <dmdSec> of METS.xml is performed.
Running the Transfer via API
A new Transfer package (metadata-test2) was submitted via API using the same structure as before.
Results: METS.xml dmdSec

Archival Storage list – AIPs for metadata-test (Test 1) and metadata-test2 (Test 2) are stored successfully
This time, 3 dmdSec entries were generated.
dmdSec_1: PREMIS:OBJECT (standard)
<mets:dmdSec ID="dmdSec_1" CREATED="2026-02-17T01:57:54" STATUS="original">
  <mets:mdWrap MDTYPE="PREMIS:OBJECT">
    <mets:xmlData>
      <premis:object xsi:type="premis:intellectualEntity">
        <premis:objectIdentifier>
          <premis:objectIdentifierValue>7e26ac5e-ef3b-4f17-8717-f5239bbe355f</premis:objectIdentifierValue>
        </premis:objectIdentifier>
      </premis:object>
    </mets:xmlData>
  </mets:mdWrap>
</mets:dmdSec>
dmdSec_2: EAD
<mets:dmdSec ID="dmdSec_2" CREATED="2026-02-17T01:57:54" STATUS="original">
  <mets:mdWrap MDTYPE="OTHER" OTHERMDTYPE="EAD">
    <mets:xmlData>
      <ead xmlns="urn:isbn:1-931666-22-9">
        <eadheader>
          <eadid>metadata-test-002</eadid>
          <filedesc>
            <titlestmt>
              <titleproper>Test Collection for EAD Metadata Validation</titleproper>
            </titlestmt>
          </filedesc>
        </eadheader>
        <archdesc level="collection">
          <did>
            <unittitle>Test Collection for EAD Metadata Validation</unittitle>
            <unitdate type="inclusive" normal="2024/2025">2024-2025</unitdate>
            <unitid>META-TEST-002</unitid>
          </did>
        </archdesc>
      </ead>
    </mets:xmlData>
  </mets:mdWrap>
</mets:dmdSec>
dmdSec_3: MODS
<mets:dmdSec ID="dmdSec_3" CREATED="2026-02-17T01:57:54" STATUS="original">
  <mets:mdWrap MDTYPE="OTHER" OTHERMDTYPE="MODS">
    <mets:xmlData>
      <mods xmlns="http://www.loc.gov/mods/v3" version="3.6">
        <titleInfo>
          <title>Test Document for MODS Metadata Validation</title>
        </titleInfo>
      </mods>
    </mets:xmlData>
  </mets:mdWrap>
</mets:dmdSec>
Association in structMap
In the structMap, both EAD and MODS are associated with the objects directory as DMDID="dmdSec_2 dmdSec_3".
<mets:structMap TYPE="physical" ID="structMap_1" LABEL="Archivematica default">
  <mets:div TYPE="Directory" LABEL="metadata-test2-..." DMDID="dmdSec_1">
    <mets:div TYPE="Directory" LABEL="objects" DMDID="dmdSec_2 dmdSec_3">
      <mets:div TYPE="Item" LABEL="test-document.txt">
        <mets:fptr FILEID="file-..."/>
      </mets:div>
    </mets:div>
  </mets:div>
</mets:structMap>
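The DMDID attribute is a space-separated list of dmdSec IDs, so the association can be checked programmatically. A small sketch, assuming the METS document is available as a string (the helper name `dmdids_by_label` is hypothetical, and divs sharing the same LABEL would overwrite each other in this simplified mapping):

```python
import xml.etree.ElementTree as ET

METS_DIV = "{http://www.loc.gov/METS/}div"


def dmdids_by_label(mets_xml):
    """Map each structMap div LABEL to the list of dmdSec IDs it references."""
    root = ET.fromstring(mets_xml)
    mapping = {}
    for div in root.iter(METS_DIV):
        dmdid = div.get("DMDID")
        if dmdid:
            # DMDID holds one or more IDs separated by whitespace
            mapping[div.get("LABEL")] = dmdid.split()
    return mapping
```

For the structMap above, the entry for the objects directory would contain both dmdSec_2 and dmdSec_3.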
Storage Format of Non-DC Metadata in METS.xml
MDTYPE Attribute Handling
The value specified in the type column of source-metadata.csv is stored in METS.xml as follows.
<mets:mdWrap MDTYPE="OTHER" OTHERMDTYPE="EAD">
Examining Archivematica’s source code (archivematicaCreateMETSMetadataXML.py), metadata from source-metadata.csv is always stored as MDTYPE="OTHER", and the type column value is set in the OTHERMDTYPE attribute.
fsentry.add_dmdsec(
    tree.getroot(),
    "OTHER",
    othermdtype=xml_type,
    status="update" if "REIN" in sip_type else "original",
)
This differs from Dublin Core (stored as MDTYPE="DC" via metadata.csv).
STATUS Attribute
- Initial Ingest: STATUS="original"
- Re-ingest: STATUS="update"
During re-ingest, an existing dmdSec with the same type is marked as superseded (its STATUS changes from "original" to "original-superseded"), and the new dmdSec is added with STATUS="update".
XML File Storage Location
XML files referenced in source-metadata.csv are also stored as files within the AIP.
data/objects/metadata/transfers/-/ead.xml
data/objects/metadata/transfers/-/mods.xml
data/objects/metadata/transfers/-/source-metadata.csv
Test 3: Metadata Update via Reingest

AIP detail screen – Showing the UUID, size, storage location, and METS file download link for metadata-test2
Purpose of the Test
Perform a Metadata re-ingest on the AIP created in Test 2 (containing EAD + MODS) and verify the following:
- When existing MODS metadata is updated, is the old metadata retained as superseded and the new metadata added as update?
- Are XML files added during re-ingest stored within the AIP?
Starting the Reingest

Re-ingest tab on the AIP detail screen – Select the Reingest type (Metadata / Partial / Full) and Processing config to execute. This time, Metadata re-ingest is executed via the API
Start the Metadata re-ingest using the Storage Service API.
curl -X POST "http://127.0.0.1:62081/api/v2/file//reingest/" \
-H "Authorization: ApiKey test:test" \
-H "Content-Type: application/json" \
-d '{
"pipeline": "",
"reingest_type": "metadata"
}'
The Storage Service returned the following response:
{
  "error": false,
  "message": "Package 7e26ac5e-... sent to pipeline ... for re-ingest",
  "reingest_uuid": "7e26ac5e-...",
  "status_code": 202
}
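For scripted use, the same call can be prepared with Python's standard library. This sketch only builds the request object and does not send it; the function name `build_reingest_request` is an assumption, and the AIP UUID and pipeline UUID are arguments you must supply, since both are elided in the curl command above.

```python
import json
import urllib.request


def build_reingest_request(storage_service_url, aip_uuid, pipeline_uuid,
                           api_key, reingest_type="metadata"):
    """Prepare a POST to the Storage Service reingest endpoint."""
    body = json.dumps({
        "pipeline": pipeline_uuid,
        "reingest_type": reingest_type,
    }).encode("utf-8")
    return urllib.request.Request(
        f"{storage_service_url}/api/v2/file/{aip_uuid}/reingest/",
        data=body,
        headers={
            "Authorization": f"ApiKey {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would submit the request; the JSON response includes the reingest_uuid field.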
Adding Metadata
When reingest starts, the AIP is extracted and submitted to Archivematica’s Ingest workflow. Before approving “Approve AIP reingest”, place new metadata files in the extracted AIP’s data/objects/metadata/ directory.
Important: During reingest, only objects/metadata/source-metadata.csv (root level) is processed. CSVs under objects/metadata/transfers/ are only read during the initial Ingest.
data/objects/metadata/
├── source-metadata.csv    <- Mapping file for reingest (newly created)
├── mods-updated.xml       <- Updated MODS (newly created)
├── dc-reingest.xml        <- Newly added DC metadata (newly created)
└── transfers/             <- Initial Ingest metadata (existing, do not modify)
    └── metadata-test2-/
        ├── ead.xml
        ├── mods.xml
        └── source-metadata.csv
source-metadata.csv (for Reingest):
filename,metadata,type
objects,dc-reingest.xml,DC-CUSTOM
objects,mods-updated.xml,MODS
Specifying the same value as an existing dmdSec in the type column (MODS) causes the existing dmdSec to become superseded and a new dmdSec to be added as update. Specifying a new type value (DC-CUSTOM) adds a new dmdSec.
Approving and Processing the Reingest
# Approve the reingest
curl -X POST "http://127.0.0.1:62080/api/ingest/reingest/approve/" \
-H "Authorization: ApiKey test:test" \
-d "uuid="
After approval, the Ingest workflow proceeds. Even with Metadata re-ingest, decision points such as Normalize and Transcribe must be passed (select manually if the automated processing config is not applied).
Results: METS.xml After Reingest
The METS.xml after reingest contained 4 dmdSec entries.
| dmdSec | STATUS | TYPE | Description |
|---|---|---|---|
| dmdSec_1 | original | PREMIS:OBJECT | SIP identification (unchanged) |
| dmdSec_2 | original | OTHER(EAD) | Initial Ingest EAD (unchanged) |
| dmdSec_3 | original-superseded | OTHER(MODS) | Initial Ingest MODS (changed to superseded) |
| dmdSec_4 | update | OTHER(MODS) | Updated MODS added via Reingest |
dmdSec_3 (old MODS -> superseded):
<mets:dmdSec ID="dmdSec_3" CREATED="2026-02-17T01:57:54" STATUS="original-superseded">
  <mets:mdWrap MDTYPE="OTHER" OTHERMDTYPE="MODS">
    <mets:xmlData>
      <mods xmlns="http://www.loc.gov/mods/v3" version="3.6">
        <titleInfo>
          <title>Test Document for MODS Metadata Validation</title>
        </titleInfo>
      </mods>
    </mets:xmlData>
  </mets:mdWrap>
</mets:dmdSec>
dmdSec_4 (new MODS -> update):
<mets:dmdSec ID="dmdSec_4" CREATED="2026-02-17T02:15:22" STATUS="update">
  <mets:mdWrap MDTYPE="OTHER" OTHERMDTYPE="MODS">
    <mets:xmlData>
      <mods xmlns="http://www.loc.gov/mods/v3" version="3.6">
        <titleInfo>
          <title>Test Document - MODS Updated via Reingest</title>
        </titleInfo>
        <originInfo>
          <dateCreated encoding="w3cdtf">2025-01-15</dateCreated>
          <dateModified encoding="w3cdtf">2025-06-01</dateModified>
        </originInfo>
        <note>This MODS record was updated via metadata reingest.</note>
      </mods>
    </mets:xmlData>
  </mets:mdWrap>
</mets:dmdSec>
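Because superseded entries remain in METS.xml, a consumer has to filter them out to find the current metadata for each type. The sketch below shows one way to do that, using the four entries from the result table as input; the function name `current_dmdsec_ids` and the dictionary layout are illustrative assumptions.

```python
def current_dmdsec_ids(dmdsecs):
    """Return the non-superseded dmdSec ID for each metadata type.

    Entries are expected in document order; a later entry of the same
    type replaces an earlier one.
    """
    current = {}
    for entry in dmdsecs:
        if entry["status"].endswith("superseded"):
            continue  # e.g. "original-superseded"
        key = entry["othermdtype"] or entry["mdtype"]
        current[key] = entry["id"]
    return current


entries = [
    {"id": "dmdSec_1", "status": "original", "mdtype": "PREMIS:OBJECT", "othermdtype": None},
    {"id": "dmdSec_2", "status": "original", "mdtype": "OTHER", "othermdtype": "EAD"},
    {"id": "dmdSec_3", "status": "original-superseded", "mdtype": "OTHER", "othermdtype": "MODS"},
    {"id": "dmdSec_4", "status": "update", "mdtype": "OTHER", "othermdtype": "MODS"},
]
print(current_dmdsec_ids(entries))
# {'PREMIS:OBJECT': 'dmdSec_1', 'EAD': 'dmdSec_2', 'MODS': 'dmdSec_4'}
```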
Changes in structMap
After reingest, the structMap references all dmdSec entries including superseded ones for the objects directory.
<mets:div TYPE="Directory" LABEL="objects" DMDID="dmdSec_2 dmdSec_3 dmdSec_4">
Additionally, metadata files added during reingest are also stored as files within the AIP.
<mets:div TYPE="Directory" LABEL="metadata">
  <mets:div TYPE="Directory" LABEL="transfers">...</mets:div>
  <mets:div TYPE="Item" LABEL="dc-reingest.xml">...</mets:div>
  <mets:div TYPE="Item" LABEL="mods-updated.xml">...</mets:div>
  <mets:div TYPE="Item" LABEL="source-metadata.csv">...</mets:div>
</mets:div>
Why DC-CUSTOM Was Not Included in dmdSec
dc-reingest.xml (in Dublin Core Terms format) was stored as a file within the AIP but was not stored in dmdSec. The following error was recorded in the MCP Client log.
XML validation schema not found for keys: ['http://purl.org/dc/terms/', 'dcterms']
The Dublin Core registered in the XML Validation configuration was only the OAI-PMH format (http://www.openarchives.org/OAI/2.0/oai_dc/), and the DC Terms format (http://purl.org/dc/terms/) was not registered. Since XML Validation registration rules are based on namespace URIs, even the same Dublin Core requires separate registration if the namespace differs.
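By analogy with the EAD entry added in Test 2, registering the DC Terms namespace with None should allow such a file to be stored without validation. This was not tested in this guide, so treat the following configuration fragment as an untested assumption.

```python
XML_VALIDATION = {
    # ... existing settings ...
    "urn:isbn:1-931666-22-9": None,     # EAD entry added in Test 2
    "http://purl.org/dc/terms/": None,  # DC Terms: store in dmdSec without validation (untested)
}
```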
Summary
Summary of Test Results
| Test Item | Result |
|---|---|
| MODS metadata storage in dmdSec | Success |
| EAD metadata storage in dmdSec | Success (after adding XML Validation configuration) |
| Associating multiple metadata with the same object | Success (simultaneous EAD + MODS storage) |
| dmdSec association in structMap | Correct (DMDID="dmdSec_2 dmdSec_3") |
| MODS metadata update via Reingest | Success (old: original-superseded, new: update) |
| structMap update after Reingest | Correct (DMDID="dmdSec_2 dmdSec_3 dmdSec_4") |
| Storage of files added via Reingest within AIP | Success |
| Storage of unregistered namespace DC Terms | Failed (not registered in XML Validation configuration) |
Important Notes
- XML Validation Configuration: In environments where XML Validation is enabled, the namespaces of the metadata schemas you use must be registered in the validation configuration file. If unregistered, metadata is silently skipped. Errors are only recorded in logs, and the Ingest / Re-ingest process itself continues.
- MDTYPE Handling: Metadata via source-metadata.csv is always stored as MDTYPE="OTHER". It is not stored as METS-standard MDTYPE values like MDTYPE="MODS" or MDTYPE="EAD".
- source-metadata.csv Location During Reingest: During initial Ingest, metadata/transfers/<transfer-name>/source-metadata.csv is used, but during Reingest, only metadata/source-metadata.csv (root level) is processed.
- Metadata Versioning via the type Column: The type column in source-metadata.csv functions as an identifier for metadata updates during Reingest. Using the same type value causes the existing dmdSec to become superseded, while using a new type value adds a new dmdSec.
- Namespace-Based Registration: Even for the same metadata standard (e.g., Dublin Core), if the namespace URI used differs, each must be registered separately in the XML Validation configuration.