Overview

Odeuropa is a project that extracts descriptions of scents from European historical documents and structures them as Linked Data. This article explores the actual data through the project's SPARQL endpoint to show how the data is structured and the design thinking behind it.

What is Odeuropa?

Data Model Overview

Odeuropa uses an extended ontology specialized for scents, built on top of CIDOC-CRM (Conceptual Reference Model for Cultural Heritage).

Key Concepts and Relationships

Source (Document)
  ↓ P106_is_composed_of
Fragment (Text Fragment)
  ↓ P67_refers_to
  ├─ Emission (Emission Event) ←── Central Hub
  │    ├─ F3_had_source → Object (Source of Scent)
  │    └─ F1_generated → Smell (Scent)
  ├─ Smell (Scent)
  └─ Experience (Experience Event)
       └─ F2_perceived → Smell (Scent)

Key points:

  • Fragment directly references Emission, Smell, and Experience
  • Object is accessed through Emission (Fragment -> Emission -> Object)
  • Emission plays the central role of causally connecting Object and Smell
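The access paths in the key points can be sketched in a few lines of Python. This is a minimal illustration, not Odeuropa code: the triples are hand-written stand-ins, the entity names (source1, fragment1, and so on) are hypothetical, and only the property names come from the diagram.

```python
# Hand-written stand-in triples; entity names are hypothetical placeholders.
TRIPLES = [
    ("source1", "P106_is_composed_of", "fragment1"),
    ("fragment1", "P67_refers_to", "emission1"),
    ("fragment1", "P67_refers_to", "smell1"),
    ("fragment1", "P67_refers_to", "experience1"),
    ("emission1", "F3_had_source", "object1"),
    ("emission1", "F1_generated", "smell1"),
    ("experience1", "F2_perceived", "smell1"),
]


def objects_of(subject, predicate, triples=TRIPLES):
    """Return every object o such that (subject, predicate, o) is in triples."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def scent_sources(fragment, triples=TRIPLES):
    """Follow Fragment -> P67_refers_to -> Emission -> F3_had_source -> Object."""
    sources = []
    for node in objects_of(fragment, "P67_refers_to", triples):
        # Only Emission nodes carry F3_had_source, so other nodes add nothing.
        sources.extend(objects_of(node, "F3_had_source", triples))
    return sources


print(scent_sources("fragment1"))  # ['object1']
```

The Object is never reached from the Fragment in a single step; it always takes the detour through an Emission.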

Learning the Data Structure Through Examples

Let us examine the data structure using a German agricultural book published in 1810, “Grundsätze der rationellen Landwirthschaft” (Principles of Rational Agriculture), as an example.

1. Source (Document)

An entity that stores basic information about the document.

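# Assumes the crm: and schema: prefixes are declared for the endpoint.
SELECT ?s ?author ?date ?language
WHERE {
  ?s a crm:E33_Linguistic_Object ;
     rdfs:label "Grundsätze der rationellen Landwirthschaft"@de ;
     schema:author ?author ;
     schema:dateCreated ?date ;
     schema:inLanguage ?language .
}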

Key properties:

  • rdfs:label: Title
  • schema:author: Author (Albrecht Daniel Thaer)
  • schema:dateCreated: Year of creation (1810)
  • schema:inLanguage: Language (de)
  • schema:genre: Genre (Household texts & recipes)
  • schema:locationCreated: Place of creation
  • P106_is_composed_of: Contained fragments
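A query against the endpoint can also be sent from a script over the standard SPARQL protocol. The sketch below uses only Python's standard library. The endpoint URL is an assumption that this article does not state, so check it against the project's documentation; build_request and run_query are illustrative helper names.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint URL; replace with the address given by the project.
ENDPOINT = "https://data.odeuropa.eu/repositories/odeuropa"

QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?s WHERE {
  ?s rdfs:label "Grundsätze der rationellen Landwirthschaft"@de .
}
"""


def build_request(query, endpoint=ENDPOINT):
    """Build a SPARQL-protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def run_query(query, endpoint=ENDPOINT):
    """Send the query and return the list of result bindings."""
    with urllib.request.urlopen(build_request(query, endpoint), timeout=30) as resp:
        return json.load(resp)["results"]["bindings"]


# Building the request does not touch the network; run_query would.
request = build_request(QUERY)
print(request.get_method(), request.full_url[:60])
```

Calling run_query(QUERY) would perform the actual HTTP request and return one dictionary per result row.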

2. Fragment (Text Fragment)

A portion of text containing descriptions related to scents.

<Fragment> rdf:value "Sie ſind daher angefeuchtet ſchluͤpfrig und dehnbarer,
             geben einen Thongeruch von ſich, und trocknen zu feſten
             doch mehr zerreiblichen Klumpen zuſammen." ;
  schema:position 4 ;
  P106_is_composed_of "Sie", "Thongeruch" ;
  P67_refers_to <Emission>, <Smell>, <Experience> ;
  P165i_is_incorporated_in <Source> .

Meaning of the text: “When moistened, they (the clays) are therefore slippery and more ductile, give off a clay smell, and dry into firm but more crumbly lumps.”

Key properties:

  • rdf:value: Actual text content
  • schema:position: Position within the document (4th fragment)
  • P106_is_composed_of: Important words contained (“Sie”, “Thongeruch”)
  • P67_refers_to: Referenced concepts (Emission, Smell, Experience)
  • P165i_is_incorporated_in: The document (Source) that this fragment belongs to

3. Emission (Scent Emission Event)

Represents the event in which a scent is emitted. An Emission is referenced by a Fragment and connects an Object to a Smell.

<Emission> a L12_Smell_Emission ;
  F3_had_source <Object> ;
  F1_generated <Smell> ;
  P92_brought_into_existence <Smell> ;
  P12_occurred_in_the_presence_of <Object>, <Smell> ;
  time:hasTime <Time> ;
  P67i_is_referred_to_by <Fragment> .

Key properties:

  • F3_had_source: Source of the scent (Object “Sie”)
  • F1_generated: Generated scent (Smell “Thongeruch”)
  • P92_brought_into_existence: The scent brought into existence
  • P12_occurred_in_the_presence_of: Entities present during the event (Object and Smell)
  • time:hasTime: Time of occurrence (1810)
  • P67i_is_referred_to_by: Fragment referencing this Emission

Role of Emission: Emission is the central event expressing the causal relationship of “which Object (source) generated which Smell (scent) and when.”
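The statement an Emission encodes can be pictured as a small record. The following Python sketch is only an illustration: the Emission class and its field names are hypothetical, while the two labels and the year are the ones from this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Emission:
    """One emission event: which Object generated which Smell, and when."""
    source_label: str  # label of the Object reached via F3_had_source
    smell_label: str   # label of the Smell reached via F1_generated
    year: int          # year taken from time:hasTime


def describe(emission):
    """Render the causal statement an Emission encodes as a sentence."""
    return (f'In {emission.year}, "{emission.source_label}" '
            f'generated the smell "{emission.smell_label}".')


clay = Emission(source_label="Sie", smell_label="Thongeruch", year=1810)
print(describe(clay))
# In 1810, "Sie" generated the smell "Thongeruch".
```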

4. Object (Source of Scent)

The object or substance that emits a scent. Object is referenced from Emission.

<Object> a S10_Material_Substantial, S15_Observable_Entity ;
  rdfs:label "Sie" ;
  P12i_was_present_at <Emission> .

In this example, “Sie” (they) is a pronoun referring to clay or soil.

Types:

  • S10_Material_Substantial: Material substance
  • S15_Observable_Entity: Observable entity

Key properties:

  • rdfs:label: Object name (“Sie”)
  • P12i_was_present_at: Event where this object was present (Emission)

Connection path:

Fragment → (P67_refers_to) → Emission → (F3_had_source) → Object

5. Smell (Scent)

The central concept representing a scent itself.

<Smell> rdfs:label "Thongeruch" ;
  P92i_was_brought_into_existence_by <Emission> ;
  P140i_was_attributed_by <Experience> .

Key properties:

  • rdfs:label: Name of the scent (Thongeruch = clay smell)
  • P92i_was_brought_into_existence_by: Emission that generated this scent
  • P140i_was_attributed_by: Experience that perceived this scent
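Both properties on Smell are the inverse directions (marked by the "i" after the property number) of properties that this article lists on Emission and Experience. A minimal Python sketch of that pairing follows; the entity identifiers are hypothetical, and the sketch makes no claim about how the dataset itself produces the inverse statements.

```python
# Forward property -> inverse property, as named in this article.
INVERSES = {
    "P92_brought_into_existence": "P92i_was_brought_into_existence_by",
    "P140_assigned_attribute_to": "P140i_was_attributed_by",
    "P67_refers_to": "P67i_is_referred_to_by",
    "P12_occurred_in_the_presence_of": "P12i_was_present_at",
}


def with_inverses(triples):
    """Return the triples plus one inverse triple for each known property."""
    result = list(triples)
    for s, p, o in triples:
        if p in INVERSES:
            # Swap subject and object and use the inverse property name.
            result.append((o, INVERSES[p], s))
    return result


# Hypothetical identifiers standing in for the real entity IRIs.
forward = [
    ("emission1", "P92_brought_into_existence", "smell1"),
    ("experience1", "P140_assigned_attribute_to", "smell1"),
]
for triple in with_inverses(forward):
    print(triple)
```

Because both directions are available, a query can start from the Smell and reach its Emission or Experience just as easily as the other way round.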

6. Experience (Scent Experience Event)

An event where a person perceives or experiences a scent.

<Experience> F2_perceived <Smell> ;
  O8_observed <Smell> ;
  P140_assigned_attribute_to <Smell> ;
  time:hasTime <Time> .

Key properties:

  • F2_perceived: Perceived scent
  • O8_observed: Observed scent
  • P140_assigned_attribute_to: Target to which attributes were assigned
  • time:hasTime: Time of the experience
  • P14_carried_out_by: Experiencer (Actor); not shown in the excerpt above

Data Flow: The Complete Story

1810, Germany

    Document "Principles of Rational Agriculture" (Source)
         ↓ P106_is_composed_of
    Text Fragment (Fragment)
      "Sie...geben einen Thongeruch von sich..."
         ↓ P67_refers_to
         ├─────────────┬─────────────┐
         ↓             ↓             ↓
    Emission      Smell         Experience
         ├─ F3_had_source ─→ Substance (Object): "Sie" = Clay
         └─ F1_generated ──→ Smell: "Thongeruch"
                                     │ F2_perceived
                               Experience
                               Observer: Thaer

Data flow explanation:

  1. Fragment directly references three concepts (Emission, Smell, Experience)
  2. Emission is the hub of the causal relationship:
    • It takes an Object as its source (F3_had_source)
    • It generates a Smell (F1_generated)
  3. Experience perceives the Smell
  4. Object is indirectly connected to Fragment via Emission

SPARQL Query Examples

Searching by Language

To match a label exactly, the literal in the query must carry the same language tag as the data (here @de for German):

SELECT ?s ?label
WHERE {
  ?s rdfs:label "Grundsätze der rationellen Landwirthschaft"@de ;
     rdfs:label ?label .
}
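In SPARQL, a plain literal and a language-tagged literal are different RDF terms, so "..." does not match "..."@de. When such queries are generated from a script, a small helper keeps the tag and the escaping consistent. The Python sketch below is one way to do that; lang_literal and label_query are hypothetical helper names.

```python
def lang_literal(text, lang):
    """Format text as a SPARQL language-tagged string literal, e.g. "x"@de."""
    # Escape backslashes first, then double quotes and line breaks.
    escaped = (text.replace("\\", "\\\\")
                   .replace('"', '\\"')
                   .replace("\n", "\\n")
                   .replace("\r", "\\r"))
    return f'"{escaped}"@{lang}'


def label_query(title, lang):
    """Build a query that finds resources by an exact, language-tagged label."""
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?s ?label WHERE {\n"
        f"  ?s rdfs:label {lang_literal(title, lang)} ;\n"
        "     rdfs:label ?label .\n"
        "}"
    )


print(label_query("Grundsätze der rationellen Landwirthschaft", "de"))
```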

Retrieving Visual Items with Images

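Avoiding duplicate rows when an item has multiple images, by grouping per item and picking one image with SAMPLE:

# Assumes the crm: and schema: prefixes are declared for the endpoint.
SELECT ?s ?label (SAMPLE(?img) AS ?image)
WHERE {
  ?s a crm:E36_Visual_Item ;
     schema:image ?img ;
     rdfs:label ?label .
}
GROUP BY ?s ?label
LIMIT 100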
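When an item has several images, an ungrouped result set contains one row per item and image. The same collapsing can be done on the client side. The Python sketch below keeps the first image seen for each item; the rows are hypothetical sample data, not results from the endpoint.

```python
def one_image_per_item(rows):
    """Keep the first row seen for each item; rows are (item, label, image)."""
    first_seen = {}
    for item, label, image in rows:
        # setdefault stores a value only if the key is not present yet.
        first_seen.setdefault(item, (item, label, image))
    return list(first_seen.values())


# Hypothetical result rows: item1 has two images, item2 has one.
rows = [
    ("item1", "Still life", "a.jpg"),
    ("item1", "Still life", "b.jpg"),
    ("item2", "Market scene", "c.jpg"),
]
print(one_image_per_item(rows))
```

Grouping in the query itself is usually preferable because LIMIT then counts items rather than item and image pairs.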

Retrieving Scents and Their Sources

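# Namespace IRI assumed; confirm it against the Odeuropa ontology documentation.
PREFIX od: <http://data.odeuropa.eu/ontology/>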

SELECT ?smell ?smellLabel ?object ?objectLabel
WHERE {
  ?emission a od:L12_Smell_Emission ;
            od:F3_had_source ?object ;
            od:F1_generated ?smell .
  ?smell rdfs:label ?smellLabel .
  ?object rdfs:label ?objectLabel .
}
LIMIT 100
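Endpoints that implement the W3C SPARQL 1.1 JSON results format return each row as a dictionary keyed by variable name. The Python sketch below pulls the two labels out of such a response. The sample response is hand-written with example.org IRIs and is not real Odeuropa output.

```python
import json

# Hypothetical response in the W3C SPARQL 1.1 JSON results format.
SAMPLE = """
{
  "head": {"vars": ["smell", "smellLabel", "object", "objectLabel"]},
  "results": {"bindings": [
    {"smell": {"type": "uri", "value": "http://example.org/smell/1"},
     "smellLabel": {"type": "literal", "value": "Thongeruch"},
     "object": {"type": "uri", "value": "http://example.org/object/1"},
     "objectLabel": {"type": "literal", "value": "Sie"}}
  ]}
}
"""


def label_pairs(response_text):
    """Extract (smell label, source label) pairs from a JSON result set."""
    bindings = json.loads(response_text)["results"]["bindings"]
    return [(b["smellLabel"]["value"], b["objectLabel"]["value"])
            for b in bindings]


print(label_pairs(SAMPLE))  # [('Thongeruch', 'Sie')]
```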

Ontologies Used

CIDOC-CRM

  • E33_Linguistic_Object: Linguistic object (document)
  • E36_Visual_Item: Visual item
  • E39_Actor: Person (author, observer)
  • E53_Place: Place
  • E77_Persistent_Item: Persistent item
  • P67_refers_to: Refers to
  • P106_is_composed_of: Is composed of
  • P140_assigned_attribute_to: Assigned attribute to

CRMsci (Scientific Observation Extension)

  • S10_Material_Substantial: Material substance
  • S15_Observable_Entity: Observable entity
  • O8_observed: Observed

Odeuropa Custom Extensions

  • L12_Smell_Emission: Smell emission
  • F1_generated: Generated
  • F2_perceived: Perceived
  • F3_had_source: Had source

Schema.org

  • schema:author: Author
  • schema:dateCreated: Date created
  • schema:inLanguage: Language
  • schema:genre: Genre
  • schema:image: Image
  • schema:position: Position

Significance of the Project

The Odeuropa project is groundbreaking in the following respects:

  1. Digitization of sensory data: Structuring “scent,” a type of sensory information that was previously difficult to digitize
  2. Application to historical research: Enabling analysis of which scents people in the past perceived, and how they perceived them
  3. Linked Data in practice: Implementing advanced Semantic Web technology using CIDOC-CRM
  4. Interdisciplinary approach: Combining history, information science, and sensory studies

Summary

The Odeuropa database is an ambitious project that leverages text mining, ontology design, and Linked Data technology to extract and structure the abstract concept of “scent” from historical documents.

Built on the established CIDOC-CRM cultural heritage ontology while adding scent-specific concepts (Emission, Experience), it achieves a reusable and highly extensible data model.

This approach can also be applied to the digitization of other sensory information (sound, taste, touch, etc.), demonstrating new possibilities for digital humanities.

References