Developing an RDF Metadata Management System Integrating GakuNin RDM and Dydra

Overview

This article describes a metadata management system for research data that integrates GakuNin RDM (Research Data Management) with the Dydra RDF database. The system handles file management for research projects together with the registration and search of Dublin Core metadata in a single application.

System Overview

Architecture

┌─────────────────┐
│   Next.js 14    │
│   (App Router)  │
└────────┬────────┘
         │
    ┌────┴────┐
    │         │
┌───▼───┐ ┌──▼─────┐
│GakuNin│ │ Dydra  │
│  RDM  │ │  RDF   │
│  API  │ │   DB   │
└───────┘ └────────┘

Technology stack:

Next.js 14 (App Router)
NextAuth.js (OAuth 2.0 authentication)
Dydra (RDF database)
GakuNin RDM API
SPARQL (query language)

1. Integration with GakuNin RDM

1.1 OAuth 2.0 Authentication Implementation

GakuNin RDM supports OAuth 2.0 authentication. We implemented this using NextAuth.js.

// src/app/api/auth/[...nextauth]/authOptions.ts
import type { NextAuthOptions } from "next-auth";

export const authOptions: NextAuthOptions = {
  providers: [
    {
      id: "gakunin",
      name: "GakuNin RDM",
      type: "oauth",
      clientId: process.env.GAKUNIN_CLIENT_ID,
      clientSecret: process.env.GAKUNIN_CLIENT_SECRET,
      authorization: {
        url: "https://accounts.rdm.nii.ac.jp/oauth2/authorize",
        params: {
          scope: "osf.full_read osf.full_write",
          response_type: "code",
        },
      },
      token: "https://accounts.rdm.nii.ac.jp/oauth2/token",
      userinfo: "https://api.rdm.nii.ac.jp/v2/users/me/",
    },
  ],
};

1.2 Automatic Token Refresh

To maintain long-running sessions, we implemented automatic access token refresh.

async jwt({ token, account, user }) {
  // On initial login
  if (account) {
    token.accessToken = account.access_token;
    token.refreshToken = account.refresh_token;
    token.expiresAt = account.expires_at;
  }

  // Token expiration check (refresh 5 minutes before expiry)
  if (token.expiresAt) {
    const currentTime = Math.floor(Date.now() / 1000);
    const shouldRefresh = currentTime >= token.expiresAt - 300;

    if (shouldRefresh && token.refreshToken) {
      // Obtain new access token using refresh token
      const refreshedTokens = await refreshAccessToken(token.refreshToken);
      return {
        ...token,
        accessToken: refreshedTokens.access_token,
        refreshToken: refreshedTokens.refresh_token ?? token.refreshToken,
        expiresAt: Math.floor(Date.now() / 1000) + refreshedTokens.expires_in,
      };
    }
  }

  return token;
}
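
The callback above calls refreshAccessToken, which the article does not show. The following is a minimal sketch, assuming the GakuNin RDM token endpoint accepts a standard OAuth 2.0 refresh_token grant with the client credentials in the form body. The RefreshDeps type, the injected fetchImpl, and the helpers shouldRefresh and buildRefreshBody are hypothetical names introduced here for illustration and testability; they are not part of the original code.

```typescript
// Sketch only: the real helper is not shown in the article.
interface RefreshedTokens {
  access_token: string;
  refresh_token?: string;
  expires_in: number; // lifetime in seconds
}

interface MinimalResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}

// Hypothetical dependency bundle so the helper can run without network access in tests.
interface RefreshDeps {
  clientId: string;
  clientSecret: string;
  fetchImpl: (
    url: string,
    init: { method: string; headers: Record<string, string>; body: string }
  ) => Promise<MinimalResponse>;
}

const TOKEN_URL = "https://accounts.rdm.nii.ac.jp/oauth2/token";

// Same rule as the jwt callback: refresh once we are within the margin of expiry.
export function shouldRefresh(
  expiresAt: number,
  nowSeconds: number,
  marginSeconds = 300
): boolean {
  return nowSeconds >= expiresAt - marginSeconds;
}

// Form-encoded body for the refresh_token grant.
export function buildRefreshBody(
  refreshToken: string,
  clientId: string,
  clientSecret: string
): string {
  return new URLSearchParams({
    grant_type: "refresh_token",
    refresh_token: refreshToken,
    client_id: clientId,
    client_secret: clientSecret,
  }).toString();
}

export async function refreshAccessToken(
  refreshToken: string,
  deps: RefreshDeps
): Promise<RefreshedTokens> {
  const response = await deps.fetchImpl(TOKEN_URL, {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: buildRefreshBody(refreshToken, deps.clientId, deps.clientSecret),
  });
  if (!response.ok) {
    throw new Error(`Token refresh failed with status ${response.status}`);
  }
  const data = (await response.json()) as Partial<RefreshedTokens>;
  if (typeof data.access_token !== "string" || typeof data.expires_in !== "number") {
    throw new Error("Token endpoint returned an unexpected payload");
  }
  return {
    access_token: data.access_token,
    refresh_token: data.refresh_token,
    expires_in: data.expires_in,
  };
}
```

In the callback, the one-argument call would become something like refreshAccessToken(token.refreshToken, { clientId, clientSecret, fetchImpl: fetch }), wrapped in a try/catch so that a failed refresh can force a new login rather than crash the session.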

1.3 Retrieving Projects and Files

User project lists and files are retrieved through the GakuNin RDM API.

// Retrieve project list
const response = await fetch(
  "https://api.rdm.nii.ac.jp/v2/users/me/nodes/?filter[category]=project",
  {
    headers: {
      Authorization: `Bearer ${session.accessToken}`,
    },
  }
);

// Retrieve files from storage
const filesResponse = await fetch(
  `https://api.rdm.nii.ac.jp/v2/nodes/${projectId}/files/${provider}/`,
  {
    headers: {
      Authorization: `Bearer ${session.accessToken}`,
    },
  }
);
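
The responses still have to be unpacked before they can be shown in the UI. The sketch below assumes the API returns JSON:API documents (a data array of resources with id and attributes, plus links.next for pagination), as the OSF v2 API that GakuNin RDM derives from does. The type and function names are hypothetical, and only the fields read here are typed.

```typescript
// Assumed JSON:API shape; verify against an actual response before relying on it.
interface NodeResource {
  id: string;
  attributes: { title: string };
}

interface NodeListResponse {
  data: NodeResource[];
  links?: { next?: string | null };
}

interface ProjectSummary {
  id: string;
  title: string;
}

// Reduce each node resource to the fields the project list needs.
export function extractProjects(payload: NodeListResponse): ProjectSummary[] {
  return payload.data.map((node) => ({
    id: node.id,
    title: node.attributes.title,
  }));
}

// URL of the next page, or null when this is the last page.
export function nextPageUrl(payload: NodeListResponse): string | null {
  return payload.links?.next ?? null;
}
```

A caller would typically loop, fetching nextPageUrl(payload) until it returns null, so that projects beyond the first page are not silently dropped.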

2. Using the Dydra RDF Database

2.1 Managing Private Data

Dydra is primarily a public RDF database, but private data management is possible by using API tokens.

Environment variable configuration:

DYDRA_ACCOUNT=your-account
DYDRA_REPOSITORY=your-repository
DYDRA_API_TOKEN=your-secret-token

Queries using API tokens:

const response = await fetch(
  `https://dydra.com/${account}/${repository}/sparql`,
  {
    method: 'POST',
    headers: {
      'Accept': 'application/sparql-results+json',
      'Authorization': `Bearer ${process.env.DYDRA_API_TOKEN}`,
    },
    body: new URLSearchParams({
      query: sparqlQuery,
    }),
  }
);

2.2 Data Separation with Named Graphs

Named Graphs are used to logically separate data for each project.

Named Graph design:

Resource URIs obtained from the GakuNin RDM API are used directly as Named Graph URIs, making the correspondence between data origin and graph clear.

Unified graph: https://api.rdm.nii.ac.jp/v2/nodes/{projectId}/
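
Because the graph URI is derived from a project ID that ultimately comes from a request, it helps to build it in one place and validate the ID first. This is a minimal sketch; the function names are hypothetical, and the alphanumeric ID pattern is an assumption that should be adjusted if your project IDs look different.

```typescript
const RDM_NODES_BASE = "https://api.rdm.nii.ac.jp/v2/nodes/";

// Assumption: project IDs are short alphanumeric strings.
const PROJECT_ID_PATTERN = /^[A-Za-z0-9]+$/;

// Build the Named Graph URI for a project, rejecting IDs that could
// break out of the <...> IRI syntax in a SPARQL query.
export function projectGraphUri(projectId: string): string {
  if (!PROJECT_ID_PATTERN.test(projectId)) {
    throw new Error(`Invalid project ID: ${projectId}`);
  }
  return `${RDM_NODES_BASE}${projectId}/`;
}

// Inverse mapping: recover the project ID from a graph URI, or null if it does not match.
export function projectIdFromGraphUri(graphUri: string): string | null {
  const match = /^https:\/\/api\.rdm\.nii\.ac\.jp\/v2\/nodes\/([A-Za-z0-9]+)\/$/.exec(graphUri);
  return match?.[1] ?? null;
}
```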

By consolidating all project-related data (metadata, SKOS subjects, profiles) into a single Named Graph, the following benefits are achieved:

Simple queries: No need to combine multiple graphs with UNION
Efficient search: Cross-searching of metadata and SKOS subjects is easy
Data consistency: All data is managed within the same graph
Interoperability with external systems: Direct correspondence between RDM resource URIs and RDF graphs
Simplified management: Reduced operational overhead through unified graph URIs

For example, when metadata references a SKOS concept via dc:subject, the subject label can be retrieved with a direct JOIN since they are in the same graph:

SELECT ?file ?title ?subjectLabel
FROM <https://api.rdm.nii.ac.jp/v2/nodes/{projectId}/>
WHERE {
  ?file dc:title ?title ;
        dc:subject ?subject .
  ?subject skos:prefLabel ?subjectLabel .
}

Specifying Named Graph during metadata registration:

const graphUri = `https://api.rdm.nii.ac.jp/v2/nodes/${projectId}/`;

const insertQuery = `
  PREFIX dc: <http://purl.org/dc/elements/1.1/>
  PREFIX dcterms: <http://purl.org/dc/terms/>

  INSERT DATA {
    GRAPH <${graphUri}> {
      <${resourceUri}> a dcterms:BibliographicResource ;
        dc:title "${metadata.title}" ;
        dc:creator "${metadata.creator}" ;
        dc:description "${metadata.description}" ;
        dc:subject <${metadata.subject}> .
    }
  }
`;

Per-project search:

const graphUri = `https://api.rdm.nii.ac.jp/v2/nodes/${projectId}/`;

const searchQuery = `
  PREFIX dc: <http://purl.org/dc/elements/1.1/>
  PREFIX dcterms: <http://purl.org/dc/terms/>

  SELECT ?resource ?title ?creator ?description
  FROM <${graphUri}>
  WHERE {
    ?resource a dcterms:BibliographicResource ;
      dc:title ?title ;
      dc:creator ?creator .
    OPTIONAL { ?resource dc:description ?description }
    FILTER(CONTAINS(LCASE(?title), LCASE("${keyword}")))
  }
`;
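
When the query is sent with Accept: application/sparql-results+json, the endpoint answers in the W3C SPARQL 1.1 Query Results JSON Format, where each row is a map from variable name to a term object. A small sketch of flattening that into plain rows follows; the function name flattenBindings is hypothetical.

```typescript
// Term and result shapes follow the SPARQL 1.1 Query Results JSON Format.
interface SparqlTerm {
  type: "uri" | "literal" | "bnode";
  value: string;
  "xml:lang"?: string;
  datatype?: string;
}

interface SparqlSelectResults {
  head: { vars: string[] };
  results: { bindings: Array<Record<string, SparqlTerm>> };
}

// Convert bindings into flat rows keyed by variable name.
// Variables left unbound by OPTIONAL are absent from a binding and come out as undefined.
export function flattenBindings(
  json: SparqlSelectResults
): Array<Record<string, string | undefined>> {
  return json.results.bindings.map((binding) => {
    const row: Record<string, string | undefined> = {};
    for (const name of json.head.vars) {
      row[name] = binding[name]?.value;
    }
    return row;
  });
}
```

This drops language tags and datatypes, which is usually fine for a result list but not for round-tripping the data.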

2.3 Dublin Core Metadata Schema

All 15 Dublin Core elements are fully supported.

interface DublinCoreMetadata {
  title: string;        // dc:title
  creator: string;      // dc:creator
  subject: string;      // dc:subject
  description: string;  // dc:description
  publisher?: string;   // dc:publisher
  contributor?: string; // dc:contributor
  date?: string;        // dc:date
  type?: string;        // dc:type
  format?: string;      // dc:format
  identifier?: string;  // dc:identifier
  source?: string;      // dc:source
  language?: string;    // dc:language
  relation?: string;    // dc:relation
  coverage?: string;    // dc:coverage
  rights?: string;      // dc:rights
}

3. Metadata Registration and Search

3.1 Metadata Registration Flow

1. Retrieve file information from GakuNin RDM
2. User enters Dublin Core metadata
3. Convert to RDF triples
4. Register to Dydra via SPARQL UPDATE (using Named Graph)

Registration API implementation:

export async function POST(request: NextRequest) {
  const session = await getServerSession(authOptions);
  if (!session?.accessToken) {
    return NextResponse.json({ error: "Authentication required" }, { status: 401 });
  }

  const metadata = await request.json();

  // Use GakuNin RDM API project URI as Named Graph URI
  const graphUri = `https://api.rdm.nii.ac.jp/v2/nodes/${metadata.projectId}/`;

  // Build SPARQL INSERT query
  const insertQuery = buildInsertQuery(metadata, graphUri);

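  // Register to Dydra
  const account = process.env.DYDRA_ACCOUNT;
  const repository = process.env.DYDRA_REPOSITORY;
  const response = await fetch(
    `https://dydra.com/${account}/${repository}/sparql`,
    {
      method: 'POST',
      headers: {
        'Content-Type': 'application/sparql-update',
        'Authorization': `Bearer ${process.env.DYDRA_API_TOKEN}`,
      },
      body: insertQuery,
    }
  );

  // Report Dydra failures instead of always returning success
  if (!response.ok) {
    return NextResponse.json(
      { error: "Failed to register metadata" },
      { status: 502 }
    );
  }

  return NextResponse.json({ success: true });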
}
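
The handler calls buildInsertQuery, whose body is not shown in the article. One possible shape is sketched below, under several assumptions: the request carries a resourceUri field identifying the file (a hypothetical name), dc:subject is an IRI pointing at a SKOS concept as in the earlier INSERT example, and all other elements are plain string literals. Unlike direct string interpolation, the sketch escapes literals and rejects unsafe IRIs so that user input cannot alter the structure of the update.

```typescript
// Literal-valued Dublin Core elements; dc:subject is handled separately as an IRI.
const DC_ELEMENTS = [
  "title", "creator", "description", "publisher", "contributor", "date", "type",
  "format", "identifier", "source", "language", "relation", "coverage", "rights",
] as const;

type DcElement = (typeof DC_ELEMENTS)[number];

interface RegistrationInput extends Partial<Record<DcElement, string>> {
  resourceUri: string; // hypothetical field: IRI of the file being described
  subject?: string;    // IRI of a SKOS concept
}

// Escape characters that would terminate or corrupt a double-quoted SPARQL literal.
function escapeLiteral(value: string): string {
  return value
    .replace(/\\/g, "\\\\")
    .replace(/"/g, '\\"')
    .replace(/\n/g, "\\n")
    .replace(/\r/g, "\\r")
    .replace(/\t/g, "\\t");
}

// Reject characters that the SPARQL grammar does not allow inside <...>.
function assertIri(value: string): string {
  if (/[\u0000-\u0020<>"{}|\\^`]/.test(value)) {
    throw new Error(`Unsafe IRI: ${value}`);
  }
  return value;
}

export function buildInsertQuery(input: RegistrationInput, graphUri: string): string {
  const predicates: string[] = ["a dcterms:BibliographicResource"];
  for (const element of DC_ELEMENTS) {
    const value = input[element];
    if (value) {
      predicates.push(`dc:${element} "${escapeLiteral(value)}"`);
    }
  }
  if (input.subject) {
    predicates.push(`dc:subject <${assertIri(input.subject)}>`);
  }
  return [
    "PREFIX dc: <http://purl.org/dc/elements/1.1/>",
    "PREFIX dcterms: <http://purl.org/dc/terms/>",
    "INSERT DATA {",
    `  GRAPH <${assertIri(graphUri)}> {`,
    `    <${assertIri(input.resourceUri)}> ${predicates.join(" ;\n      ")} .`,
    "  }",
    "}",
  ].join("\n");
}
```

Empty or missing optional elements are skipped, so only the fields the user filled in become triples.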

3.2 Advanced Search Features

Keyword search across multiple Dublin Core fields is implemented.

PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX dcterms: <http://purl.org/dc/terms/>

SELECT DISTINCT ?resource ?title ?creator ?subject ?description
FROM <https://api.rdm.nii.ac.jp/v2/nodes/{projectId}/>
WHERE {
  ?resource a dcterms:BibliographicResource .
  OPTIONAL { ?resource dc:title ?title }
  OPTIONAL { ?resource dc:creator ?creator }
  OPTIONAL { ?resource dc:subject ?subject }
  OPTIONAL { ?resource dc:description ?description }

  # Search keyword across multiple fields
  FILTER(
    CONTAINS(LCASE(?title), LCASE("keyword")) ||
    CONTAINS(LCASE(?creator), LCASE("keyword")) ||
    CONTAINS(LCASE(?subject), LCASE("keyword")) ||
    CONTAINS(LCASE(?description), LCASE("keyword"))
  )
}
ORDER BY ?title
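
Two details of this filter are worth handling when it is generated from user input. CONTAINS accepts only string literals, so a dc:subject stored as an IRI never matches unless it is wrapped in STR(), and the keyword itself must be escaped before it is placed inside quotes. The sketch below builds the FILTER clause in TypeScript; buildKeywordFilter and its defaults are hypothetical, and COALESCE is used so that variables left unbound by OPTIONAL are compared as empty strings.

```typescript
const SEARCH_FIELDS = ["title", "creator", "subject", "description"] as const;

// Only plain identifiers are accepted as SPARQL variable names here.
const VARIABLE_NAME = /^[A-Za-z_][A-Za-z0-9_]*$/;

// Escape characters that would terminate or corrupt a double-quoted SPARQL literal.
function escapeLiteral(value: string): string {
  return value
    .replace(/\\/g, "\\\\")
    .replace(/"/g, '\\"')
    .replace(/\n/g, "\\n")
    .replace(/\r/g, "\\r")
    .replace(/\t/g, "\\t");
}

// Build a case-insensitive "any field contains keyword" FILTER clause.
export function buildKeywordFilter(
  keyword: string,
  fields: readonly string[] = SEARCH_FIELDS
): string {
  const needle = `LCASE("${escapeLiteral(keyword)}")`;
  const clauses = fields.map((field) => {
    if (!VARIABLE_NAME.test(field)) {
      throw new Error(`Invalid variable name: ${field}`);
    }
    // STR() turns IRIs into strings; COALESCE() maps unbound variables to "".
    return `CONTAINS(LCASE(STR(COALESCE(?${field}, ""))), ${needle})`;
  });
  return `FILTER(\n  ${clauses.join(" ||\n  ")}\n)`;
}
```

Note that an empty keyword matches every resource, so the caller may want to skip the filter entirely in that case.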

4. SKOS Subject Hierarchy Management

4.1 SKOS Concept Schema Implementation

SKOS (Simple Knowledge Organization System) was adopted to manage subject hierarchies.

interface SKOSConcept {
  uri: string;
  prefLabel: string;
  broader?: string;  // Broader concept
  narrower?: string[]; // Narrower concepts
}

SPARQL query for SKOS registration:

PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

INSERT DATA {
  GRAPH <https://api.rdm.nii.ac.jp/v2/nodes/{projectId}/> {
    <{conceptUri}> a skos:Concept ;
      skos:prefLabel "{label}"@ja ;
      skos:broader <{broaderUri}> .
  }
}

4.2 Retrieving Hierarchy and Using Subject Labels

The unified graph makes it easy to query the relationship between metadata and subjects:

PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX dc: <http://purl.org/dc/elements/1.1/>

SELECT ?concept ?label ?broader ?broaderLabel
FROM <https://api.rdm.nii.ac.jp/v2/nodes/{projectId}/>
WHERE {
  ?concept a skos:Concept ;
    skos:prefLabel ?label .
  OPTIONAL {
    ?concept skos:broader ?broader .
    ?broader skos:prefLabel ?broaderLabel .
  }
}
ORDER BY ?label
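
The query returns flat rows, one per concept and broader pair, so the hierarchy still has to be assembled on the client. A minimal sketch of that step follows, assuming the rows have already been reduced to plain strings; ConceptRow, ConceptNode, and buildConceptTree are hypothetical names. When a concept has several broader concepts, only the first row is used, and a concept whose broader concept is not in the result set is treated as a root.

```typescript
interface ConceptRow {
  concept: string;  // concept URI
  label: string;    // skos:prefLabel
  broader?: string; // URI of the broader concept, if any
}

interface ConceptNode {
  uri: string;
  prefLabel: string;
  children: ConceptNode[];
}

// Assemble flat (concept, broader) rows into a forest of concept trees.
export function buildConceptTree(rows: ConceptRow[]): ConceptNode[] {
  // First pass: one node per distinct concept.
  const nodes = new Map<string, ConceptNode>();
  for (const row of rows) {
    if (!nodes.has(row.concept)) {
      nodes.set(row.concept, { uri: row.concept, prefLabel: row.label, children: [] });
    }
  }

  // Second pass: attach each node to its parent, or to the root list.
  const roots: ConceptNode[] = [];
  const attached = new Set<string>();
  for (const row of rows) {
    if (attached.has(row.concept)) continue; // ignore additional broader rows
    const node = nodes.get(row.concept);
    if (!node) continue;
    const parent = row.broader ? nodes.get(row.broader) : undefined;
    if (parent && parent !== node) {
      parent.children.push(node);
    } else {
      roots.push(node);
    }
    attached.add(row.concept);
  }
  return roots;
}
```

The sketch does not detect cycles in skos:broader; concepts that form a loop end up with no root and would need separate handling.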

5. Project RDF Export Feature

5.1 Retrieving All Project RDF Data

The unified graph allows exporting all data (metadata, SKOS subjects, profiles) with a single CONSTRUCT query.

export async function GET(
  request: NextRequest,
  { params }: { params: { id: string } }
) {
  const projectId = params.id;
  const graphUri = `https://api.rdm.nii.ac.jp/v2/nodes/${projectId}/`;

  // Retrieve RDF graph with CONSTRUCT query (no UNION needed)
  const query = `
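    PREFIX dc: <http://purl.org/dc/elements/1.1/>
    PREFIX dcterms: <http://purl.org/dc/terms/>
    PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
    PREFIX sh: <http://www.w3.org/ns/shacl#>

    CONSTRUCT {
      ?s ?p ?o
    }
    WHERE {
      GRAPH <${graphUri}> {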
        ?s ?p ?o
      }
    }
  `;

  const response = await fetch(
    `https://dydra.com/${account}/${repository}/sparql`,
    {
      method: 'POST',
      headers: {
        'Accept': 'text/turtle',
        'Authorization': `Bearer ${process.env.DYDRA_API_TOKEN}`,
      },
      body: new URLSearchParams({ query }),
    }
  );

  const rdfData = await response.text();

  return new NextResponse(rdfData, {
    headers: {
      'Content-Type': 'text/turtle; charset=utf-8',
      'Content-Disposition': `attachment; filename="project_${projectId}.ttl"`,
    },
  });
}

6. UI Component Design

6.1 Metadata Editor

A form component that allows editing all 15 Dublin Core elements is implemented.

const MetadataEditor = ({ fileId, projectId }: Props) => {
  const [metadata, setMetadata] = useState<DublinCoreMetadata>({
    title: '',
    creator: '',
    subject: '',
    description: '',
    // ... other fields
  });

  const handleSave = async () => {
    const response = await fetch('/api/metadata/register', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        ...metadata,
        fileId,
        projectId,
      }),
    });

    if (response.ok) {
      alert('Metadata registered successfully');
    }
  };

  return (
    <div className="space-y-4">
      {/* 15 Dublin Core input fields */}
      <input
        value={metadata.title}
        onChange={(e) => setMetadata({ ...metadata, title: e.target.value })}
        placeholder="Title"
      />
      {/* ... */}
      <button onClick={handleSave}>Save</button>
    </div>
  );
};

6.2 Search Interface

Both keyword search and field-specific search are supported.

const SearchInterface = ({ projectId }: Props) => {
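  const [keyword, setKeyword] = useState('');
  // Search results; setResults is called in handleSearch below
  const [results, setResults] = useState<unknown[]>([]);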
  const [fieldSearch, setFieldSearch] = useState({
    title: '',
    creator: '',
    subject: '',
    // ...
  });

  const handleSearch = async () => {
    const response = await fetch(
      `/api/projects/${projectId}/search`,
      {
        method: 'POST',
        body: JSON.stringify({ keyword, ...fieldSearch }),
      }
    );
    const results = await response.json();
    setResults(results);
  };

  return (
    <div>
      <input
        value={keyword}
        onChange={(e) => setKeyword(e.target.value)}
        placeholder="Search by keyword"
      />
      {/* Field-specific search form */}
      <button onClick={handleSearch}>Search</button>
    </div>
  );
};

7. Security and Performance

7.1 Authentication and Access Control

Session validation on all API endpoints
Automatic access token refresh
Per-project data separation via Named Graphs
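
Named Graphs separate data logically, but every request to Dydra uses the same server-side API token, so the application itself has to decide whether the signed-in user may read a given project graph. One way to do that, sketched below, is to ask GakuNin RDM for the project with the user's own access token before running any SPARQL. This is an assumption about a reasonable design, not something the article states it does; the function names and the status-code interpretation (2xx means readable, 401/403/404 means not) are hypothetical.

```typescript
interface MinimalResponse {
  ok: boolean;
  status: number;
}

type FetchLike = (
  url: string,
  init: { headers: Record<string, string> }
) => Promise<MinimalResponse>;

// Interpret the project lookup: true = readable, false = denied or not found.
// Any other status is treated as an upstream error and thrown.
export function accessDecision(response: MinimalResponse): boolean {
  if (response.ok) return true;
  if ([401, 403, 404].includes(response.status)) return false;
  throw new Error(`Unexpected status ${response.status} from GakuNin RDM`);
}

// Check project access with the user's own token before touching the project graph.
export async function canAccessProject(
  projectId: string,
  accessToken: string,
  fetchImpl: FetchLike
): Promise<boolean> {
  const url = `https://api.rdm.nii.ac.jp/v2/nodes/${encodeURIComponent(projectId)}/`;
  const response = await fetchImpl(url, {
    headers: { Authorization: `Bearer ${accessToken}` },
  });
  return accessDecision(response);
}
```

An API route would call canAccessProject(projectId, session.accessToken, fetch) and return 403 when it resolves to false.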

7.2 Performance Optimization

Server-side rendering with Next.js App Router
SPARQL query optimization (using OPTIONAL clauses)
Preemptive token refresh (5 minutes before expiration)

8. Deployment and Infrastructure

8.1 Deploying to Vercel

# Environment variable configuration
NEXT_PUBLIC_SITE_URL=https://your-domain.com
NEXTAUTH_URL=https://your-domain.com
NEXTAUTH_SECRET=your-secret
GAKUNIN_CLIENT_ID=your-client-id
GAKUNIN_CLIENT_SECRET=your-client-secret
DYDRA_ACCOUNT=your-account
DYDRA_REPOSITORY=your-repository
DYDRA_API_TOKEN=your-api-token

# Deploy
vercel --prod

8.2 GakuNin RDM OAuth Configuration

Redirect URI registration:

https://your-domain.com/api/auth/callback/gakunin

Summary

This system solved the following technical challenges:

OAuth 2.0 authentication implementation - Secure integration with GakuNin RDM
Automatic token refresh - Maintaining long-running sessions
Data separation with Named Graphs - Per-project management
Private RDF data management - Access control via API tokens
Full Dublin Core support - Standard metadata schema implementation
SKOS subject hierarchy management - Structured subject classification
SPARQL search - Flexible metadata search

This system contributes to the promotion of open science as a practical solution for research data management.

References

Source Code

The complete source code is available at the following repository: https://github.com/nakamura196/next-dydra
