Overview

This memo records what I found when trying out py-resourcesync, a Python library for ResourceSync.

https://github.com/resourcesync/py-resourcesync

Setup

git clone https://github.com/resourcesync/py-resourcesync
cd py-resourcesync
python setup.py install

Execution

resourcelist

First, create the output directory, resource_dir. The commands below create a folder named ex_resource_dir in the current directory (the leading ! runs a shell command from a Jupyter notebook).

resource_dir = "ex_resource_dir"
!mkdir -p $resource_dir
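
Outside a notebook the ! shell escape is not available. A minimal sketch of the same step in plain Python, assuming only the standard library and the same ex_resource_dir name used above:

```python
from pathlib import Path

resource_dir = "ex_resource_dir"

# Equivalent to `mkdir -p`: creates any missing parent directories and
# does not raise an error if the directory already exists.
Path(resource_dir).mkdir(parents=True, exist_ok=True)
```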

Next, run the following. In practice you would substitute your own generator, but here the sample EgGenerator included with the library is used.

from resourcesync.resourcesync import ResourceSync
# from my_generator import MyGenerator
from resourcesync.generators.eg_generator import EgGenerator
my_generator = EgGenerator()

metadata_dir = "ex_metadata_dir" # Change as appropriate.
rs = ResourceSync(strategy=0,
                  resource_dir=resource_dir,
                  metadata_dir=metadata_dir)
rs.generator = my_generator
rs.execute()

As a result, .well-known, capabilitylist.xml, and resourcelist_0000.xml are created in ex_resource_dir/ex_metadata_dir. The contents of capabilitylist.xml and resourcelist_0000.xml, in that order, are as follows.

<?xml version='1.0' encoding='UTF-8'?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
    xmlns:rs="http://www.openarchives.org/rs/terms/">
    <rs:ln href="http://www.example.com/.well-known/resourcesync" rel="up" />
    <rs:md capability="capabilitylist" />
    <url>
        <loc>http://www.example.com/metadata_dir/resourcelist_0000.xml</loc>
        <rs:md capability="resourcelist" />
    </url>
</urlset>
<?xml version='1.0' encoding='UTF-8'?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
    xmlns:rs="http://www.openarchives.org/rs/terms/">
    <rs:ln href="http://www.example.com/metadata_dir/capabilitylist.xml" rel="up" />
    <rs:md at="2022-11-21T00:13:46Z" capability="resourcelist" completed="2022-11-21T00:13:46Z" />
    <url>
        <loc>http://www.resourcesync.org</loc>
        <lastmod>2016-10-01T00:00:00Z</lastmod>
        <rs:md hash="md5:cc9895a21e335bbe66d61f2b62ce3a8e" length="20" type="application/xml" />
    </url>
</urlset>
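
To check what a generated resource list contains, the XML can be read back with Python's standard library. The following is a minimal sketch, not part of py-resourcesync: the function name parse_resource_list and the namespace prefixes sm and rs are my own choices, and the sample document is the resourcelist_0000.xml output shown above.

```python
import xml.etree.ElementTree as ET

# Namespace prefixes are arbitrary labels chosen for this sketch.
NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "rs": "http://www.openarchives.org/rs/terms/",
}

# The resourcelist_0000.xml output from above, embedded for a self-contained example.
SAMPLE_RESOURCELIST = b"""<?xml version='1.0' encoding='UTF-8'?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
    xmlns:rs="http://www.openarchives.org/rs/terms/">
    <rs:ln href="http://www.example.com/metadata_dir/capabilitylist.xml" rel="up" />
    <rs:md at="2022-11-21T00:13:46Z" capability="resourcelist" completed="2022-11-21T00:13:46Z" />
    <url>
        <loc>http://www.resourcesync.org</loc>
        <lastmod>2016-10-01T00:00:00Z</lastmod>
        <rs:md hash="md5:cc9895a21e335bbe66d61f2b62ce3a8e" length="20" type="application/xml" />
    </url>
</urlset>
"""


def parse_resource_list(xml_bytes):
    """Return (capability, resources) for a ResourceSync sitemap document."""
    root = ET.fromstring(xml_bytes)

    # The top-level rs:md element names the capability of the document.
    top_md = root.find("rs:md", NS)
    capability = top_md.get("capability") if top_md is not None else None

    resources = []
    for url in root.findall("sm:url", NS):
        md = url.find("rs:md", NS)  # per-resource metadata; may be absent
        length = md.get("length") if md is not None else None
        resources.append({
            "loc": url.findtext("sm:loc", namespaces=NS),
            "lastmod": url.findtext("sm:lastmod", namespaces=NS),
            "hash": md.get("hash") if md is not None else None,
            "length": int(length) if length is not None else None,
        })
    return capability, resources


capability, resources = parse_resource_list(SAMPLE_RESOURCELIST)
print(capability)
for resource in resources:
    print(resource["loc"], resource["hash"], resource["length"])
```

To read a real output file instead of the embedded sample, the bytes of resourcelist_0000.xml can be passed to the same function.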

changelist

Setting strategy to 1 produces a new_changelist, and setting it to 2 produces an inc_changelist.
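
Bare numbers are easy to mix up, so it can help to give them names. The sketch below defines a hypothetical Strategy enum of my own (it is not imported from py-resourcesync) that simply mirrors the numbers mentioned in this memo, including the two that remain unverified.

```python
from enum import IntEnum


class Strategy(IntEnum):
    """Hypothetical names for the strategy numbers used in this memo."""
    resourcelist = 0
    new_changelist = 1
    inc_changelist = 2
    resourcedump = 3  # not verified in this memo
    changedump = 4    # not verified in this memo


def strategy_value(name):
    """Return the integer for a strategy name, e.g. "new_changelist"."""
    return int(Strategy[name])


for s in Strategy:
    print(f"strategy={int(s)}: {s.name}")
```

A value such as strategy_value("new_changelist") could then be passed as the strategy argument in the ResourceSync call shown earlier, in place of the literal number.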

resourcedump and changedump

It appears that setting strategy to 3 produces a resourcedump and setting it to 4 produces a changedump, but I could not fully work out how to configure them, so both remain unverified.

Summary

Trying out the library was helpful for investigating how ResourceSync can be implemented.

I hope this serves as a useful reference for others as well.