This article was co-written with a generative AI. Facts have been cross-checked against official documentation where possible, but errors may remain. Please verify important details with primary sources before acting on them.

Account names and IDs are written as placeholders. Examples:

  • Cloudflare account / Workers subdomain: <cf-subdomain> (so the Worker URL is https://jsonkeeper.<cf-subdomain>.workers.dev)
  • Firebase project ID: <firebase-pid>
  • D1 database ID: <d1-database-id> (issued by wrangler d1 create)

Replace each with your own values when following along.

A companion article — Deploying JSONkeeper on PythonAnywhere with HTTP API Only — walks through running upstream IllDepence/JSONkeeper on PythonAnywhere's free Beginner plan, with modern-Python compatibility patches. This one covers the same problem space (the save backend for the IIIF Curation Viewer) but instead of running the upstream Flask app, rewrites it from scratch on Cloudflare Workers + D1.

The source is at nakamura196/jsonkeeper-workers (MIT) — refer to it alongside this article.

I run the two side-by-side: the PA install as a "cold-storage" upstream-faithful reference, and this Workers + D1 service as the hot path. The articles are deliberately independent — you can read either one in isolation — though there's a comparison table near the end of this one.

TL;DR

  • 360 lines of TypeScript under src/: index.ts (191) + auth.ts (86) + curation.ts (57) + activitystreams.ts (26). Hono as the router, jose for JWT verification (the project pins v5.9; the API used here is fully compatible with the current v6.x as well).
  • No Firebase Admin SDK. Firebase ID tokens (RS256) are verified against x509 public keys fetched from securetoken.googleapis.com, checking iss = https://securetoken.google.com/<pid> and aud = <pid> through jose's jwtVerify. No service-account key file is required. Note that Admin SDK's verifyIdToken(idToken, true) adds token-revocation checks (tokensValidAfterTime) and disabled-user checks that require Auth REST API round-trips — jose alone cannot reproduce those.
  • Storage is D1 (SQLite-on-edge). One documents table, four indexes (owner_uid, jsonld_type, created_at, unlisted).
  • JSON-LD @id rewriting handles both the top level and nested nodes: a Curation document's top-level @id becomes the storage URL, and nested matching nodes (e.g. Range nodes) get fragment identifiers (<docUrl>#frag-N) reattached recursively.
  • Free tier: 100k Worker requests/day + D1 5 GB / 5 M reads / 100 k writes per day. For Viewer-scale traffic this is effectively unlimited.
  • CORS is configured via Hono's cors() middleware with explicit allowance of X-Firebase-ID-Token, X-Access-Token, X-Unlisted, and Location is exposed so the Viewer can read it.
  • Deployment is four wrangler commands: wrangler login → wrangler d1 create jsonkeeper → write the returned database_id into wrangler.toml → wrangler d1 migrations apply jsonkeeper --remote → wrangler deploy.

Starting point

Item | Value
Upstream implementation | IllDepence/JSONkeeper (Python 3 + Flask + SQLAlchemy + firebase_admin); the same codebase that was operated at mp.ex.nii.ac.jp/api/curation/json
Alternative path I also built | The same upstream running on PythonAnywhere's free Beginner plan (separate article)
Issues with the PA path | (1) 100 CPU-seconds/day on the Beginner plan, hit often by pip install during dependency updates; (2) ~200 ms RTT from Japan (PA web workers run in EU/US regions); (3) the upstream's dependency stack (Flask 1.0 → 2.0 pin, firebase_admin, google-cloud-*) is heavy and accrues maintenance debt over time
What the Viewer actually exercises | POST/GET/PUT/DELETE, CORS, the Location response header, automatic JSON-LD @id rewriting, and X-Firebase-ID-Token verification — that's it

The last row was the call. The Viewer's export workflow uses only a small subset of JSONkeeper's features. Specifically, it never touches X-Unlisted, the /<id>/status PATCH endpoint, the Activity Stream change-discovery pagination, garbage collection, or the per-Range sub-URLs. Given that, the simpler long-term path is to implement just the interface the Viewer needs rather than keep the full upstream alive.

Landing state

Item | Value
URL | https://jsonkeeper.<cf-subdomain>.workers.dev/ (deployed as of this writing)
Language / framework | TypeScript + Hono on the Cloudflare Workers runtime
Authorization | Firebase ID tokens → RS256 verification via jose (Google x509 public keys fetched + cached)
Storage | D1 (SQLite-on-edge), single documents table
Endpoints | GET /, POST /api, GET /api/userdocs, GET / PUT / DELETE /api/:id, GET /as/collection.json
Upstream parity | △ The slice the Viewer touches is complete. Upstream's X-Unlisted / /<id>/status PATCH / Activity Stream pagination / GC / Range sub-URLs are intentionally omitted
Code size | 360 TypeScript lines; runtime dependencies are hono + jose only

1. Why Cloudflare Workers + D1

For "the Viewer's save backend" specifically, two options were on the table: PythonAnywhere with upstream Flask, and Cloudflare Workers + D1 with a custom rewrite. The summary comparison (the detailed version is at the end of this article):

Axis | PA + Flask (upstream) | Workers + D1 (this article)
Practical free quota | 100 CPU-sec/day (web-request handling is exempt, but dependency updates burn through it) | 100k requests/day + D1 5 GB / 5M reads / 100k writes per day
Latency from Japan | ~200 ms (PA serves from EU/US regions) | 10–30 ms (Cloudflare edges in Tokyo/Osaka)
Cold start | Possible after long idle | Always warm at the edge
Outbound network | Whitelisted | Unrestricted
Durability | PA SQLite file (account-bound) | D1 (Cloudflare-managed, replicated)
Upstream parity | ◎ Full | △ Viewer-needed surface only
Maintenance debt | Flask 1.0 → 2.0 pin, firebase_admin, google-cloud-* | hono + jose only

Once it became clear that "Viewer-needed surface only" was an acceptable cut, the lower-ops side (Workers + D1) won. Given the open-ended timeline, the deciding criterion was which option is easier to leave running, untouched, for months at a time.

2. Scoping — which features to implement

Implementing every upstream feature would be overkill for what the Viewer's export plugin (icv.exportJsonKeeper.js) actually calls. The cut:

Feature | Upstream | Workers version | Decision
POST /api (create) | ✓ | ✓ | Required
GET /api/:id | ✓ | ✓ | Required
PUT /api/:id (overwrite) | ✓ | ✓ | Required (edit flow)
DELETE /api/:id | ✓ | ✓ | Required
GET /api/userdocs (your documents) | ✓ | ✓ | Adopted (Viewer's "my Curations" list may call this)
Location response header | ✓ | ✓ | Required (the Viewer redirects to ?curation=<Location>)
X-Firebase-ID-Token authorization | ✓ | ✓ | Required
Anonymous POST | ✓ | ✓ | Adopted (Viewer's allowAnonymousPost: true configuration)
JSON-LD @id rewriting (top + nested) | ✓ | ✓ | Required (nested Range nodes inside a cr:Curation recursively)
X-Unlisted: true on POST only (PUT inherits the existing value) | ✓ | △ DB column present, logic not wired | Postponed
/<id>/status PATCH | ✓ | ✗ | Out of scope (Viewer doesn't call it)
Activity Stream Collection | ✓ (paginated) | △ Single page only | Scope-reduced
Garbage collection | ✓ | ✗ | Out of scope (a Workers cron trigger would be the equivalent; add later if needed)
Range sub-URL (/<id>/range<n>) | ✓ | ✗ | Out of scope (Viewer doesn't use it)

The result of cutting at "necessary and sufficient for the Viewer" was 360 lines.

3. Project layout

jsonkeeper-workers/
├── package.json              ; two deps: hono, jose
├── tsconfig.json             ; strict, ES2022, @cloudflare/workers-types
├── wrangler.toml             ; D1 binding, env vars
├── migrations/
│   ├── 0001_init.sql         ; documents table + indexes
│   └── 0002_unlisted.sql     ; adds the unlisted column
├── src/
│   ├── index.ts              ; Hono router (191 lines)
│   ├── auth.ts               ; Firebase JWT verification (86 lines)
│   ├── curation.ts           ; JSON-LD @id rewriting (57 lines)
│   └── activitystreams.ts    ; AS Collection builder (26 lines)
└── test/
    └── smoke.sh              ; four-case smoke test

package.json (excerpt):

{
  "name": "jsonkeeper-workers",
  "type": "module",
  "scripts": {
    "dev": "wrangler dev",
    "deploy": "wrangler deploy",
    "typecheck": "tsc --noEmit",
    "db:create": "wrangler d1 create jsonkeeper",
    "db:migrate:local": "wrangler d1 migrations apply jsonkeeper --local",
    "db:migrate:remote": "wrangler d1 migrations apply jsonkeeper --remote",
    "tail": "wrangler tail"
  },
  "dependencies": {
    "hono": "^4.6.14",
    "jose": "^5.9.6"
  },
  "devDependencies": {
    "@cloudflare/workers-types": "^4.20241218.0",
    "typescript": "^5.7.2",
    "wrangler": "^3.99.0"
  }
}

wrangler.toml:

name = "jsonkeeper"
main = "src/index.ts"
compatibility_date = "2024-12-01"

[vars]
FIREBASE_PROJECT_ID = "<firebase-pid>"
REWRITE_TYPES = "http://codh.rois.ac.jp/iiif/curation/1#Curation,http://iiif.io/api/presentation/2#Range"

[[d1_databases]]
binding = "DB"
database_name = "jsonkeeper"
database_id = "<d1-database-id>"
migrations_dir = "migrations"

[observability]
enabled = true

The two [vars] entries are not secrets so they live in wrangler.toml directly. FIREBASE_PROJECT_ID is used only to compute iss/aud for JWT verification — leaking it does not weaken the auth gate. This is the moral equivalent of the upstream's config.ini reduced to two vars.
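
For reference, a sketch of the bindings type the Worker sees for these entries (names mirror wrangler.toml above; the actual type declaration in the repo's src/ may differ):

// Sketch only: the shape of the env bindings produced by wrangler.toml above.
// The repo's own Env/Bindings type may be named or organized differently.
type Env = {
  DB: D1Database;               // [[d1_databases]] binding = "DB"
  FIREBASE_PROJECT_ID: string;  // [vars]
  REWRITE_TYPES: string;        // [vars], comma-separated list of @type URIs to rewrite
};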

tsconfig.json enables strict, noUnusedLocals, and noImplicitAny; @cloudflare/workers-types is the only types entry, and Node-standard globals are intentionally not pulled in.

4. Decision 1 — authorization: don't use the Firebase Admin SDK

This is the single biggest design choice. Upstream calls firebase_admin.auth.verify_id_token(idtoken), but Firebase Admin SDK's Python implementation cannot run on the Workers runtime (no Python). The replacement:

  • Use jose for RS256 verification
  • Fetch the x509-formatted public-key set from https://www.googleapis.com/robot/v1/metadata/x509/securetoken@system.gserviceaccount.com
  • Pass iss = https://securetoken.google.com/<firebase-pid> and aud = <firebase-pid> to jwtVerify

This is functionally equivalent to verifyIdToken(idToken) with checkRevoked=false (the default). The service-account key file becomes a non-issue — on the PA side you'd put firebase-adminsdk.json on the server and design a rotation procedure for it; on Workers you skip that entire concern.

Caveat: Admin SDK's verifyIdToken(idToken, true) (checkRevoked=true) does extra work — namely the token-revocation check (tokensValidAfterTime comparison) and disabled-user check (user.disabled) — that requires a service-account-authenticated request to the Identity Toolkit API. jose alone cannot reproduce those. For the Curation-save use case the worst-case slip is "a session that logged out 30 minutes ago can still write until the token's natural expiration (≤1 h)", which I consider acceptable. If your use case can't tolerate that, you'll need to add a separate Firebase REST API call or proxy through another runtime that has the Admin SDK available.

The implementation lives in src/auth.ts. The core:

import { decodeProtectedHeader, importX509, jwtVerify } from 'jose';

const GOOGLE_X509_URL =
  'https://www.googleapis.com/robot/v1/metadata/x509/securetoken@system.gserviceaccount.com';

type JwkCache = { keys: Record<string, CryptoKey>; expiresAt: number };
let jwkCache: JwkCache | null = null;

async function getSigningKey(kid: string): Promise<CryptoKey> {
  if (!jwkCache || Date.now() >= jwkCache.expiresAt) {
    const res = await fetch(GOOGLE_X509_URL);
    if (!res.ok) throw new Error(`JWK fetch failed: ${res.status}`);
    const certs = (await res.json()) as Record<string, string>;
    const maxAge = parseMaxAge(res.headers.get('cache-control')) ?? 3600;
    const keys: Record<string, CryptoKey> = {};
    for (const [k, pem] of Object.entries(certs)) {
      keys[k] = (await importX509(pem, 'RS256')) as CryptoKey;
    }
    jwkCache = { keys, expiresAt: Date.now() + maxAge * 1000 };
  }
  const key = jwkCache.keys[kid];
  if (!key) { jwkCache = null; throw new Error(`Unknown signing kid: ${kid}`); }
  return key;
}

export async function verifyFirebaseIdToken(
  token: string,
  projectId: string,
): Promise<{ uid: string; email?: string }> {
  const header = decodeProtectedHeader(token);
  if (header.alg !== 'RS256') throw new Error(`Unexpected alg: ${header.alg}`);
  if (!header.kid) throw new Error('Missing kid');

  const key = await getSigningKey(header.kid);
  const { payload } = await jwtVerify(token, key, {
    issuer: `https://securetoken.google.com/${projectId}`,
    audience: projectId,
    algorithms: ['RS256'],
  });
  if (!payload.sub) throw new Error('Missing sub');
  return { uid: payload.sub, email: payload.email as string | undefined };
}

A few things about the public-key cache:

  • A single module-scoped jwkCache. Workers isolates handle "consecutive requests from the same edge location" with reasonable affinity, so an isolate-local in-memory cache works as a soft per-edge cache. It's not strict — multiple isolates each warm independently — but it's vastly cheaper than fetching per request.
  • Respect Cache-Control: max-age. Google's x509 endpoint returns a sensible max-age; the cache expiry uses that value with a 1-hour fallback. This automatically follows key rotation.
  • Drop the cache if kid isn't found. When a new key rotates in, the next request re-fetches, and recovery is automatic.
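
The parseMaxAge helper referenced in the excerpt isn't shown above; a minimal version consistent with that usage (Cache-Control header string in, seconds out) might look like this — the repo's implementation may differ:

// Sketch: pull the max-age directive (seconds) out of a Cache-Control header.
// Returns undefined when the header is absent or carries no max-age.
function parseMaxAge(header: string | null): number | undefined {
  if (!header) return undefined;
  const match = /max-age=(\d+)/i.exec(header);
  return match ? Number(match[1]) : undefined;
}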

The middleware is short:

import type { Context, MiddlewareHandler } from 'hono';

// AuthEnv (declared elsewhere in auth.ts) carries FIREBASE_PROJECT_ID in Bindings
// and the uid / email variables set below.
export function authMiddleware(opts: { required: boolean }): MiddlewareHandler<AuthEnv> {
  return async (c, next) => {
    const token = extractToken(c);
    if (!token) {
      if (opts.required) return c.json({ error: 'Authentication required' }, 401);
      return next();
    }
    try {
      const { uid, email } = await verifyFirebaseIdToken(token, c.env.FIREBASE_PROJECT_ID);
      c.set('uid', uid);
      if (email) c.set('email', email);
    } catch (e) {
      if (opts.required) return c.json({ error: 'Invalid token', detail: String(e) }, 401);
    }
    return next();
  };
}

function extractToken(c: Context): string | undefined {
  const direct = c.req.header('X-Firebase-ID-Token') ?? c.req.header('x-firebase-id-token');
  if (direct) return direct;
  const auth = c.req.header('Authorization') ?? c.req.header('authorization');
  if (!auth) return undefined;
  const [scheme, value] = auth.split(/\s+/, 2);
  return scheme?.toLowerCase() === 'bearer' ? value : undefined;
}

X-Firebase-ID-Token (the header the Viewer's plugin actually sends) is the primary; Authorization: Bearer ... (the standard header) is supported as a fallback. Routes mark themselves { required: true } (PUT/DELETE) or { required: false } (POST).
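
Wired into the router, that looks roughly like this (handler names here are placeholders, not the repo's actual identifiers):

// Sketch of the route/auth wiring; createHandler etc. are hypothetical names.
app.post('/api', authMiddleware({ required: false }), createHandler);      // anonymous POST allowed
app.get('/api/:id', getHandler);                                           // read; no auth middleware in this sketch
app.put('/api/:id', authMiddleware({ required: true }), updateHandler);    // owner only
app.delete('/api/:id', authMiddleware({ required: true }), deleteHandler); // owner only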

5. Decision 2 — storage: lean on D1

D1 is Cloudflare's SQLite-as-a-service. Migration file conventions and wrangler d1 migrations apply work much like Rails or Django.

migrations/0001_init.sql:

CREATE TABLE IF NOT EXISTS documents (
  id            TEXT PRIMARY KEY,
  json          TEXT NOT NULL,
  owner_uid     TEXT,
  content_type  TEXT NOT NULL DEFAULT 'application/json',
  jsonld_type   TEXT,
  created_at    INTEGER NOT NULL,
  updated_at    INTEGER NOT NULL
);

CREATE INDEX IF NOT EXISTS idx_documents_owner   ON documents(owner_uid);
CREATE INDEX IF NOT EXISTS idx_documents_type    ON documents(jsonld_type);
CREATE INDEX IF NOT EXISTS idx_documents_created ON documents(created_at DESC);

migrations/0002_unlisted.sql:

ALTER TABLE documents ADD COLUMN unlisted INTEGER NOT NULL DEFAULT 0;
CREATE INDEX IF NOT EXISTS idx_documents_unlisted ON documents(unlisted);

Schema decisions:

  • The json column stores the entire stringified JSON. D1 has JSON functions, but the Viewer only ever reads and writes whole documents, so parsing on the TypeScript side is enough.
  • A separate jsonld_type column lets the Activity Stream's filter run through an index. WHERE jsonld_type IN (...) never has to do a full table scan.
  • owner_uid is nullable to accommodate anonymous POSTs (allowAnonymousPost: true). When it's NULL the Workers code refuses PUT and DELETE — anonymous documents are effectively immutable.
  • created_at / updated_at are UNIX-seconds integers — smaller than ISO strings, sortable as-is, and easy to ISO-format with new Date(n * 1000).
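
The idx_documents_owner index is what GET /api/userdocs leans on. A sketch of that handler (the response envelope here is illustrative; see src/index.ts for the real one):

// Sketch: list the caller's own documents via the owner_uid index.
// Field selection matches the article (jsonld_type / created_at / updated_at);
// the response shape is illustrative.
app.get('/api/userdocs', authMiddleware({ required: true }), async (c) => {
  const { results } = await c.env.DB.prepare(
    `SELECT id, jsonld_type, created_at, updated_at
     FROM documents WHERE owner_uid = ? ORDER BY created_at DESC`,
  ).bind(c.get('uid')).all();
  return c.json({ documents: results });
});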

The POST /api handler (src/index.ts:50-85):

app.post('/api', authMiddleware({ required: false }), async (c) => {
  const raw = await c.req.text();
  let parsed: unknown;
  try { parsed = JSON.parse(raw); }
  catch { return c.json({ error: 'Invalid JSON' }, 400); }

  const id = crypto.randomUUID();
  const origin = serverOrigin(c.req.url);
  const docUrl = `${origin}/api/${id}`;
  const types = rewriteTypes(c.env);
  const matched = detectTopLevelType(parsed as never, types);
  const stored = matched ? rewriteIds(parsed as never, types, docUrl) : parsed;

  const now = Math.floor(Date.now() / 1000);
  const uid = c.get('uid') ?? null;
  await c.env.DB.prepare(
    `INSERT INTO documents (id, json, owner_uid, content_type, jsonld_type, created_at, updated_at)
     VALUES (?, ?, ?, ?, ?, ?, ?)`,
  ).bind(id, JSON.stringify(stored), uid, c.req.header('Content-Type') ?? 'application/json',
         matched, now, now).run();

  c.header('Location', docUrl);
  return c.json(stored as Record<string, unknown>, 201);
});

PUT and DELETE both go through authMiddleware({ required: true }) plus an owner_uid === uid check. Anonymously-POSTed documents have owner_uid IS NULL and stay un-editable from then on — matching the upstream's behavior.
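
A sketch of that owner check on DELETE (status codes and details may differ from the actual handler in src/index.ts):

// Sketch: only the authenticated owner may delete; anonymous documents
// (owner_uid IS NULL) are refused, matching the behavior described above.
app.delete('/api/:id', authMiddleware({ required: true }), async (c) => {
  const id = c.req.param('id');
  const row = await c.env.DB.prepare(
    'SELECT owner_uid FROM documents WHERE id = ?',
  ).bind(id).first<{ owner_uid: string | null }>();
  if (!row) return c.json({ error: 'Not found' }, 404);
  if (!row.owner_uid || row.owner_uid !== c.get('uid')) {
    return c.json({ error: 'Forbidden' }, 403);
  }
  await c.env.DB.prepare('DELETE FROM documents WHERE id = ?').bind(id).run();
  return c.body(null, 204);
});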

6. Decision 3 — JSON-LD @id rewriting

When the Viewer exports a Curation, the JSON-LD it sends has @type cr:Curation at the top level and several Range nodes nested inside. Upstream JSONkeeper rewrites not only the top-level @id but also the nested Range nodes' @id so that each Range gets its own dereferenceable URL — the Viewer's range-highlighting feature breaks without it.

src/curation.ts:

export function rewriteIds(
  doc: JsonValue, rewriteTypes: string[], docUrl: string,
): JsonValue {
  if (!isObject(doc)) return doc;
  const cloned = deepClone(doc);
  let counter = 0;
  const walk = (node: JsonValue, isTop: boolean) => {
    if (Array.isArray(node)) { for (const item of node) walk(item, false); return; }
    if (!isObject(node)) return;
    const types = toArray(node['@type']).filter((t): t is string => typeof t === 'string');
    const matches = types.some((t) => rewriteTypes.includes(t));
    if (matches) {
      node['@id'] = isTop ? docUrl : `${docUrl}#frag-${counter++}`;
    }
    for (const key of Object.keys(node)) {
      if (key === '@type' || key === '@id') continue;
      walk(node[key], false);
    }
  };
  walk(cloned, true);
  return cloned;
}

  • deepClone so the input isn't mutated. The POST response body returns the rewritten JSON, so the same object can't be both serialized to D1 and returned to the client without copying first.
  • isTop distinguishes top-level from nested. The top level gets docUrl verbatim; nested matches get sequential <docUrl>#frag-0, <docUrl>#frag-1, ... Using a fragment lets multiple nodes inside the same document have distinct URLs.
  • @type can be a string or an array in JSON-LD; toArray normalizes both shapes.
  • Skip @type and @id when recursing. Strictly speaking @id is a string so the walk wouldn't enter it anyway, but being explicit avoids surprises.
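
A worked example of what the rewrite produces (input values are illustrative; the Viewer's real Curations carry more fields):

// Sketch: a minimal Curation with two nested Ranges, @id placeholders as the
// Viewer sends them on first save.
const input = {
  '@type': 'http://codh.rois.ac.jp/iiif/curation/1#Curation',
  '@id': 'about:blank',
  selections: [
    { '@type': 'http://iiif.io/api/presentation/2#Range', '@id': 'about:blank' },
    { '@type': 'http://iiif.io/api/presentation/2#Range', '@id': 'about:blank' },
  ],
};

const out = rewriteIds(
  input,
  ['http://codh.rois.ac.jp/iiif/curation/1#Curation', 'http://iiif.io/api/presentation/2#Range'],
  'https://jsonkeeper.<cf-subdomain>.workers.dev/api/<uuid>',
);
// out['@id']                → '.../api/<uuid>'
// out.selections[0]['@id']  → '.../api/<uuid>#frag-0'
// out.selections[1]['@id']  → '.../api/<uuid>#frag-1'
// input itself is left untouched thanks to deepClone.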

Upstream (in jsonkeeper/subroutines.py:handle_incoming_json_ld) takes a different shape: it runs pyld.jsonld.expand on the root element only to check the top-level @type, and when the document is cr:Curation it switches to a hardcoded loop over selections that rewrites each Range's @id to a path-based <docUrl>/range<n> (where n starts at 1) — no type-checking on the nested loop, just a Curation-specific special case (subroutines.py:L226-L233). The Workers version drops the pyld dependency and instead recursively type-checks every node against rewriteTypes, which is a more generic shape at the cost of being less JSON-LD-aware.

The nested @id format also differs: upstream's path-based <docUrl>/range1 vs Workers' fragment-based <docUrl>#frag-0. The path form is closer to truly dereferenceable URLs (a fragment is never sent to the server, so it can't resolve to its own document), but the Workers approach trades that off against the cost of also implementing /api/:id/range1-style sub-route handlers (which the Viewer never calls) — net win for keeping the code size down.

Trade-offs of not using pyld:

  • ✓ Fewer dependencies (only jose apart from hono)
  • ✓ Smaller bundle (the Workers script-size ceiling is 3 MB gzipped on the Free plan, 10 MB gzipped on Paid)
  • ✓ Insensitive to @context resolution failures (e.g. when an external context URL like codh.rois.ac.jp/iiif/curation/1/context.json is unreachable)
  • ✗ Not strictly correct JSON-LD expansion — it identifies @type by string match against the compact form

The last trade-off is fine for this use case: the Viewer always emits compact-form JSON with known @type URIs.

7. Decision 4 — Activity Stream stays minimal

Upstream's Activity Stream conforms to the IIIF Change Discovery API 0.1 (conformance level 2), with OrderedCollection + OrderedCollectionPage pagination and a separate Create / Update / Reference / Offer activity vocabulary. That's a fair amount of code.

The Workers version reduces this to a single OrderedCollection page with Create/Update activities listed flat. The entire src/activitystreams.ts:

export function buildCollection(serverUrl: string, rows: ActivityRow[]) {
  const collectionId = `${serverUrl}/as/collection.json`;
  const items = rows.map((r) => ({
    id: `${collectionId}#activity-${r.id}`,
    type: r.created_at === r.updated_at ? 'Create' : 'Update',
    endTime: new Date(r.updated_at * 1000).toISOString(),
    object: {
      id: `${serverUrl}/api/${r.id}`,
      type: r.jsonld_type ?? undefined,
    },
  }));
  return {
    '@context': 'https://www.w3.org/ns/activitystreams',
    id: collectionId,
    type: 'OrderedCollection',
    totalItems: items.length,
    orderedItems: items,
  };
}

And in src/index.ts:

app.get('/as/collection.json', async (c) => {
  const types = rewriteTypes(c.env);
  const placeholders = types.map(() => '?').join(',') || "''";
  const { results } = await c.env.DB.prepare(
    `SELECT id, jsonld_type, created_at, updated_at
     FROM documents WHERE jsonld_type IN (${placeholders}) ORDER BY created_at DESC LIMIT 5000`,
  ).bind(...types).all<ActivityRow>();
  const collection = buildCollection(serverOrigin(c.req.url), results);
  return new Response(JSON.stringify(collection), {
    status: 200,
    headers: {
      'Content-Type': 'application/activity+json',
      'Access-Control-Allow-Origin': '*',
    },
  });
});

LIMIT 5000 is a hard cap; pagination is not implemented. The Viewer doesn't read this collection, so it doesn't matter for the Viewer's use case. External crawlers that need true change-discovery semantics would require pagination, which is left as a deferred feature — easy to add when there's a real consumer asking for it.

8. CORS and Viewer-compatible headers

Hono's cors() middleware, with the custom headers spelled out:

app.use(
  '*',
  cors({
    origin: '*',
    allowMethods: ['GET', 'POST', 'PUT', 'DELETE', 'OPTIONS'],
    allowHeaders: [
      'Content-Type',
      'Accept',
      'Authorization',
      'X-Firebase-ID-Token',
      'X-Access-Token',
      'X-Unlisted',
    ],
    exposeHeaders: ['Location'],
    maxAge: 86400,
  }),
);

Three points:

  • exposeHeaders: ['Location'] exposes the one response header the Viewer needs to read. Without this entry CORS hides Location from the JS side, and the Viewer's post-save redirect breaks.
  • allowHeaders covers every custom request header the Viewer plugin sends: X-Firebase-ID-Token (auth), X-Access-Token (anonymous tokens), and X-Unlisted (visibility). Listed to match upstream's surface.
  • maxAge: 86400 keeps preflight responses in the browser cache for 24 hours so OPTIONS doesn't fire repeatedly through the edge.
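
From the Viewer's side of the wire, the exposeHeaders entry is what makes this snippet work (a sketch, not the plugin's actual code; idToken and curation stand in for values the plugin already has):

// Sketch of a Viewer-style save call against this Worker.
declare const idToken: string;   // Firebase ID token from the client SDK (assumed)
declare const curation: object;  // the Curation JSON-LD to store (assumed)

const res = await fetch('https://jsonkeeper.<cf-subdomain>.workers.dev/api', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'X-Firebase-ID-Token': idToken,
  },
  body: JSON.stringify(curation),
});
// Readable cross-origin only because the Worker lists Location in exposeHeaders;
// without that entry this returns null in the browser.
const storedUrl = res.headers.get('Location');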

9. Deployment runbook

The steps split between one-time browser actions and the local shell.

9-1. Browser, once:

  1. Create a Cloudflare account (if you don't have one)
  2. From your local shell, run npx wrangler login — a browser opens, you authorize, done

9-2. From your local shell:

# 1. clone + install
git clone https://github.com/nakamura196/jsonkeeper-workers.git
cd jsonkeeper-workers
npm install

# 2. Create the D1 database; copy the printed database_id
npx wrangler d1 create jsonkeeper
# Created database 'jsonkeeper' with id <d1-database-id>

Open wrangler.toml in an editor and paste the database_id into database_id = "...".

# 3. Apply migrations to the remote D1
npx wrangler d1 migrations apply jsonkeeper --remote

# 4. Make sure FIREBASE_PROJECT_ID under [vars] in wrangler.toml
#    matches your Firebase project ID.

# 5. (Optional) Local dev
npx wrangler dev
# → http://127.0.0.1:8787

# 6. Deploy
npx wrangler deploy
# → Uploaded jsonkeeper (X.YZ sec)
# → Published jsonkeeper
# →   https://jsonkeeper.<cf-subdomain>.workers.dev

Your service is now live at https://jsonkeeper.<cf-subdomain>.workers.dev/.

In the Firebase Console, under Authentication → Settings → Authorized domains, add the GitHub Pages domain where you host the Viewer. (This is Viewer-side login configuration, not Workers-side, but it's easy to forget.)

10. Smoke test

test/smoke.sh driven by a BASE environment variable:

$ BASE=https://jsonkeeper.<cf-subdomain>.workers.dev ./test/smoke.sh
1/4 root
{"name":"jsonkeeper-workers","endpoints":["POST /api","GET /api/:id","PUT /api/:id","DELETE /api/:id","GET /api/userdocs","GET /as/collection.json"]}

2/4 POST anonymous
  Location: https://jsonkeeper.<cf-subdomain>.workers.dev/api/c3881e3f-...

3/4 GET back
  Body: {"hello":"workers"}

4/4 POST Curation (JSON-LD @id rewrite)
  Body: {"@type":"http://codh.rois.ac.jp/iiif/curation/1#Curation","@id":"https://jsonkeeper.<cf-subdomain>.workers.dev/api/...","label":"x"}

OK

Case 4 is the important one: the request body's "@id":"about:blank" is rewritten in the response to "@id":"<Location header URL>". The Viewer sends an "about:blank" or similar placeholder @id on first save because the eventual storage URL isn't yet known — so passing this round-trip is what makes the whole Viewer export flow work.

11. Feature-by-feature diff against upstream

Concretely what's there vs. what isn't:

Feature | Upstream | Workers version | Note
POST /api create | ✓ | ✓ | Equivalent
GET /api/:id | ✓ | ✓ | Equivalent
PUT /api/:id (overwrite) | ✓ | ✓ | Equivalent; only owner_uid match allowed
DELETE /api/:id | ✓ | ✓ | Equivalent
GET /api/userdocs | ✓ (configurable userdocs_added_properties) | ✓ (fixed: jsonld_type/created_at/updated_at) | Simplified
X-Firebase-ID-Token auth | ✓ (firebase_admin SDK) | ✓ (jose + Google x509) | No service-account key needed
X-Access-Token (self-managed token) | ✓ | ✗ | The Viewer never sends this. Adding it is ~30 lines if you need it
Anonymous POST | ✓ | ✓ | Equivalent
JSON-LD @id rewrite (top) | ✓ | ✓ | Equivalent
JSON-LD @id rewrite (nested) | ✓ (Curation-only hardcoded loop over selections, produces /range<n>) | ✓ (generic recursive @type check, produces #frag-<n>) | URL form differs (path vs fragment)
X-Unlisted: true | ✓ (POST only; PUT inherits, change via /<id>/status PATCH) | △ (column only) | Logic not wired
/<id>/status PATCH | ✓ | ✗ | Viewer doesn't use it
Activity Stream | ✓ (paginated) | △ (single page) | Simplified
Garbage collection | ✓ (apscheduler) | ✗ | Workers cron triggers can be added; not implemented
Range sub-URL (/<id>/range<n>, n starts at 1) | ✓ | ✗ (Workers uses fragment #frag-<n> instead) | Viewer doesn't use it
CORS | ✓ (* + reflect in preflight) | ✓ (* + explicit allowHeaders) | Equivalent in behavior
Location header | ✓ | ✓ | Equivalent
Durability | SQLite file | D1 (replicated) | Operationally more robust

12. Local vs. production

Copy .dev.vars.example to .dev.vars and run wrangler dev; the local server uses .dev.vars values instead of [vars] from wrangler.toml. D1 runs against a local SQLite file (.wrangler/state/v3/d1/...), so local development never touches production.

cp .dev.vars.example .dev.vars
npx wrangler d1 migrations apply jsonkeeper --local
npx wrangler dev
# in another terminal
BASE=http://127.0.0.1:8787 ./test/smoke.sh

.gitignore includes .dev.vars, .wrangler/, and node_modules/.

13. Rollback (reverting to the upstream URL)

# Delete the Worker
npx wrangler delete

# Delete the D1 database (export first if you want a backup)
npx wrangler d1 export jsonkeeper --output backup.sql
npx wrangler d1 delete jsonkeeper

Restoring the Viewer's curationJsonExportUrl to mp.ex.nii.ac.jp/api/curation/json is a one-line sed (covered in the PA-side article).

14. Why I keep both PA and Workers

  • Workers + D1 is the hot path. Free-tier headroom is enormous, edge latency from Japan is competitive, and the maintenance surface is tiny (hono and jose only). The Viewer-visible behavior is sufficient.
  • The PA install is kept in cold storage because there are scenarios where strict upstream parity (X-Unlisted, the paginated Activity Stream, GC, Range sub-URLs) matters — for example a third-party tool subscribing to JSONkeeper's change feed, or a comparison reference to the original deployment.

When the upstream URL returns, the plan is to retire both and switch the Viewer back to mp.ex.nii.ac.jp. Until then I'd rather have two backends with clear roles than a single one I have to baby-sit.

Closing thoughts

Rewriting JSONkeeper on Cloudflare Workers + D1 came down to selecting the Viewer-touching subset and reimplementing it in 360 lines. A few things stand out from the exercise:

  • The Firebase Admin SDK is, at heart, a JWT-verification routine. With jose and a fetched x509 set you can match its default behavior; only revocation and disabled-user checks remain outside the reach of a Worker (and those need an Auth REST API round-trip). For most write-side use cases this is acceptable, and getting rid of the service-account key file is a real operational win.
  • Upstream's nested-@id rewrite is a hardcoded traversal of selections, not pyld-driven recursion. The Workers version uses a more general @type walk that's strictly simpler. Either approach satisfies the Viewer.
  • D1 is a clean SQLite/SQLAlchemy replacement. Migration file format and wrangler d1 migrations apply slot in without surprises.
  • Workers' free quota (100k req/day + D1 5 GB) is effectively unbounded for an IIIF Curation back-of-house. The interim-mirror traffic scale we're handling is so far below the cap that it's not worth measuring.

As noted above, both the PA and Workers deployments will be retired once the upstream URL returns, and the Viewer will point at mp.ex.nii.ac.jp again. Until then, the Workers version is the one I expect to be running, and PA is the upstream-faithful reference next to it.

If you're in the same situation and want a save side for IIIF Curation tooling that can sit untouched for months, I hope this is useful.