This article was co-written with a generative AI. I cross-checked the facts against primary sources where I could, but errors may remain. Please consult the primary sources before relying on it for important decisions.
What I did
IIIF 3D Viewer is a Next.js app for viewing glTF/GLB 3D models and their annotations as delivered by IIIF Manifests. Until now it ran on IIIF Presentation API 3.0 manifests with a project-local extension — a custom 3DSelector for points / polygons and a camPos field on the selector for the recommended camera state.
I aligned the viewer with the IIIF 3D Technical Specification Group draft (Presentation API 4 / temp-draft-4). Concretely:
- The internal pipeline now treats Scene / PointSelector / WKTSelector / PerspectiveCamera (the shapes used by the TSG examples) as the canonical v4 form
- Legacy Presentation 3 + custom-extension manifests are converted to v4 in-memory at the fetch boundary
- The bundled manifests/*.json samples are rewritten in v4 form
Presentation API 4 is still a draft, but matching the structure used by the TSG examples (e.g. 9_commenting_annotations) is a small step toward future standardisation and interoperability with other viewers.
Legacy form vs. v4 draft
For reference, here is the diff between the two shapes.
Manifest skeleton
| Aspect | Legacy (P3 + custom) | v4 / 3D TSG |
|---|---|---|
| @context | http://iiif.io/api/presentation/3/context.json | http://iiif.io/api/presentation/4/context.json |
| Top-level child | Manifest → items → Canvas | Manifest → items → Scene |
| Model placement | painting Annotation attached to Canvas (no spatial position) | painting Annotation + PointSelector on the Scene |
| Camera | selector.camPos inside the annotation | standalone PerspectiveCamera Annotation |
| Polygon | custom area: [x,y,z,...] flat array | WKTSelector with POLYGON Z((...)) |
| Georeferencing | custom GeoJSON-T (motivation: "georeferencing") | not in v4 (kept as project extension) |
Commenting annotation (point)
Legacy:
{
  "motivation": "commenting",
  "body": { "value": "<p></p>", "label": "North Sea", "type": "TextualBody" },
  "target": {
    "source": ".../canvas-p1",
    "selector": {
      "type": "3DSelector",
      "value": [-0.2557, 0.7615, -0.5854],
      "camPos": [-0.378, 0.874, -0.911]
    }
  }
}
v4:
{
  "motivation": ["commenting"],
  "bodyValue": "Right pterygoid hamulus",
  "target": {
    "type": "SpecificResource",
    "source": [{ "id": ".../scene1", "type": "Scene" }],
    "selector": [{ "type": "PointSelector", "x": 0.040, "y": 0.063, "z": -0.066 }]
  }
}
The visible differences: motivation appears as an array; target is a SpecificResource; both source and selector are arrays; and PointSelector carries its position as object properties {x, y, z} rather than a tuple value: [x, y, z]. The array form for motivation isn’t explicitly required by the v4 prose — it’s the convention used consistently across the TSG examples.
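The tuple-to-properties change is mechanical. A minimal sketch, assuming the legacy value may be missing or malformed; the names toTriple and toPointSelector are illustrative and are not the viewer's actual helpers:

```typescript
type Triple = [number, number, number];

interface PointSelector {
  type: 'PointSelector';
  x: number;
  y: number;
  z: number;
}

// Hypothetical helper: accept only arrays of exactly three finite numbers.
function toTriple(value: unknown): Triple | null {
  if (!Array.isArray(value) || value.length !== 3) return null;
  const [x, y, z] = value as unknown[];
  if (typeof x !== 'number' || typeof y !== 'number' || typeof z !== 'number') return null;
  if (!Number.isFinite(x) || !Number.isFinite(y) || !Number.isFinite(z)) return null;
  return [x, y, z];
}

// Hypothetical helper: spread the tuple into the {x, y, z} properties
// that the TSG examples use on PointSelector.
function toPointSelector([x, y, z]: Triple): PointSelector {
  return { type: 'PointSelector', x, y, z };
}

// Usage with the legacy value from the example above.
const legacyValue = toTriple([-0.2557, 0.7615, -0.5854]);
if (legacyValue) {
  console.log(toPointSelector(legacyValue));
}
```

Returning null for anything that is not a clean triple lets the caller decide whether to skip the selector or fall back to another shape.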
Polygon
The legacy form had an ad-hoc area: [x,y,z,...] flat list of vertex coordinates. v4 expresses polygons as Well-Known Text strings. The naming is not yet settled: the temp-draft-4 prose defines this selector as PolygonZSelector, while the TSG’s example manifests (e.g. whale_comment_point_polygon.json) use WKTSelector. The viewer follows the example form and uses WKTSelector.
"selector": [
  {
    "type": "WKTSelector",
    "value": "POLYGON Z ((0 0.18 -0.23, -0.03 0.16 -0.23, -0.015 0.12 -0.23, 0.006 0.12 -0.23, 0.027 0.16 -0.23))"
  }
]
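Going from the legacy flat array to that string is a matter of grouping the numbers in threes. The following is a sketch only: flatAreaToWktPolygonZ is a made-up name, and the validation rules (at least three complete vertices) are an assumption rather than the viewer's actual behaviour.

```typescript
// Hypothetical helper: turn [x0, y0, z0, x1, y1, z1, ...] into a
// "POLYGON Z ((x0 y0 z0, x1 y1 z1, ...))" string.
function flatAreaToWktPolygonZ(area: readonly number[]): string {
  if (area.length < 9 || area.length % 3 !== 0) {
    throw new Error(`expected at least 3 complete xyz triples, got ${area.length} numbers`);
  }
  const vertices: string[] = [];
  for (let i = 0; i < area.length; i += 3) {
    vertices.push(`${area[i]} ${area[i + 1]} ${area[i + 2]}`);
  }
  return `POLYGON Z ((${vertices.join(', ')}))`;
}

// Usage: the first three vertices of the example above.
console.log(flatAreaToWktPolygonZ([0, 0.18, -0.23, -0.03, 0.16, -0.23, -0.015, 0.12, -0.23]));
// prints "POLYGON Z ((0 0.18 -0.23, -0.03 0.16 -0.23, -0.015 0.12 -0.23))"
```

Like the example above, this sketch does not repeat the first vertex at the end of the ring. Strict WKT consumers expect a closed ring, so a stricter version would append the first vertex again.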
Camera
Previously the recommended camera position for an annotation lived inside the annotation's own selector, as a camPos field. In v4 it becomes a separate Annotation whose body references a PerspectiveCamera and whose target is a PointSelector on the Scene.
{
  "motivation": ["painting"],
  "body": {
    "type": "SpecificResource",
    "source": [{ "type": "PerspectiveCamera" }]
  },
  "target": {
    "type": "SpecificResource",
    "source": [{ "id": ".../scene1", "type": "Scene" }],
    "selector": [{ "type": "PointSelector", "x": -0.378, "y": 0.874, "z": -0.911 }]
  }
}
The PerspectiveCamera itself carries properties like fieldOfView / near / far, and the surrounding SpecificResource (the annotation body) can hold a sibling transform array (e.g. RotateTransform) next to source, per the draft examples. For now this viewer emits a minimal PerspectiveCamera and ties it back to the originating commenting annotation by id.
How the annotation and the camera are linked
In the legacy form one selector carried both value (a point on the model surface) and camPos (the recommended camera position), so the pairing was implicit. In v4 they live in separate annotations, so the linkage has to be made explicit.
This viewer uses the following convention:
Scene "scene-p1"
│
├─ AnnotationPage "page/comments"
│ └─ Annotation id: "anno-1" motivation: ["commenting"]
│ body: TextualBody { label: "North Sea" }
│ target: PointSelector { x, y, z } ← point on the model surface
│ cameraAnnotation: "anno-1/camera" ─┐
│ │ id reference
└─ AnnotationPage "page/cameras" │
└─ Annotation id: "anno-1/camera" ←─────┘ motivation: ["painting"]
body: SpecificResource → { type: "PerspectiveCamera" }
target: PointSelector { x, y, z } ← recommended camera position
- The commenting annotation carries the marker position (the point on the model surface) in its target.selector
- The matching camera position is the target.selector of a separate annotation
- The two are linked by a custom cameraAnnotation property and by the ${commentingId}/camera naming convention
- The parser first indexes the PerspectiveCamera annotations as id → [x, y, z], then looks each commenting annotation’s cameraAnnotation (or fallback ${id}/camera) up against that index
cameraAnnotation is not a v4 spec term — it is a property local to this viewer. As the draft evolves a more standard linkage (mutual source references, refinedBy, etc.) could replace it.
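The index-then-lookup convention described in this section can be sketched as follows. The AnnotationLike type is a trimmed-down stand-in for the real v4 types, and indexCameras / cameraFor are illustrative names. One assumption to flag: when an explicit cameraAnnotation id is present but missing from the index, this sketch still tries the ${id}/camera fallback.

```typescript
type Vec3 = [number, number, number];

interface SelectorLike { type: string; x?: number; y?: number; z?: number }

// Simplified stand-in for the v4 annotation type (assumption).
interface AnnotationLike {
  id: string;
  body?: { source?: { type?: string }[] };
  target: { selector?: SelectorLike[] };
  cameraAnnotation?: string; // viewer-local property, not a v4 term
}

// Read the first PointSelector on the target, if it has all three coordinates.
function pointOf(anno: AnnotationLike): Vec3 | null {
  const sel = anno.target.selector?.find((s) => s.type === 'PointSelector');
  if (!sel) return null;
  const { x, y, z } = sel;
  if (typeof x !== 'number' || typeof y !== 'number' || typeof z !== 'number') return null;
  return [x, y, z];
}

function isCamera(anno: AnnotationLike): boolean {
  return anno.body?.source?.some((s) => s.type === 'PerspectiveCamera') ?? false;
}

// Pass 1: index every camera annotation as id -> [x, y, z].
function indexCameras(annos: AnnotationLike[]): Map<string, Vec3> {
  const index = new Map<string, Vec3>();
  for (const anno of annos) {
    if (!isCamera(anno)) continue;
    const p = pointOf(anno);
    if (p) index.set(anno.id, p);
  }
  return index;
}

// Pass 2: resolve a commenting annotation's camera, explicit id first,
// then the naming convention.
function cameraFor(anno: AnnotationLike, index: Map<string, Vec3>): Vec3 | null {
  const explicit = anno.cameraAnnotation ? index.get(anno.cameraAnnotation) : undefined;
  return explicit ?? index.get(`${anno.id}/camera`) ?? null;
}
```

Building the index first keeps the lookup independent of the order in which annotation pages appear in the Scene.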
Migration approach
I wanted the viewer code to deal with v4 only, while still reading legacy manifests produced by the existing editor. So the conversion happens at the fetch boundary.
fetchManifest(url)
→ convertToV4(raw) // P3 + custom → v4 (in-memory)
→ parseManifestV4(v4) // v4 → internal Annotation[] / GeoFeature[]
→ React components
The internal Annotation shape is unchanged — only the IIIF ↔ internal boundary moves to v4.
File layout
New:
- src/types/iiif.d.ts: TypeScript types for v4 (ManifestV4, SceneV4, PointSelectorV4, WKTSelectorV4, SpecificResourceV4, AnnotationV4, …)
- src/lib/services/manifestConverter.ts: convertToV4(unknown): ManifestV4
- src/lib/services/manifestParser.ts: parseManifestV4(ManifestV4) returning { modelUrl, annotations, geoFeatures }
Edited:
- ViewerContent.tsx / GeoRefContent.tsx: replace the in-line manifest parsing with convertToV4 → parseManifestV4
- Annotations.tsx: marker selection now branches on 'PointSelector' instead of '3DSelector'
- manifestAtom: the type changes from Manifest (@iiif/presentation-3) to ManifestV4
- public/manifests/sample-manifest*.json: rewritten in v4 form
Inside the converter
The converter takes a manifest of unknown shape and returns a v4 plain object. If @context already points to Presentation 4 it is returned untouched; otherwise it walks the canvases, rewrites selectors, and lifts camPos into separate PerspectiveCamera annotations.
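The early-return check on @context might look like the sketch below. The function name declaresPresentation4 is hypothetical. Because JSON-LD allows @context to be either a string or an array, the sketch handles both, although whether the real converter does so is not stated here.

```typescript
const P4_CONTEXT = 'http://iiif.io/api/presentation/4/context.json';

// Hypothetical check: does this manifest already declare the
// Presentation 4 context, either as a string or inside an array?
function declaresPresentation4(manifest: unknown): boolean {
  if (typeof manifest !== 'object' || manifest === null) return false;
  const ctx = (manifest as Record<string, unknown>)['@context'];
  const list: unknown[] = Array.isArray(ctx) ? ctx : [ctx];
  return list.some((c) => c === P4_CONTEXT);
}

// Usage: only legacy manifests would go through the rewrite.
const legacy = { '@context': 'http://iiif.io/api/presentation/3/context.json' };
console.log(declaresPresentation4(legacy)); // prints false
```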
The commenting annotation case looks like this (excerpt; the real code also passes through bodyValue and seeAlso on the source annotation, omitted here for brevity):
// Converts one legacy commenting annotation into its v4 form, plus a
// standalone camera annotation when the legacy selector carried camPos.
const convertCommentingAnnotation = (
  anno: LegacyAnnotation,
  sceneId: string,
): { annotation: AnnotationV4; camera: AnnotationV4 | null } => {
  const target = (anno.target ?? {}) as LegacyTargetObject;
  const selector: LegacySelector = (typeof target === 'object' && target.selector) || {};
  const value = asTriple(selector.value);
  const camPos = asTriple(selector.camPos);
  // A polygon needs at least three xyz vertices, i.e. nine numbers.
  const area = selector.area && selector.area.length >= 9 ? selector.area : null;

  // A polygon takes precedence over a point when both are present.
  const v4Selector = area
    ? { type: 'WKTSelector' as const, value: buildWktPolygon(area) }
    : value
      ? buildPointSelector(value)
      : null;

  // Legacy annotations without an id get a random one under the scene.
  const annotationId =
    anno.id ?? `${sceneId}/anno/${Math.random().toString(36).slice(2, 10)}`;

  const v4Target: SpecificResourceV4 = {
    type: 'SpecificResource',
    source: buildSceneSource(sceneId),
    ...(v4Selector ? { selector: [v4Selector] } : {}),
  };

  const annotation: AnnotationV4 = {
    id: annotationId,
    type: 'Annotation',
    motivation: normalizeMotivation(anno.motivation).length
      ? normalizeMotivation(anno.motivation)
      : ['commenting'],
    target: v4Target,
    ...(anno.body !== undefined ? { body: anno.body as AnnotationV4['body'] } : {}),
    // Viewer-local link to the camera annotation emitted below.
    ...(camPos ? { cameraAnnotation: `${annotationId}/camera` } : {}),
  };

  if (!camPos) return { annotation, camera: null };

  // camPos becomes the target point of a separate PerspectiveCamera annotation.
  const camera: AnnotationV4 = {
    id: `${annotationId}/camera`,
    type: 'Annotation',
    motivation: ['painting'],
    body: {
      type: 'SpecificResource',
      source: [{ type: 'PerspectiveCamera' }],
    },
    target: {
      type: 'SpecificResource',
      source: buildSceneSource(sceneId),
      selector: [buildPointSelector(camPos)],
    },
  };

  return { annotation, camera };
};
A few notes:
- The legacy area is a flat list of 3D vertex coordinates, so it gets stitched into a POLYGON Z((x y z, x y z, ...)) WKT string
- camPos is hoisted into a standalone Annotation. The originating annotation keeps a cameraAnnotation property pointing at the camera annotation by id, using a ${annotationId}/camera naming convention
- motivation is normalised to an array, and target is rebuilt as a SpecificResource with array-valued source / selector
georeferencing annotations (with a GeoJSON-T body) have no v4 standard counterpart, so the body is preserved as-is and only the target is rewritten in v4 form.
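The excerpt calls normalizeMotivation without showing it. One plausible shape for it is sketched below. This is a guess at its behaviour, not the viewer's actual helper, and both function names are made up for illustration.

```typescript
// Hypothetical stand-in for normalizeMotivation: accept a string, an
// array, or nothing, and always return an array of non-empty strings.
function normalizeMotivationSketch(motivation: unknown): string[] {
  const list: unknown[] = Array.isArray(motivation) ? motivation : [motivation];
  return list.filter((m): m is string => typeof m === 'string' && m.length > 0);
}

// Apply the default in one place so the normaliser runs only once.
function motivationOrDefault(motivation: unknown, fallback = 'commenting'): string[] {
  const normalized = normalizeMotivationSketch(motivation);
  return normalized.length > 0 ? normalized : [fallback];
}

// Usage: legacy manifests carry a bare string, v4 an array.
console.log(motivationOrDefault('commenting')); // prints [ 'commenting' ]
console.log(motivationOrDefault(undefined)); // prints [ 'commenting' ]
```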
A smoke test on a minimal legacy manifest confirms the rewrite pattern:
{
  "@context": "http://iiif.io/api/presentation/4/context.json",
  "items": [{
    "type": "Scene",
    "items": [{ "type": "AnnotationPage", "items": [/* model painting */] }],
    "annotations": [
      { "type": "AnnotationPage", "items": [/* commenting */] },
      { "type": "AnnotationPage", "items": [/* PerspectiveCamera */] }
    ]
  }]
}
Canvas becomes Scene, 3DSelector becomes PointSelector, and camPos is moved out into its own AnnotationPage as expected.
Inside the parser
Going the other way — v4 manifest into the viewer’s internal structures — the parser pulls the model URL from the painting / Model body in Scene.items, then flattens Scene.annotations and routes each entry into one of three buckets: commenting, georeferencing, or PerspectiveCamera.
PerspectiveCamera annotations are indexed up front as Map<string, [x,y,z]>, so each commenting annotation can look up its camera by cameraAnnotation (or, as a fallback, the ${id}/camera naming convention). For WKTSelector the parser does a forgiving read of POLYGON Z((...)) and stores the vertices in the internal area flat array, with position set to the centroid for use as the marker anchor.
const parseWktPolygonZ = (wkt: string): number[] => {
  // Take everything between the first "((" and the next "))" as one ring.
  const open = wkt.indexOf('((');
  const close = wkt.indexOf('))', open);
  if (open < 0 || close < 0) return [];
  const ring = wkt.slice(open + 2, close);
  const out: number[] = [];
  for (const raw of ring.split(',')) {
    const [x, y, z] = raw.trim().split(/\s+/).map(Number);
    // Skip vertices that do not yield three finite numbers.
    if ([x, y, z].every((n) => Number.isFinite(n))) out.push(x, y, z);
  }
  return out;
};
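It is a deliberately sloppy WKT parser: it ignores the geometry keyword, assumes a single ring, and silently drops any vertex whose first three tokens are not finite numbers. That is sufficient to round-trip the strings the converter emits.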
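For the marker anchor, the text above only says that position is set to the centroid. The sketch below assumes the simplest reading, the arithmetic mean of the vertices rather than an area-weighted centroid. The name vertexCentroid is illustrative.

```typescript
type Vec3 = [number, number, number];

// Hypothetical helper: mean of the vertices in a flat [x, y, z, ...] array.
// Trailing numbers that do not form a complete triple are ignored.
function vertexCentroid(area: readonly number[]): Vec3 | null {
  const n = Math.floor(area.length / 3);
  if (n === 0) return null;
  let sx = 0;
  let sy = 0;
  let sz = 0;
  area.slice(0, n * 3).forEach((v, i) => {
    if (i % 3 === 0) sx += v;
    else if (i % 3 === 1) sy += v;
    else sz += v;
  });
  return [sx / n, sy / n, sz / n];
}

// Usage: a 2 x 2 square in the z = 0 plane.
console.log(vertexCentroid([0, 0, 0, 2, 0, 0, 2, 2, 0, 0, 2, 0])); // prints [ 1, 1, 0 ]
```

A vertex mean is cheap and good enough for placing a marker, but for a ring that repeats its first vertex at the end it would weight that vertex twice.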
Sample manifest
public/manifests/sample-manifest-with-annotations.json is now v4. The georeferencing block keeps its GeoJSON-T body (still a project extension) but its target is rebuilt as a v4 SpecificResource. Commenting annotations use motivation: ["commenting"], a PointSelector { x, y, z } selector, and a cameraAnnotation reference into a separate PerspectiveCamera annotation.
There are two ways to write the body for a commenting annotation: the TSG examples shown earlier use a plain bodyValue: "..." string, while the sample manifest uses Web Annotation’s body: { type: "TextualBody", value, label } so that the existing UI can keep separating the HTML value from the label. The parser accepts both shapes.
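Accepting both body shapes could be sketched like this. The CommentText type and readCommentText function are hypothetical, and preferring bodyValue when both are present is an assumption made for the sketch.

```typescript
interface CommentText {
  value: string;
  label?: string;
}

// Hypothetical reader: plain bodyValue first, then a TextualBody object
// (or the first entry when body is an array).
function readCommentText(anno: { bodyValue?: unknown; body?: unknown }): CommentText | null {
  if (typeof anno.bodyValue === 'string') return { value: anno.bodyValue };
  const body: unknown = Array.isArray(anno.body) ? anno.body[0] : anno.body;
  if (typeof body !== 'object' || body === null) return null;
  const { type, value, label } = body as Record<string, unknown>;
  if (type !== 'TextualBody' || typeof value !== 'string') return null;
  return typeof label === 'string' ? { value, label } : { value };
}

// Usage with the two shapes that appear in this article.
console.log(readCommentText({ bodyValue: 'Right pterygoid hamulus' }));
console.log(readCommentText({ body: { type: 'TextualBody', value: '<p></p>', label: 'North Sea' } }));
```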
Open questions
- Presentation API 4 is still temp-draft-4, and the TSG’s commenting examples (9_commenting_annotations) lean on plain-text bodyValue. The body shape and camera-annotation conventions may shift before the spec is final
- v4 doesn’t (yet) cover Linked Data enrichment of annotation bodies (e.g. URIs to Wikidata items) or 3D extensions of the IIIF Georeference Extension, so those remain project-local for now