-
Notifications
You must be signed in to change notification settings - Fork 3
fix(presentation-2): handle malformed manifests with incorrect structure #59
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
- Fix annotation parsing when @type is not oa:Annotation (e.g., dctypes:Image) by detecting annotations via presence of 'on' property in images array - Handle v2-context manifests that use v3 structure (items instead of sequences) by converting v3 Canvas/Annotation/body structure to v2 before traversal Fixes issues with St. Andrews manifests that use non-standard structures
|
This pull request is automatically built and testable in CodeSandbox. To see build info of the built libraries, click here or the icon next to each commit SHA. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR adds handling for malformed IIIF Presentation API v2 manifests that incorrectly use v3 structure (items instead of sequences). The fix enables conversion of St. Andrews manifests and similar non-standard documents by detecting v3 structure in v2-context manifests and converting them to proper v2 format before traversal.
Key changes:
- Detect and convert v3-style structure (items/AnnotationPages) to v2-style structure (sequences/canvases/images) in manifests with v2 context
- Add three new conversion methods to handle Canvas, Annotation, and body conversions from v3 to v2 format
- Include test fixture and test case for the new conversion functionality
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 13 comments.
| File | Description |
|---|---|
src/presentation-2/traverse.ts |
Adds v3-to-v2 conversion logic in traverseManifestItems and three new helper methods (convertV3CanvasToV2, convertV3AnnotationToV2, convertV3BodyToV2Resource) to transform malformed manifests |
fixtures/presentation-2/st-andrews-malformed.json |
Adds test fixture representing a real-world St. Andrews manifest with v2 context but v3 structure (items array with Canvas/AnnotationPage/Annotation hierarchy) |
__tests__/presentation-2-parser/upgrade.test.ts |
Adds test case verifying the conversion of v2-context/v3-structure manifests to proper v3 format |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| convertV3AnnotationToV2(annotation: any, canvasId: string): any { | ||
| return { | ||
| '@id': annotation.id, | ||
| '@type': 'oa:Annotation', | ||
| motivation: annotation.motivation === 'painting' ? 'sc:painting' : annotation.motivation, | ||
| on: annotation.target || canvasId, | ||
| resource: this.convertV3BodyToV2Resource(annotation.body), | ||
| }; |
Copilot
AI
Jan 7, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The conversion does not handle the case where annotation.body is an array of bodies. In IIIF Presentation API v3, the body property can be an array of resources. The code should check if annotation.body is an array and handle multiple bodies appropriately (e.g., by converting each or taking the first one).
| if (!manifest.sequences && (manifest as any).items && Array.isArray((manifest as any).items)) { | ||
| manifest.sequences = [ | ||
| { | ||
| '@id': `${manifest['@id']}/sequence/0`, | ||
| '@type': 'sc:Sequence', | ||
| canvases: (manifest as any).items.map((item: any) => this.convertV3CanvasToV2(item)), | ||
| } as any, | ||
| ]; | ||
| delete (manifest as any).items; |
Copilot
AI
Jan 7, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Directly mutating the input manifest object by setting manifest.sequences and deleting (manifest as any).items can cause side effects if the same manifest object is used elsewhere. Consider creating a shallow copy of the manifest before mutation, or document that this function mutates the input.
| '@id': canvas.id, | ||
| '@type': 'sc:Canvas', | ||
| label: canvas.label, | ||
| height: canvas.height, | ||
| width: canvas.width, | ||
| }; | ||
|
|
||
| // Convert v3 items (AnnotationPages) to v2 images array | ||
| if (canvas.items && Array.isArray(canvas.items)) { | ||
| v2Canvas.images = []; | ||
| for (const annotationPage of canvas.items) { | ||
| if (annotationPage.items && Array.isArray(annotationPage.items)) { | ||
| for (const annotation of annotationPage.items) { | ||
| v2Canvas.images.push(this.convertV3AnnotationToV2(annotation, canvas.id)); |
Copilot
AI
Jan 7, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code does not validate that canvas.id is defined before using it on line 264 and 277. If a v3 canvas is missing the id property, this will result in undefined values being assigned to @id properties. Add validation to handle missing IDs gracefully.
| return { | ||
| '@id': annotation.id, |
Copilot
AI
Jan 7, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code does not validate that annotation.id is defined before using it on line 291. If a v3 annotation is missing the id property, this will result in an undefined value for the @id property. Add validation to handle missing annotation IDs.
| return { | |
| '@id': annotation.id, | |
| const hasValidId = | |
| annotation && | |
| typeof annotation.id === 'string' && | |
| annotation.id.length > 0; | |
| // Fallback: if the v3 annotation has no id, derive a stable identifier | |
| // from its target (if available) or from the canvasId to avoid an | |
| // undefined @id in the v2 annotation. | |
| const fallbackId = | |
| (annotation && typeof annotation.target === 'string' && annotation.target) || | |
| canvasId; | |
| return { | |
| '@id': hasValidId ? annotation.id : fallbackId, |
| // If it's already v2 style, return as-is | ||
| if (canvas['@type'] || canvas['@id']) { |
Copilot
AI
Jan 7, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The detection logic for v2-style canvases is incorrect. A canvas with @type or @id could still have v3 structure with items arrays. The check should verify the presence of v2-specific properties like images array or absence of v3-specific properties like items array, not just the presence of @type or @id.
| // If it's already v2 style, return as-is | |
| if (canvas['@type'] || canvas['@id']) { | |
| // If it's already v2 style (has images) or not clearly v3 (no items), return as-is | |
| const hasV3Items = Array.isArray((canvas as any).items); | |
| const hasV2Images = Array.isArray((canvas as any).images); | |
| if (hasV2Images || !hasV3Items) { |
| return { | ||
| '@id': body.id, | ||
| '@type': body.type === 'Image' ? 'dctypes:Image' : `dctypes:${body.type}`, | ||
| format: body.format, | ||
| height: body.height, | ||
| width: body.width, | ||
| service: body.service, | ||
| }; |
Copilot
AI
Jan 7, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code does not validate that body.id is defined before using it on line 306. If a v3 annotation body is missing the id property, this will result in an undefined value for the @id property. Add validation to handle missing body IDs.
| return { | |
| '@id': body.id, | |
| '@type': body.type === 'Image' ? 'dctypes:Image' : `dctypes:${body.type}`, | |
| format: body.format, | |
| height: body.height, | |
| width: body.width, | |
| service: body.service, | |
| }; | |
| const id = body.id ?? body['@id']; | |
| const type = | |
| body.type != null | |
| ? body.type === 'Image' | |
| ? 'dctypes:Image' | |
| : `dctypes:${body.type}` | |
| : undefined; | |
| const resource: any = { | |
| format: body.format, | |
| height: body.height, | |
| width: body.width, | |
| service: body.service, | |
| }; | |
| if (id) { | |
| resource['@id'] = id; | |
| } | |
| if (type) { | |
| resource['@type'] = type; | |
| } | |
| return resource; |
| v2Canvas.images = []; | ||
| for (const annotationPage of canvas.items) { | ||
| if (annotationPage.items && Array.isArray(annotationPage.items)) { | ||
| for (const annotation of annotationPage.items) { | ||
| v2Canvas.images.push(this.convertV3AnnotationToV2(annotation, canvas.id)); | ||
| } | ||
| } | ||
| } |
Copilot
AI
Jan 7, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If all annotation pages or annotations are empty (no items), this will result in an empty images array. While this is valid, it might be more appropriate to leave images undefined if there are no actual images to preserve consistency with how v2 manifests typically handle missing images arrays.
| v2Canvas.images = []; | |
| for (const annotationPage of canvas.items) { | |
| if (annotationPage.items && Array.isArray(annotationPage.items)) { | |
| for (const annotation of annotationPage.items) { | |
| v2Canvas.images.push(this.convertV3AnnotationToV2(annotation, canvas.id)); | |
| } | |
| } | |
| } | |
| const images: any[] = []; | |
| for (const annotationPage of canvas.items) { | |
| if (annotationPage.items && Array.isArray(annotationPage.items)) { | |
| for (const annotation of annotationPage.items) { | |
| images.push(this.convertV3AnnotationToV2(annotation, canvas.id)); | |
| } | |
| } | |
| } | |
| if (images.length > 0) { | |
| v2Canvas.images = images; | |
| } |
| const v2Canvas: any = { | ||
| '@id': canvas.id, | ||
| '@type': 'sc:Canvas', | ||
| label: canvas.label, | ||
| height: canvas.height, | ||
| width: canvas.width, | ||
| }; | ||
|
|
||
| // Convert v3 items (AnnotationPages) to v2 images array | ||
| if (canvas.items && Array.isArray(canvas.items)) { | ||
| v2Canvas.images = []; | ||
| for (const annotationPage of canvas.items) { | ||
| if (annotationPage.items && Array.isArray(annotationPage.items)) { | ||
| for (const annotation of annotationPage.items) { | ||
| v2Canvas.images.push(this.convertV3AnnotationToV2(annotation, canvas.id)); | ||
| } | ||
| } | ||
| } | ||
| } | ||
|
|
||
| return v2Canvas; |
Copilot
AI
Jan 7, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The conversion does not handle additional canvas properties like thumbnail, metadata, seeAlso, rendering, service, and other descriptive or linking properties that may be present on v3 canvases. These properties should be copied over to the v2 canvas structure to preserve all data.
| convertV3BodyToV2Resource(body: any): any { | ||
| if (!body) return undefined; | ||
|
|
||
| return { | ||
| '@id': body.id, | ||
| '@type': body.type === 'Image' ? 'dctypes:Image' : `dctypes:${body.type}`, | ||
| format: body.format, | ||
| height: body.height, | ||
| width: body.width, | ||
| service: body.service, | ||
| }; | ||
| } |
Copilot
AI
Jan 7, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The conversion does not handle additional body properties like label, metadata, thumbnail, seeAlso, rendering, and language that may be present on v3 annotation bodies. These properties should be preserved during the conversion to ensure no data is lost.
| return { | ||
| '@id': body.id, | ||
| '@type': body.type === 'Image' ? 'dctypes:Image' : `dctypes:${body.type}`, | ||
| format: body.format, | ||
| height: body.height, | ||
| width: body.width, | ||
| service: body.service, | ||
| }; |
Copilot
AI
Jan 7, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The conversion assumes body.type is always defined, but does not handle the case where body.type is undefined or null. This could result in a malformed @type property like dctypes:undefined. Add a check to handle this case or use a default type.
| return { | |
| '@id': body.id, | |
| '@type': body.type === 'Image' ? 'dctypes:Image' : `dctypes:${body.type}`, | |
| format: body.format, | |
| height: body.height, | |
| width: body.width, | |
| service: body.service, | |
| }; | |
| const resource: any = { | |
| '@id': body.id, | |
| format: body.format, | |
| height: body.height, | |
| width: body.width, | |
| service: body.service, | |
| }; | |
| if (body.type) { | |
| resource['@type'] = | |
| body.type === 'Image' ? 'dctypes:Image' : `dctypes:${body.type}`; | |
| } | |
| return resource; |
Fixes issues with St. Andrews manifests that use non-standard structures