Collaborator

@rwb27 rwb27 commented Jan 9, 2026

This PR makes use of the changes in #242 to serialise Blobs more neatly. It contains those commits and should be merged afterwards.

I refactored Blob more significantly, making a number of under-the-hood changes that I think make it much clearer. One of the early commits includes a working version of Blob that uses url_for but hasn't yet been refactored. However, I don't think there's much point reviewing two versions of the same thing, hence not splitting the PR.

  • Blob is no longer a BaseModel subclass. Instead, it's a regular class that also works as a pydantic type, just like URLFor.
  • Blob may now refer to local (bytes or file) or remote (URL) data, with BlobData subclasses for each.
  • ClientBlobOutput is gone; we can use Blob instead. That eliminates a whole module and is a step towards client and server types matching.
  • I've abandoned the use of Protocol in favour of base classes - I don't see many (if any) reasons to add new BlobData types, and if we do I don't see any reason they can't also inherit from BlobData.
  • Blob objects now serialise properly when returned from any endpoint in the server. Serialising them in other contexts requires the use of labthings_fastapi.testing.use_dummy_url_for.

I've also removed the need to pass request to Invocations when generating their response - we can use URLFor to generate the URLs that need to show up in the output.

This ended up being a larger change than I intended, but I think it results in a cleaner structure, and provides pydantic functionality in a way that doesn't mess up the structure of classes that client code needs to use.

rwb27 added 9 commits January 7, 2026 23:58
This adds:

* a middleware function that pushes the `url_for` method of the `Request` object to a context variable.
* a `pydantic`-compatible class that calls `url_for` to generate URLs at serialisation time.
* a test harness that allows this to be used outside of an HTTP request handler.

The middleware is tested on its own and in the context of a FastAPI app, but isolated from the rest of LabThings.
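
The three pieces listed in this commit can be sketched with the standard library alone. This is an illustration under stated assumptions, not the LabThings code: `url_for_ctx`, `set_url_for`, `dummy_url_for` and `LazyURL` are hypothetical names, and a real middleware would take `url_for` from the FastAPI `Request` rather than from a plain callable.

```python
from contextlib import contextmanager
from contextvars import ContextVar
from typing import Callable, Iterator

UrlFor = Callable[..., str]

# Hypothetical context variable holding the current request's URL builder.
url_for_ctx: ContextVar[UrlFor] = ContextVar("url_for")


@contextmanager
def set_url_for(url_for: UrlFor) -> Iterator[None]:
    """Make `url_for` available inside the block, then restore the old state.

    A middleware would wrap each request in this, passing the request's
    own `url_for` method.
    """
    token = url_for_ctx.set(url_for)
    try:
        yield
    finally:
        url_for_ctx.reset(token)


@contextmanager
def dummy_url_for(base: str = "http://dummy") -> Iterator[None]:
    """Hypothetical test harness: a fake URL builder for use outside a request."""

    def fake(name: str, **params: str) -> str:
        query = "&".join(f"{k}={v}" for k, v in sorted(params.items()))
        return f"{base}/{name}" + (f"?{query}" if query else "")

    with set_url_for(fake):
        yield


class LazyURL:
    """Stores an endpoint name and only builds the URL when serialised."""

    def __init__(self, name: str, **params: str) -> None:
        self.name = name
        self.params = params

    def serialise(self) -> str:
        try:
            url_for = url_for_ctx.get()
        except LookupError:
            raise RuntimeError("No url_for is available in this context") from None
        return url_for(self.name, **self.params)
```

Because the URL is only built in `serialise`, the object can be created anywhere and still produce a URL that is correct for whichever request ends up returning it.
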
This commit makes use of the new `url_for` middleware to eliminate the Blob-specific context variables.

BlobData objects are now added to a singleton BlobManager when they are created, and the URL is filled in at serialisation time.

This is a slight simplification of the old behaviour, but it's equivalent in all the ways that matter.
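
A rough sketch of the register-on-creation, resolve-at-serialisation idea follows. The class bodies, the `to_dict` method and the `download_blob` endpoint name are assumptions made for illustration; the real `BlobManager` may differ, for example in how it tracks object lifetimes.

```python
import uuid
from typing import Callable, Dict, Optional


class BlobManager:
    """Hypothetical process-wide registry mapping IDs to blob data."""

    _instance: Optional["BlobManager"] = None

    def __init__(self) -> None:
        self._blobs: Dict[uuid.UUID, "BlobData"] = {}

    @classmethod
    def instance(cls) -> "BlobManager":
        # Create the single shared manager on first use.
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance

    def add(self, data: "BlobData") -> uuid.UUID:
        blob_id = uuid.uuid4()
        self._blobs[blob_id] = data
        return blob_id

    def get(self, blob_id: uuid.UUID) -> "BlobData":
        return self._blobs[blob_id]


class BlobData:
    """Holds bytes and registers itself with the manager when created."""

    def __init__(self, content: bytes, media_type: str) -> None:
        self.content = content
        self.media_type = media_type
        # Registration happens here; no URL is computed yet.
        self.id = BlobManager.instance().add(self)

    def to_dict(self, url_for: Callable[..., str]) -> dict:
        # The URL is only filled in at serialisation time.
        return {
            "href": url_for("download_blob", blob_id=str(self.id)),
            "media_type": self.media_type,
        }
```
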
Having now learned more about custom types in pydantic, I've done some more tidying here:

* Blob is no longer a BaseModel subclass. I've separated out the model (used for serialisation/validation) and the class that user code will interact with.
* BlobData is now a base class not a protocol, and there's a subclass for remote blob data that downloads on demand.

This removes most of the complicated logic from `Blob` around when we do and don't need a `BlobData`: a `Blob` is **always** backed by `BlobData` whether it's local or remote. This also means we can get rid of `ClientBlobOutput` and just use `Blob` instead.
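
The "always backed by `BlobData`" structure might look roughly like the sketch below. The subclass names, constructor signatures and the injectable `fetch` argument are hypothetical and chosen for illustration; only the overall shape (one base class, local and remote subclasses, download on demand) comes from the description above.

```python
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Callable, Optional
from urllib.request import urlopen


class BlobData(ABC):
    """Base class: every kind of blob data can produce its bytes."""

    def __init__(self, media_type: str) -> None:
        self.media_type = media_type

    @property
    @abstractmethod
    def content(self) -> bytes:
        """Return the data as bytes."""


class BytesBlobData(BlobData):
    """Local data held in memory."""

    def __init__(self, data: bytes, media_type: str) -> None:
        super().__init__(media_type)
        self._data = data

    @property
    def content(self) -> bytes:
        return self._data


class FileBlobData(BlobData):
    """Local data stored in a file, read when requested."""

    def __init__(self, path: Path, media_type: str) -> None:
        super().__init__(media_type)
        self._path = Path(path)

    @property
    def content(self) -> bytes:
        return self._path.read_bytes()


def _default_fetch(href: str) -> bytes:
    with urlopen(href) as response:
        return response.read()


class RemoteBlobData(BlobData):
    """Remote data identified by a URL, downloaded on first access."""

    def __init__(
        self,
        href: str,
        media_type: str,
        fetch: Callable[[str], bytes] = _default_fetch,
    ) -> None:
        super().__init__(media_type)
        self.href = href
        self._fetch = fetch
        self._cache: Optional[bytes] = None

    @property
    def content(self) -> bytes:
        if self._cache is None:
            self._cache = self._fetch(self.href)
        return self._cache


class Blob:
    """User-facing object; it never has to ask whether data exists."""

    def __init__(self, data: BlobData) -> None:
        self.data = data

    @property
    def content(self) -> bytes:
        return self.data.content

    @property
    def media_type(self) -> str:
        return self.data.media_type
```

With this shape, client and server code can share one `Blob` class: a server typically wraps local data, while a client wraps a `RemoteBlobData` that fetches only when `content` is read.
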
This now correctly tells clients the media type, and uses a descriptive title. I believe it's now at least as good as the old schema.
We can now use one Blob class for client and server :)

I realised we had the potential to have inconsistencies between BlobData and the host Blob in the media type.

We now check the types match, and allow the BlobData to override the Blob's default if it's a matching but more specific type.
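
As an illustration of that check, here is a minimal sketch. The function name `resolve_media_type` is hypothetical, both arguments are assumed to be in `type/subtype` form, and treating data that is less specific than the default as a mismatch is an assumption rather than something this PR states.

```python
def resolve_media_type(default: str, actual: str) -> str:
    """Return `actual` if it is compatible with `default`, else raise ValueError.

    `default` may use "*" as a wildcard for the type, the subtype, or both,
    for example "image/*" or "*/*".
    """
    default_type, _, default_sub = default.partition("/")
    actual_type, _, actual_sub = actual.partition("/")
    pairs = ((default_type, actual_type), (default_sub, actual_sub))
    for expected, found in pairs:
        # A wildcard in the default accepts anything; otherwise require equality.
        if expected != "*" and expected != found:
            raise ValueError(
                f"Media type {actual!r} does not match the default {default!r}"
            )
    return actual
```

For example, a `Blob` declared with `image/*` that is given `image/jpeg` data would report `image/jpeg`, whereas `text/plain` data would be rejected.
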

I've also taken a pass through the blob documentation to update it where needed. Happily, as this PR only touches implementation details, not much has changed.
I got rid of the conversion of "*" to None; I think it's clearer this way.

I also fixed a typo and ignored a codespell false positive.