Skip to content

feat: add model registry client#186

Merged
google-oss-prow[bot] merged 1 commit intokubeflow:mainfrom
jonburdo:kubeflow-hub-api
Jan 21, 2026
Merged

feat: add model registry client#186
google-oss-prow[bot] merged 1 commit intokubeflow:mainfrom
jonburdo:kubeflow-hub-api

Conversation

@jonburdo
Copy link
Member

@jonburdo jonburdo commented Dec 2, 2025

Add an api for model registry under a new kubeflow.hub submodule.

This takes the form of a ModelRegistryClient class which provides a kubeflow-style api for model-registry.

🤖 Generated with Claude Code

What this PR does / why we need it:

Which issue(s) this PR fixes (optional, in Fixes #<issue number>, #<issue number>, ... format, will close the issue(s) when PR gets merged):

Fixes #59

Checklist:

  • Docs included if any changes are user facing

details

This provides a possible api following this discussion. Also see the linked issue and the following:

@github-actions
Copy link
Contributor

github-actions bot commented Dec 2, 2025

🎉 Welcome to the Kubeflow SDK! 🎉

Thanks for opening your first PR! We're happy to have you as part of our community 🚀

Here's what happens next:

  • If you haven't already, please check out our Contributing Guide for repo-specific guidelines and the Kubeflow Contributor Guide for general community standards
  • Our team will review your PR soon! cc @kubeflow/kubeflow-sdk-team

Join the community:

Feel free to ask questions in the comments if you need any help or clarification!
Thanks again for contributing to Kubeflow! 🙏

@coveralls
Copy link

coveralls commented Dec 2, 2025

Pull Request Test Coverage Report for Build 21182264396

Details

  • 172 of 205 (83.9%) changed or added relevant lines in 3 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.9%) to 67.52%

Changes Missing Coverage Covered Lines Changed/Added Lines %
kubeflow/hub/api/model_registry_client.py 47 51 92.16%
kubeflow/hub/api/model_registry_client_test.py 123 152 80.92%
Totals Coverage Status
Change from base Build 21147474427: 0.9%
Covered Lines: 2769
Relevant Lines: 4101

💛 - Coveralls

@jonburdo jonburdo changed the title feat(hub): add model registry client feat: add model registry client Dec 2, 2025
@jonburdo jonburdo marked this pull request as draft December 2, 2025 01:47
@jonburdo jonburdo force-pushed the kubeflow-hub-api branch 2 times, most recently from fd5f82f to 33efe23 Compare December 3, 2025 00:21
Copy link
Contributor

@kramaranya kramaranya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this amazing work, @jonburdo!
Overall looks good, I left my initial comments
/assign @kubeflow/kubeflow-sdk-team

ci
docs
examples
hub
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I'm still a bit confused with hub vs model registry, @johnugeorge would you mind elaborating on this, specifically what do we expose to Kubeflow SDK users?
Shall this be model-registry or mr instead, since we only include only ModelRegistryClient?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking kubeflow.hub is the appropriate submodule given:

  • We plan to rename from "Kubeflow Model Registry" to "Kubeflow Hub" assuming KEP-907 goes through.
  • The scope of the project is already broader than just model registry and it is conceivable that we could offer other clients in the future.

To highlight the point, we already have a model catalog and some basic experiment tracking. Now we don't plan to have kubeflow sdk clients for these afaik, but we are positioned to be able to extend the project for other AI/ML asset types (mcp, prompt management). We also have some storage helpers and integrations as outlined here.

Note my assumption here is basically that we want to have roughly one submodule per kubeflow sub-project. We could decide not to do this and then have kubeflow.model_registry and add separate submodules for any new clients or functionality we add later.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, thanks @jonburdo!

Comment on lines +42 to +44
hub = [
"model-registry>=0.3.0",
]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should include model-registry by default. wdyt @kubeflow/kubeflow-sdk-team
The issue is that Model Registry and other components already have their own extras, for example https://github.com/kubeflow/model-registry/blob/main/clients/python/pyproject.toml#L33-L36. We can't do "extras of extras", so if we tried to consolidate them we'd lose that functionality.
I think we should include core Kubeflow components by default (including model-registry) and reserve extras for third-party integrations like OpenTelemetry, MLflow, Hugging Face, etc.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the takeaway in the Kubeflow SDK & ML Experience WG call was that we will keep model-registry optional for now. I'm thinking that means we leave this as is - lmk if not.

A few other things to note:

  • The subset of model registry functions we are exposing here does not rely on the model-registry extras if I'm not mistaken
  • This doesn't prevent users from doing pip install 'model-registry[hf,boto3,olot]' if needed
  • Other options could include having kubeflow[hub] for model-registry and kubeflow[hub-all] for model-registry[hf,boto3,olot] if we really needed it (but I don't think we do)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, let's keep the hub extras for now as we discussed. Later we can figure out a way how to manage extras of extras.


[dependency-groups]
dev = [
"kubeflow[hub]",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"kubeflow[hub]",
"model-registry>=0.3.0",

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we just want to include anything in the hub extra rather than re-listing the packages.

Copy link
Member Author

@jonburdo jonburdo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made some updates and simplifications since the first iteration:

  • simplify __init__ and register_model
  • remove from_registry classmethod - We can have something that does this in ModelRegistry instead of here I think
  • return ModelVersion instead of ModelVersion | None from get_model_version - same for other get_* functions. Error if missing

Makefile Outdated
@uv lock --check
@uv run ruff check --show-fixes --output-format=github .
@uv run ruff format --check kubeflow
@uv run mypy kubeflow/hub --exclude '_test\.py$$'
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added mypy - for now just on kubeflow/hub. It would simplify linting if we used the typical pattern of a separate test dir in the root of the repo for tests instead of mixing in *_test.py files, but I stuck with the existing pattern for now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use ruff for linting ? Using unified tool will allow us to reduce number of dependencies users have to install.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are using ruff for linting, but it doesn't include type-checking, which we really want, so we also need mypy (e.g. if we're passing an int to str arg ruff won't catch that). The same org behind ruff is working on a type checker, ty, but it's relatively new and would still be a separate tool: https://github.com/astral-sh/ty

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kramaranya @szaher @astefanutti What are your thoughts on this? Shall we adopt ty for type checking?
@HumairAK @mprahl @hbelmiro @nsingla Are you planning to use any typing tool for Pipelines library?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jonburdo have you tried to compare ty and mypy? does it cover everything you're using mypy for?

Copy link

@hbelmiro hbelmiro Jan 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andreyvelich No concrete plans at the moment. For the components/pipelines themselves, KFP doesn't really enforce or leverage Python type hints.
Adding a type checker for our scripts/CLIs could still be valuable. Though I'm guessing that's not the part you're asking about.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking again at ty, I think it's a good option. It will be faster and align with our use of uv and ruff. Replacing mypy with ty

@jonburdo jonburdo force-pushed the kubeflow-hub-api branch 2 times, most recently from 1218458 to affcdc5 Compare December 4, 2025 16:31
@jonburdo
Copy link
Member Author

jonburdo commented Dec 4, 2025

I've addressed initial feedback and believe this is ready for review by @kubeflow/kubeflow-sdk-team

@jonburdo jonburdo force-pushed the kubeflow-hub-api branch 2 times, most recently from 86564f2 to 854e6ba Compare December 10, 2025 13:02
Copy link
Member

@andreyvelich andreyvelich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this @jonburdo! Overall, lgtm, I left a few comments.
cc @kubeflow/kubeflow-sdk-team

Makefile Outdated
@uv lock --check
@uv run ruff check --show-fixes --output-format=github .
@uv run ruff format --check kubeflow
@uv run mypy kubeflow/hub --exclude '_test\.py$$'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use ruff for linting ? Using unified tool will allow us to reduce number of dependencies users have to install.

```

```python
from kubeflow.hub import ModelRegistryClient
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we going to have 2 clients in Hub? ModelRegistry and Catalog.

Copy link
Member

@tarilabs tarilabs Jan 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have been discussing in KF Hub biweekly meeting that any new user facing client will be developed in KF SDK.
👉 So when the user need comes for a user facing py client for the Catalog, we'd directly develop this in KF SDK.
Hope this is satisfactory answer for ML WG (this group) and from KSC pov.

Detail note: from a E2E testing point of view, we need python based code generated clients for all the endpoints, which is something that for example ML WG (this group) mentioned is not interested in exposing, etc. So infrastructure code we need for the purpose of E2E, fuzz, and coverage testing of the endpoints, we're continuing maintaining in the KF/model-registry repo. We're open to revisit this setup too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thank you for sharing recording @tarilabs!

we need python based code generated clients for all the endpoints, which is something that for example ML WG (this group) mentioned is not interested in exposing, etc.

Yeah, that make sense. I guess, the conclusion was that we expose minimal set of APIs via kubeflow.hub package, and gather initial users feedback.
If users need other APIs that we don't expose via kubeflow.hub, they can always use it via this parameter:

self._registry()

Or import the client directly import model_registry.
In the future, we can potentially add more APIs to the kubeflow.hub directly.

The only question that I have, do users need two separate clients for Catalog and Model Registry APIs inside the kubeflow.hub sub package ?
e.g.:

from kubeflow.hub import ModelRegistryClient
from kubeflow.hub import CatalogClient
or just
from kubeflow.hub import HubClient

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thank you for sharing recording @tarilabs!

we need python based code generated clients for all the endpoints, which is something that for example ML WG (this group) mentioned is not interested in exposing, etc.

Yeah, that make sense. I guess, the conclusion was that we expose minimal set of APIs via kubeflow.hub package, and gather initial users feedback. If users need other APIs that we don't expose via kubeflow.hub, they can always use it via this parameter:

self._registry()

Or import the client directly import model_registry. In the future, we can potentially add more APIs to the kubeflow.hub directly.

Great, sounds we are all aligned, personally glad and great to hear 👍

As for the latter question, my current assessment is:

from kubeflow.hub import ModelRegistryClient
from kubeflow.hub import CatalogClient

as they're different patterns, and in turn different use-cases.
However as we'll define it in more concrete terms, we can refine the case more specifically then.
Hope this is understandable and fair.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, that sounds great, thank you!

Comment on lines +42 to +44
hub = [
"model-registry>=0.3.0",
]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, let's keep the hub extras for now as we discussed. Later we can figure out a way how to manage extras of extras.

"""Skip these tests if model-registry not installed."""
pytest.importorskip("model_registry")

def test_init_raises_helpful_error_when_not_installed(self, monkeypatch):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please can you refactor these tests to align with this:

def test_get_runtime(kubernetes_backend, test_case):

E.g. every test should align with the function name: test_register_model(test_case) and we create test cases via :

@pytest.mark.parametrize(
    "test_case",
    [
        TestCase()
    ]

That will allow contributors to easily understand code and help with changes for Hub SDKs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I've made this change. This does we are importing TestCase from kubeflow.trainer. I also have mixed feelings about this, but if it's something we want to standardize in the sdk, then I guess it works.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! Yes, eventually, we should have common types/helper function that we can re-use across packages.
This is yet another benefit to use common SDK to reduce code duplication and standardize on the code practices.

from collections.abc import Iterator, Mapping
from typing import TYPE_CHECKING

if TYPE_CHECKING:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need type checking here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want type annotations on the functions for better type-safety. LSPs will use them too (docs/checks in code editors). The if TYPE_CHECKING allows us to import things we may not have a runtime. mypy and LSPs will use the imports but if model-registry is missing we don't get an import error at this point.

Since hub is an optional dependency, we check for model-registry upon usage of the client ModelRegistryClient() rather than upon import as discussed:
#186 (comment)

Otherwise in a new env, this would error:

pip install kubeflow
python -c 'import kubeflow.hub'

- http:// defaults to 8080
- no scheme defaults to 443
author: Name of the author.
is_secure: Whether to use a secure connection. If not provided, inferred from base_url:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question, why not check it from the URL directly ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#186 (comment)

is_secure is needed when you want to skip TLS verification, ie you have self-signed certs in a local development environment

We will default to is_secure=True for https and is_secure=False for http, but I believe we need to allow users to disable it too (is_secure=False with https) for development (shouldn't be done in production).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is only for development why not use http? Did you have use-cases when users want to verify https locally without secure connection?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed in community call today. @tarilabs noted that this is important for certain use-cases. Users may use this library for development and have a need to use unsigned certificates.

Fwiw, we do leave this out of our primary quick-start examples in the README

self._registry = ModelRegistry(
server_address=base_url,
port=port,
author=author, # type: ignore[arg-type]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since author is part of register_model API, does client still require it:

author: str | None = None,
?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This provides a default value so it doesn't have to be passed to register_model every time it's called.

name: Name of the model.
uri: URI of the model.

Keyword Args:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kramaranya Do you know if we can auto-generate Kubeflow SDK docs from such format of docstring: Keyword Args?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think it's configurable. I'll soon open a PR for the website and will verify it

name: str,
uri: str,
*,
version: str,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since version is mandatory parameter why you put it after "*" as keyword args?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It matches what we have in ModelRegistry and provides a little more flexibility if we want to re-order args after name, uri. I do wonder if we want to make version optional in the future and support an autogenerated version string if not provided.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, this is good idea. I would suggest to make it Optional.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed in the community meeting. I believe we agreed to leave it as is for now. Apparently it becomes complicated if we try to provide a default as users won't always agree on if it should be uuid or incrementing v1, v2, ...

TypeError: If model is not a RegisteredModel instance.
model_registry.exceptions.StoreError: If model does not have an ID.
"""
from model_registry.types import RegisteredModel
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move all imports to the top of this file?

Copy link
Member Author

@jonburdo jonburdo Jan 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would cause an immediate error upon import for import kubeflow.hub if the user has done:

pip install kubeflow

but not

pip install 'kubeflow[hub]'

We could opt for that. Or we could do a lazy import. I would suggest leaving it as-is though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But since we put it under TYPE_CHECKING that should be fine, isn't?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this function if not isinstance(model, RegisteredModel): needs to have RegisteredModel at runtime.

For example, if you use from __future__ import annotations or python>=3.10 then you can use:

if TYPE_CHECKING:
    ... import NonRuntimeVar

my_var: NonRuntimeVar = ...

mypy and ty will still analyze use of NonRuntimeVar here, but you couldn't actually do:

NonRuntimeVar()
# or
isinstance(something, NonRuntimeVar)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see, thanks for clarifying!

Copy link
Member Author

@jonburdo jonburdo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @andreyvelich for the review and all the thoughtful feedback! I believe I've addressed it. The one thing I'm not sure about is the question about docs: #186 (comment)

Makefile Outdated
@uv lock --check
@uv run ruff check --show-fixes --output-format=github .
@uv run ruff format --check kubeflow
@uv run mypy kubeflow/hub --exclude '_test\.py$$'
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are using ruff for linting, but it doesn't include type-checking, which we really want, so we also need mypy (e.g. if we're passing an int to str arg ruff won't catch that). The same org behind ruff is working on a type checker, ty, but it's relatively new and would still be a separate tool: https://github.com/astral-sh/ty

from collections.abc import Iterator, Mapping
from typing import TYPE_CHECKING

if TYPE_CHECKING:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want type annotations on the functions for better type-safety. LSPs will use them too (docs/checks in code editors). The if TYPE_CHECKING allows us to import things we may not have a runtime. mypy and LSPs will use the imports but if model-registry is missing we don't get an import error at this point.

Since hub is an optional dependency, we check for model-registry upon usage of the client ModelRegistryClient() rather than upon import as discussed:
#186 (comment)

Otherwise in a new env, this would error:

pip install kubeflow
python -c 'import kubeflow.hub'

def __init__(
self,
base_url: str,
port: int | None = None,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added from __future__ import annotations at the top in order to allow postponed evaluation of annotations. This allows things like:

  • | for union
  • missing-at-runtime variables like return type -> RegisteredModel instead of -> "RegisteredModel"

I've double checked this works on python 3.9 by ensuring from kubeflow.hub import ModelRegistryClient works as-is. Removing the from __future__ import annotations would result in:

TypeError: unsupported operand type(s) for |: 'type' and 'NoneType'

The scheme is used to infer is_secure and port if not explicitly provided.

Keyword Args:
port: Server port. If not provided, inferred from base_url scheme:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. Maybe we can do this as a follow-up?

I'm thinking we should keep the option to use the separate port arg, but if it's not used, we should also allow users to pass it via base_url, so these are all valid and equivalent:

ModelRegistryClient(base_url="https://example.com")
ModelRegistryClient(base_url="https://example.com", port=443)
ModelRegistryClient(base_url="https://example.com:443")

This can also be a change made in the base developer client (ModelRegistry from model-registry package) instead of here.

- http:// defaults to 8080
- no scheme defaults to 443
author: Name of the author.
is_secure: Whether to use a secure connection. If not provided, inferred from base_url:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#186 (comment)

is_secure is needed when you want to skip TLS verification, ie you have self-signed certs in a local development environment

We will default to is_secure=True for https and is_secure=False for http, but I believe we need to allow users to disable it too (is_secure=False with https) for development (shouldn't be done in production).

self._registry = ModelRegistry(
server_address=base_url,
port=port,
author=author, # type: ignore[arg-type]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This provides a default value so it doesn't have to be passed to register_model every time it's called.

name: str,
uri: str,
*,
version: str,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It matches what we have in ModelRegistry and provides a little more flexibility if we want to re-order args after name, uri. I do wonder if we want to make version optional in the future and support an autogenerated version string if not provided.

TypeError: If model is not a RegisteredModel instance.
model_registry.exceptions.StoreError: If model does not have an ID.
"""
from model_registry.types import RegisteredModel
Copy link
Member Author

@jonburdo jonburdo Jan 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would cause an immediate error upon import for import kubeflow.hub if the user has done:

pip install kubeflow

but not

pip install 'kubeflow[hub]'

We could opt for that. Or we could do a lazy import. I would suggest leaving it as-is though.

"""Skip these tests if model-registry not installed."""
pytest.importorskip("model_registry")

def test_init_raises_helpful_error_when_not_installed(self, monkeypatch):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I've made this change. This does we are importing TestCase from kubeflow.trainer. I also have mixed feelings about this, but if it's something we want to standardize in the sdk, then I guess it works.

Copy link
Contributor

@kramaranya kramaranya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, @jonburdo!
Overall looks great to me 👍

ci
docs
examples
hub
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, thanks @jonburdo!

Makefile Outdated
@uv lock --check
@uv run ruff check --show-fixes --output-format=github .
@uv run ruff format --check kubeflow
@uv run mypy kubeflow/hub --exclude '_test\.py$$'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jonburdo have you tried to compare ty and mypy? does it cover everything you're using mypy for?

@@ -0,0 +1,5 @@
"""Kubeflow Hub API."""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add copyright header please to all new files?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! Added

The scheme is used to infer is_secure and port if not explicitly provided.

Keyword Args:
port: Server port. If not provided, inferred from base_url scheme:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @andreyvelich, this will keep our APIs consistent for users

name: Name of the model.
uri: URI of the model.

Keyword Args:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think it's configurable. I'll soon open a PR for the website and will verify it

Comment on lines +245 to +251
def list_models(self) -> Iterator[RegisteredModel]:
"""Get an iterator for registered models.

Yields:
Registered models.
"""
yield from self._registry.get_registered_models()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We return an Iterator in Model Registry with list_models, however, we return lists for both OptimizerClient and TrainerClient in APIs like list_jobs().
@andreyvelich @jonburdo to provide consistent APIs to our users, shall we consider aligning OptimizerClient and TrainerClient with ModelRegistryClient and return an Iterator, alternatively we can return list in ModelRegistryClient?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like returning the list might be a good option, with future improvements to return Iterator if users will request it.

@tarilabs @jonburdo Do you see a lot of use-cases when this doesn't work?
I don't expect that users manage more than millions of models in their registries.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed in the community call today. We definitely want to keep this as a generator.

We use pagination, so eagerly fetching and returning a list would make extra API calls, and it is certainly possible to have enough entries for significant overhead.

I'm not sure I'd consider it real inconsistency if some functions in a library return a list and some return a generator - especially if it's done in a logical way based on amount of data or usefulness of evaluating lazily.

@jonburdo
Copy link
Member Author

Okay, I've rebased and updated the code. I think I covered our discussion points!

Here is a succinct summary covering most of the points:

I changed:

  • replace mypy with ty (with python 3.10 as target version)
  • add missing copyright headers

I kept:

  • base_url (instead of url)
  • version arg (required kwarg)
  • docstrings as-is
  • return generator instead of list
  • is_secure bool

Copy link
Member

@andreyvelich andreyvelich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates @jonburdo!
I left a few comments, we should be good to go!
/assign @kramaranya @astefanutti @tarilabs @szaher in case you want to add more comments.


Args:
base_url: Base URL of the model registry server including scheme.
Examples: "https://registry.example.com", "http://localhost"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please can you add example with port as well? e.g. https://registry.example.com:8888

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want to fix this in the developer (model-registry) client. I've created an issue for this: kubeflow/model-registry#2110

Otherwise we'll have url parsing here and then again there (in ModelRegistry)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good @jonburdo!
Please can you create an issue in kubeflow/sdk to also track this update once Model Registry client supports it upstream?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added here: #227

"test_case",
[
TestCase(
name="http URL infers port=8080 and is_secure=False",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add a test case when port is taken from the base_url?

"base_url": "http://localhost:1234",

),
],
)
def test_get_model(test_case, client, mock_registry):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add tests for test_get_model_version and test_get_model_artifact ?

),
],
)
def test_update_model_type_check(test_case, client):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add all of these test cases to the test_update_model ? Similar to other tests, we can have one function with test cases for each public API we expose.

),
],
)
def test_update_model_version_type_check(test_case, client):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same suggestion to move them to the test_update_model_version

),
],
)
def test_update_model_artifact_type_check(test_case, client):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

```

### Manage models with Model Registry

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we also point users to the Model Registry documentation which says how to install control plane: https://www.kubeflow.org/docs/components/model-registry/installation/#standalone-installation

In the future, we might also want to introduce public ConfigMap check that verifies that Kubeflow Hub control plane is available on the cluster: #221
We can do this after the initial release.

@jonburdo jonburdo force-pushed the kubeflow-hub-api branch 2 times, most recently from 9179042 to 4816b1d Compare January 15, 2026 21:45
Copy link
Member

@andreyvelich andreyvelich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates @jonburdo!
/lgtm
/cc @kramaranya in case you want to add more comments.
As we discussed, we will merge this to the main branch once we release Kubeflow SDK 0.3
/hold

Add an api for model registry under a new kubeflow.hub submodule.

This takes the form of a ModelRegistryClient class and provides a
user-facing api in a style similar to existing Kubeflow SDKs.

AI-assisted: Claude Code 2.0.58 (claude-sonnet-4-5@20250929)
Signed-off-by: Jon Burdo <jon@jonburdo.com>
@jonburdo
Copy link
Member Author

I've rebased to fix the merge conflicts. To see the conflict resolution changes only, you can use:

git range-diff 4816b1d...db12a29

It's just a slight change to pyproject.toml and changes to uv.lock

With the release of Kubeflow SDK 0.3 are we ready to merge this @andreyvelich ?

Copy link
Member

@andreyvelich andreyvelich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this @jonburdo!
/lgtm
/assign @kubeflow/kubeflow-sdk-team @kramaranya

@google-oss-prow google-oss-prow bot added the lgtm label Jan 20, 2026
Copy link
Contributor

@kramaranya kramaranya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much for this amazing contribution, @jonburdo!
/lgtm

@kramaranya
Copy link
Contributor

@jonburdo feel free to add OWNERS file to https://github.com/jonburdo/sdk/tree/kubeflow-hub-api/kubeflow/hub with Model Registry WG.
cc @andreyvelich

@andreyvelich
Copy link
Member

/approve
@kubeflow/kubeflow-hub-team Feel free to create followup PR to add OWNERs to the kubeflow.hub sub-package.

@google-oss-prow
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andreyvelich

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@andreyvelich
Copy link
Member

/milestone v0.4

@google-oss-prow google-oss-prow bot added this to the v0.4 milestone Jan 21, 2026
@jonburdo
Copy link
Member Author

/unhold

@google-oss-prow google-oss-prow bot merged commit 716448f into kubeflow:main Jan 21, 2026
14 checks passed
@jonburdo jonburdo deleted the kubeflow-hub-api branch January 21, 2026 16:10
@andreyvelich andreyvelich mentioned this pull request Feb 18, 2026
6 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add Model Registry Client to Kubeflow SDK

7 participants