Add community extension: Catalogs Endpoint #7
emmanuelmathot merged 4 commits into stac-api-extensions:main from
@m-mohr Thanks for the rigorous debate today - mostly here.

Status Update:

* **Renaming:** I have renamed the extension title to "Catalogs Endpoint" to address your concern about naming ambiguity and to be purely descriptive of the route being added.
* **Implementation:** We are proceeding with this implementation in stac-fastapi-elasticsearch-opensearch. It solves issues for some of our users regarding aggregations, strict typing, and multi-tenant routing that the current specs do not address. It is an optional feature that is turned off by default.
* **Future Work:** We appreciate the invite to collaborate on the Children extension and will look into contributing to that spec separately to improve client traversal.

Since our PR is simply to add an entry to the Community Extensions (external) list to help users find existing tools, I will leave it open for you to merge at your discretion.
It's not me who decides this on my own, it's the STAC PSC. We'll get back to this.

Edit: Related proposals can be found here: stac-api-extensions/children#6 and radiantearth/stac-spec#1374
**Related Issue(s):**

- #520
- #308
- radiantearth/stac-api-spec#239
- radiantearth/stac-api-spec#329
- stac-api-extensions/stac-api-extensions.github.io#7
- https://github.com/Healy-Hyperspatial/stac-api-extensions-catalogs

#### Description

This PR introduces the **Catalogs Extension**, enabling a federated "Hub and Spoke" architecture within `stac-fastapi`.

Currently, the API assumes a single Root Catalog containing a flat list of Collections. This works for simple deployments but becomes unwieldy for large-scale implementations aggregating multiple providers, missions, or projects. This change adds a `/catalogs` endpoint that acts as a **Registry**, allowing the API to serve multiple distinct sub-catalogs from a single infrastructure.

#### Key Features

* **New Endpoints:** Implements the full suite of hierarchical endpoints:
  * `GET /catalogs` (List all sub-catalogs)
  * `POST /catalogs` (Create new sub-catalog)
  * `DELETE /catalogs/{catalog_id}` (Delete a catalog; supports `?cascade=true` to delete child collections)
  * `GET /catalogs/{catalog_id}` (Sub-catalog Landing Page)
  * `GET /catalogs/{catalog_id}/collections` (Scoped collections)
  * `POST /catalogs/{catalog_id}/collections` (Create a new collection directly linked to a specific catalog)
  * `GET /catalogs/{catalog_id}/collections/{collection_id}` (Get one collection)
  * `GET /catalogs/{catalog_id}/collections/{collection_id}/items` (Scoped item search)
  * `GET /catalogs/{catalog_id}/collections/{collection_id}/items/{item_id}` (Get one item)
* **Serialization:** Updates Pydantic models and serializers to support `type: "Catalog"` objects within the API tree (previously restricted to Collections).
* **Configuration:** Controlled via `ENABLE_CATALOGS_ROUTE` environment variable (default: `false`).

#### Storage Strategy (Non-Breaking)

To ensure **zero breaking changes** and avoid complex database migrations, this implementation stores `Catalog` objects within the existing `collections` index.

* **Differentiation:** Objects are distinguished using the `type` field (`type: "Catalog"` vs. `type: "Collection"`).
* **Backward Compatibility:** Existing queries for Collections remain unaffected as they continue to function on the same index structure.
* **No Overhead:** No new Elasticsearch/OpenSearch indices or infrastructure changes are required to enable this feature.

#### Architectural Alignment

This implementation follows the proposed **[STAC API Catalogs Endpoint Extension](https://github.com/Healy-Hyperspatial/stac-api-extensions-catalogs-endpoint)** (Community Extension). It addresses the "Data Silo" problem by allowing organizations to host distinct catalogs on a single API instance, rather than deploying separate containers for every project or provider.

#### Changes

* `stac_fastapi/core/extensions/catalogs.py`: Added the main extension logic and router.
* `stac_fastapi/core/models/`: Added `Catalog` Pydantic models.
* `stac_fastapi/elasticsearch/database_logic.py`: Added CRUD logic filtering by `type: "Catalog"`.
* `tests/`: Added comprehensive test suite (`test_catalogs.py`) covering CRUD operations and hierarchical navigation.

**PR Checklist:**

- [x] Code is formatted and linted (run `pre-commit run --all-files`)
- [x] Tests pass (run `make test`)
- [x] Documentation has been updated to reflect changes, if applicable
- [x] Changes are added to the changelog
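
The storage strategy and the feature flag described in that PR can be illustrated with a small sketch. This is not the code from `stac-fastapi-elasticsearch-opensearch`: the in-memory list `SHARED_INDEX` stands in for the shared `collections` index, and the helper names `find_by_type` and `catalogs_route_enabled` are hypothetical. How the real project parses `ENABLE_CATALOGS_ROUTE` is also an assumption here; the sketch only treats the string `true` (case-insensitive) as enabled.

```python
import os

# Hypothetical in-memory stand-in for the shared `collections` index:
# Catalog and Collection documents live side by side, told apart by `type`.
SHARED_INDEX = [
    {"id": "usgs", "type": "Catalog", "title": "USGS"},
    {"id": "landsat-c2", "type": "Collection", "title": "Landsat Collection 2"},
    {"id": "forestry", "type": "Catalog", "title": "Forestry"},
]


def find_by_type(index, stac_type):
    """Return the documents whose `type` field equals `stac_type`."""
    return [doc for doc in index if doc.get("type") == stac_type]


def catalogs_route_enabled(environ=None):
    """Report whether the /catalogs route should be registered.

    Assumption: only the value "true" (any casing) enables the route;
    an unset variable falls back to the documented default of "false".
    """
    environ = os.environ if environ is None else environ
    return environ.get("ENABLE_CATALOGS_ROUTE", "false").strip().lower() == "true"


catalogs = find_by_type(SHARED_INDEX, "Catalog")
collections = find_by_type(SHARED_INDEX, "Collection")
```

Because both object kinds share one index, a query for Collections only needs the extra `type` filter and no new index is created, which is the non-breaking property the PR description claims.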
Some related discussion here fyi: radiantearth/stac-spec#1374
Thanks @jonhealy1 for the extension proposal and for addressing the naming concerns raised earlier.

The STAC PSC has reviewed this and we're prepared to add it to the community extensions list with one modification: we'd like to request renaming it to "Multi-Tenant Catalogs Endpoint" to better clarify its specific functionality.

The current name "Catalogs Endpoint" is quite generic and doesn't clearly communicate the extension's primary use case. Since the core functionality centers around multi-tenancy - enabling a single STAC API instance to serve multiple logical catalogs with shared collections and potentially separate authorization schemes - a more descriptive name would help users understand when this extension is applicable to their needs.

Would you be able to make this naming change? Once updated, we can proceed with merging the PR.

Thanks for your contribution.
Hi Emmanuel,

Thank you so much for the review and the green light! I really appreciate the PSC’s time on this.

I am happy to rename it to be more descriptive, but I have a slight concern that "Multi-Tenant" might be too restrictive (and potentially misleading) given the extension's capabilities.

While the extension can support multi-tenancy, a core feature is Poly-hierarchy - allowing a single Collection to be discoverable in multiple Catalogs simultaneously (e.g., a "Landsat" collection appearing in both a Provider/USGS catalog and a Theme/Forestry catalog).

In many architectural contexts, "Multi-Tenant" implies strict data isolation (Silos), whereas this extension is designed for Virtual Organization (flexible logical views over shared data). I worry that naming it "Multi-Tenant" might discourage users who simply want to organize their archive into hierarchical folders/themes without implied security boundaries.

Would the PSC be open to "Virtual Catalogs Endpoint"? I feel this captures the "logical" nature of the grouping (supporting both multi-tenancy and thematic organization) without pigeonholing it into a specific SaaS infrastructure pattern.

Let me know if that works for the group!
@emmanuelmathot Hi. The name
Love it. I was about to respond that multi-tenant does not mean isolated and that a lot of multi-tenant systems have shared resources. Thx.
Adds a new community extension for the Catalogs Endpoint.
Repository: https://github.com/Healy-Hyperspatial/stac-api-extensions-catalogs-endpoint
This extension introduces a `/catalogs` endpoint to enable a Virtual Organizational architecture. It serves as a registry for logical sub-catalogs, allowing users to organize data into flexible hierarchies (e.g., by theme, project, or SKOS concepts) without duplicating the underlying data.

Key Capabilities:

* **Virtual Organization:** Create logical groupings ("playlists") of collections based on semantics or themes.
* **Poly-hierarchy:** Supports collections belonging to multiple catalogs simultaneously.
* **Safety-First Transactions:** Defines transactional endpoints to create, update, and delete catalog structures. Crucially, these operations manage links, not data. Deleting a catalog via this endpoint strictly unlinks the contained collections, ensuring that the actual data (Collections/Items) is never accidentally destroyed.
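
As a rough illustration of the poly-hierarchy and unlink-on-delete behavior listed above, here is a minimal in-memory sketch. The `CatalogRegistry` class and its method names are hypothetical and are not part of the extension or of any implementation; the sketch only assumes that catalogs hold references to collection IDs rather than copies of the collections.

```python
from collections import defaultdict


class CatalogRegistry:
    """Hypothetical link registry: catalogs reference collections by ID."""

    def __init__(self):
        self.collections = {}           # collection_id -> collection document
        self.links = defaultdict(set)   # catalog_id -> set of collection IDs

    def add_collection(self, collection_id, doc):
        """Store the collection document once, independent of any catalog."""
        self.collections[collection_id] = doc

    def link(self, catalog_id, collection_id):
        """Make an existing collection discoverable under a catalog."""
        if collection_id not in self.collections:
            raise KeyError(collection_id)
        self.links[catalog_id].add(collection_id)

    def catalogs_for(self, collection_id):
        """List every catalog that currently links to the collection."""
        return sorted(c for c, ids in self.links.items() if collection_id in ids)

    def delete_catalog(self, catalog_id):
        """Drop the catalog's links only; collection documents are untouched."""
        self.links.pop(catalog_id, None)


# One "landsat" collection linked into a provider catalog and a theme catalog.
registry = CatalogRegistry()
registry.add_collection("landsat", {"type": "Collection", "id": "landsat"})
registry.link("usgs", "landsat")
registry.link("forestry", "landsat")

# Deleting the provider catalog removes one view; the collection survives.
registry.delete_catalog("usgs")
```

After the delete, `registry.collections` still contains `landsat` and the collection remains reachable through the `forestry` catalog, which is the "links, not data" guarantee in miniature.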
We are currently looking at implementing this specification in the `stac-fastapi-elasticsearch-opensearch` project and are listing it here to allow for open-source collaboration.

Context / Related Issues:
This extension addresses the need for multi-catalog aggregation discussed in the following issues: