This is a Trino plugin that provides a connector to read from and write to HTTP APIs given an OpenAPI specification.
To run a Docker container with the connector, run the following:
docker run \
-d \
--name trino-openapi \
-e OPENAPI_SPEC_LOCATION=http://api.example/api/spec.yaml \
-e OPENAPI_BASE_URI=http://api.example/api/v1 \
-p 8080:8080 \
nineinchnick/trino-openapi:1.66Then use your favourite SQL client to connect to Trino running at http://localhost:8080
Download one of the ZIP packages, unzip it and copy the trino-openapi-1.66
directory to the plugin directory on every node in your Trino cluster. Create a
openapi.properties file in your Trino catalog directory and set all the
required properties.
connector.name=openapi
spec-location=http://api.example/api/spec.yaml
base-uri=http://api.example/api/v1
authentication.type=http
authentication.username=${ENV:OPENAPI_USERNAME}
authentication.password=${ENV:OPENAPI_PASSWORD}
After reloading Trino, you should be able to connect to the openapi catalog.
| Configuration property | Container environmental variable | Description |
|---|---|---|
| spec-location | OPENAPI_SPEC_LOCATION |
URL or filename containing the OpenAPI specification, either JSON or YAML |
| base-uri | OPENAPI_BASE_URI |
Base URL for the API, often includes API version |
| authentication.type | OPENAPI_AUTH_TYPE |
Default authentication type if not set in the specification. One of: none, http, api_key, oauth. |
| authentication.scheme | OPENAPI_AUTH_SCHEME |
Authentication scheme for the http authentication type. One of: basic, bearer. |
| authentication.token-endpoint | OPENAPI_TOKEN_ENDPOINT |
OAuth token endpoint URL |
| authentication.client-id | OPENAPI_CLIENT_ID |
OAuth Client ID |
| authentication.client-secret | OPENAPI_CLIENT_SECRET |
OAuth Client secret |
| authentication.grant-type | OPENAPI_GRANT_TYPE |
OAuth grant type |
| authentication.username | OPENAPI_USERNAME |
Username used for the http and oauth authentication types |
| authentication.password | OPENAPI_PASSWORD |
Password used for the http and oauth authentication types |
| authentication.bearer-token | OPENAPI_BEARER_TOKEN |
Bearer token for http authentication |
| authentication.api-key-name | OPENAPI_API_KEY_NAME |
API key name |
| authentication.api-key-value | OPENAPI_API_KEY_VALUE |
API key value |
| max-requests-per-second | OPENAPI_MAX_REQUESTS_PER_SECOND |
Maximum HTTP requests per second executed from a single Trino node |
| max-splits-per-second | OPENAPI_MAX_SPLITS_PER_SECOND |
Maximum number of splits per second generated when executing a query |
| domain-expansion-limit | OPENAPI_DOMAIN_EXPANSION_LIMIT |
Maximum number of discrete domain values generated from range predicates, which correlates with the number of HTTP requests executed |
The connector is using the Airlift HTTP client, which can be configured with
additional
properties
prefixed with openApi, like so:
openApi.http-client.log.enabled=true
openApi.http-client.log.path=logs
This connector has been tested with the following APIs. The Docker image includes additional catalog files that will be used if the first of their environmental variable is set.
| Name | Environmental variables |
|---|---|
| OpenAPI Petstore | PETSTORE_URL |
| Starburst Galaxy | GALAXY_URL, GALAXY_TOKEN |
| Jira Cloud platform | JIRA_URL |
| GitHub REST API | GITHUB_TOKEN |
| DataDog API | DATADOG_URL |
The following rules are used to map the OpenAPI schema to SQL tables and columns:
- Paths are mapped to tables; table names are paths, with
camelCaseconverted tosnake_caseand special characters (like/) converted to underscores. - SQL operations are mapped to HTTP methods:
- SELECT uses GET or POST, if no GET is available;
- INSERT uses POST or PUT;
- UPDATE uses PATCH or POST;
- DELETE uses DELETE.
- All request parameters are mapped to columns, including path, query, and header parameters.
- Fields of the HTTP OK (200) response type are mapped to columns.
- Fields of the request body are also mapped to columns, if a POST request is used for SELECT.
- All columns are disambiguated - if a field with the same name but a different
data type appears in the response and parameters, it'll be mapped to multiple
columns, with numeric suffixes (
_2,_3, etc.). Request body fields have a_reqsuffix.
OpenAPI allows using custom extensions - adding custom fields anywhere in the
schema as long as they're prefixed with an X. Such custom extensions are used
to fine-tune this connector.
If the original service cannot be modified to include an extension in a generated OpenAPI schema, save it locally and modify as needed.
APIs can use 4 different types of pagination:
- Offset - every response can include an offset parameter, telling how many results to skip.
- Page - every response can include a page parameter; the number of results per page can be configurable with another parameter.
- Cursor/token - the response includes the value of a cursor or token, that needs to be included in the next request.
- Next page URL - the response includes an URL of the next set of results.
The connector currently supports only the Page pagination.
To enable pagination, add a x-pagination section in the path's operation section:
paths:
/records:
get:
responses:
# ...
x-pagination:
pageParam: "page"
limitParam: "per-page"
resultsPath: "$response.body#/workflows"
totalResultsPath: "$response.body#/total_count"
Run all the unit tests:
mvn testCreates a deployable zip file:
mvn clean packageUnzip the archive from the target directory to use the connector in your Trino cluster.
unzip target/*.zip -d ${PLUGIN_DIRECTORY}/
mv ${PLUGIN_DIRECTORY}/trino-openapi-* ${PLUGIN_DIRECTORY}/trino-openapiTo test and debug the connector locally, run the OpenApiQueryRunner class
located in tests:
mvn test-compile exec:java -Dexec.mainClass="pl.net.was.OpenApiQueryRunner" -Dexec.classpathScope=testAnd then run the Trino CLI using trino --server localhost:8080 --no-progress
and query it:
trino> show catalogs;
Catalog
---------
openapi
system
(2 rows)
trino> show tables from openapi.default;
Table
------------
single_row
(1 row)
trino> select * from openapi.default.single_row;
id | type | name
----+---------------+---------
x | default-value | my-name
(1 row)