Commit 6ebdfbc

Merge branch 'TobikoData:main' into feat.support_sr

2 parents: d79dc21 + 1274484

33 files changed: +259 −121 lines

.circleci/test_migration.sh

Lines changed: 4 additions & 3 deletions

```diff
@@ -24,13 +24,14 @@ TEST_DIR="$TMP_DIR/$EXAMPLE_NAME"
 
 echo "Running migration test for '$EXAMPLE_NAME' in '$TEST_DIR' for example project '$EXAMPLE_DIR' using options '$SQLMESH_OPTS'"
 
+# Copy the example project from the *current* checkout so it's stable across old/new SQLMesh versions
+cp -r "$EXAMPLE_DIR" "$TEST_DIR"
+
 git checkout $LAST_TAG
 
 # Install dependencies from the previous release.
 make install-dev
 
-cp -r $EXAMPLE_DIR $TEST_DIR
-
 # this is only needed temporarily until the released tag for $LAST_TAG includes this config
 if [ "$EXAMPLE_NAME" == "sushi_dbt" ]; then
   echo 'migration_test_config = sqlmesh_config(Path(__file__).parent, dbt_target_name="duckdb")' >> $TEST_DIR/config.py
@@ -53,4 +54,4 @@ make install-dev
 pushd $TEST_DIR
 sqlmesh $SQLMESH_OPTS migrate
 sqlmesh $SQLMESH_OPTS diff prod
-popd
+popd
```

docs/integrations/engines/bigquery.md

Lines changed: 17 additions & 0 deletions

````diff
@@ -193,6 +193,23 @@ If the `impersonated_service_account` argument is set, SQLMesh will:
 
 The user account must have [sufficient permissions to impersonate the service account](https://cloud.google.com/docs/authentication/use-service-account-impersonation).
 
+## Query Label
+
+BigQuery supports a `query_label` session variable which is attached to query jobs and can be used for auditing / attribution.
+
+SQLMesh supports setting it via `session_properties.query_label` on a model, as an array (or tuple) of key/value tuples.
+
+Example:
+```sql
+MODEL (
+  name my_project.my_dataset.my_model,
+  dialect 'bigquery',
+  session_properties (
+    query_label = [('team', 'data_platform'), ('env', 'prod')]
+  )
+);
+```
+
 ## Permissions Required
 With any of the above connection methods, ensure these BigQuery permissions are enabled to allow SQLMesh to work correctly.
````
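BigQuery's `@@query_label` session variable expects the labels as a single comma-separated `key:value` string, so the key/value tuples shown above presumably get serialized into that form before the job runs. A minimal sketch of that serialization; the helper name is ours, not SQLMesh's:

```python
def serialize_query_label(pairs):
    """Render [('team', 'data_platform'), ('env', 'prod')] into BigQuery's
    comma-separated key:value format for the @@query_label session variable."""
    return ",".join(f"{key}:{value}" for key, value in pairs)


# The kind of statement a client could issue before running the model's query:
label = serialize_query_label([("team", "data_platform"), ("env", "prod")])
statement = f'SET @@query_label = "{label}";'
```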

docs/integrations/engines/trino.md

Lines changed: 1 addition & 0 deletions

```diff
@@ -90,6 +90,7 @@ hive.metastore.glue.default-warehouse-dir=s3://my-bucket/
 | `http_scheme` | The HTTP scheme to use when connecting to your cluster. By default, it's `https` and can only be `http` for no-auth or basic auth. | string | N |
 | `port` | The port to connect to your cluster. By default, it's `443` for `https` scheme and `80` for `http` | int | N |
 | `roles` | Mapping of catalog name to a role | dict | N |
+| `source` | Value to send as Trino's `source` field for query attribution / auditing. Default: `sqlmesh`. | string | N |
 | `http_headers` | Additional HTTP headers to send with each request. | dict | N |
 | `session_properties` | Trino session properties. Run `SHOW SESSION` to see all options. | dict | N |
 | `retries` | Number of retries to attempt when a request fails. Default: `3` | int | N |
```
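The `connection.py` change later in this commit threads the configured value into the Trino connection kwargs, falling back to the shipped default of `"sqlmesh"`. A simplified sketch of that pattern; the function name and parameter list are ours, not SQLMesh's actual API:

```python
DEFAULT_SOURCE = "sqlmesh"


def trino_connection_kwargs(user, retries=3, source=None):
    """Assemble Trino DB-API connection kwargs; `source` tags each query
    for attribution in the Trino UI and system.runtime.queries."""
    return {
        "user": user,
        "max_attempts": retries,
        "source": source or DEFAULT_SOURCE,
    }


# A CI deployment could override the default to distinguish its queries:
kwargs = trino_connection_kwargs("analyst", source="sqlmesh-ci")
```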

docs/integrations/github.md

Lines changed: 16 additions & 15 deletions

### Configuration Properties

The existing rows of the options table were re-aligned to accommodate a new option; the only substantive change is the added `check_if_blocked_on_deploy_to_prod` row:

```diff
 | Option | Description | Type | Required |
 |---------------------------------------|-------------|:------:|:--------:|
 | `invalidate_environment_after_deploy` | Indicates if the PR environment created should be automatically invalidated after changes are deployed. Invalidated environments are cleaned up automatically by the Janitor. Default: `True` | bool | N |
 | `merge_method` | The merge method to use when automatically merging a PR after deploying to prod. Defaults to `None` meaning automatic merge is not done. Options: `merge`, `squash`, `rebase` | string | N |
 | `enable_deploy_command` | Indicates if the `/deploy` command should be enabled in order to allow synchronized deploys to production. Default: `False` | bool | N |
 | `command_namespace` | The namespace to use for SQLMesh commands. For example if you provide `#SQLMesh` as a value then commands will be expected in the format of `#SQLMesh/<command>`. Default: `None` meaning no namespace is used. | string | N |
 | `auto_categorize_changes` | Auto categorization behavior to use for the bot. If not provided then the project-wide categorization behavior is used. See [Auto-categorize model changes](https://sqlmesh.readthedocs.io/en/stable/guides/configuration/#auto-categorize-model-changes) for details. | dict | N |
 | `default_pr_start` | Default start when creating PR environment plans. If running in a mode where the bot automatically backfills models (based on `auto_categorize_changes` behavior) then this can be used to limit the amount of data backfilled. Defaults to `None` meaning the start date is set to the earliest model's start or to 1 day ago if [data previews](../concepts/plans.md#data-preview) need to be computed. | str | N |
 | `pr_min_intervals` | Intended for use when `default_pr_start` is set to a relative time, eg `1 week ago`. This ensures that at least this many intervals across every model are included for backfill in the PR environment. Without this, models with an interval unit wider than `default_pr_start` (such as `@monthly` models if `default_pr_start` was set to `1 week ago`) will be excluded from backfill entirely. | int | N |
 | `skip_pr_backfill` | Indicates if the bot should skip backfilling models in the PR environment. Default: `True` | bool | N |
 | `pr_include_unmodified` | Indicates whether to include unmodified models in the PR environment. Defaults to the project's config value (which defaults to `False`) | bool | N |
 | `run_on_deploy_to_prod` | Indicates whether to run latest intervals when deploying to prod. If set to false, the deployment will backfill only the changed models up to the existing latest interval in production, ignoring any missing intervals beyond this point. Default: `False` | bool | N |
 | `pr_environment_name` | The name of the PR environment to create, to which a PR number will be appended. Defaults to the repo name if not provided. Note: The name will be normalized to alphanumeric + underscore and lowercase. | str | N |
 | `prod_branch_name` | The name of the git branch associated with production. Ex: `prod`. Default: `main` or `master` is considered prod | str | N |
 | `forward_only_branch_suffix` | If the git branch has this suffix, trigger a [forward-only](../concepts/plans.md#forward-only-plans) plan instead of a normal plan. Default: `-forward-only` | str | N |
+| `check_if_blocked_on_deploy_to_prod` | The bot normally checks if a PR is blocked from merging before deploying to production. Setting this to `False` will skip that check. Default: `True` | bool | N |
 
 Example with all properties defined:
```

examples/sushi/models/customers.sql

Lines changed: 1 addition & 1 deletion

```diff
@@ -42,4 +42,4 @@ LEFT JOIN (
   ON o.customer_id = m.customer_id
 LEFT JOIN raw.demographics AS d
   ON o.customer_id = d.customer_id
-WHERE sushi.orders.customer_id > 0
+WHERE o.customer_id > 0
```

pyproject.toml

Lines changed: 2 additions & 2 deletions

```diff
@@ -18,13 +18,13 @@ dependencies = [
     "ipywidgets",
     "jinja2",
     "packaging",
-    "pandas",
+    "pandas<3.0.0",
     "pydantic>=2.0.0",
     "python-dotenv",
     "requests",
     "rich[jupyter]",
     "ruamel.yaml",
-    "sqlglot[rs]~=27.28.0",
+    "sqlglot[rs]~=28.10.0",
     "tenacity",
     "time-machine",
     "json-stream"
```

sqlmesh/core/config/connection.py

Lines changed: 6 additions & 4 deletions

```diff
@@ -1889,6 +1889,7 @@ class TrinoConnectionConfig(ConnectionConfig):
     client_certificate: t.Optional[str] = None
     client_private_key: t.Optional[str] = None
     cert: t.Optional[str] = None
+    source: str = "sqlmesh"
 
     # SQLMesh options
     schema_location_mapping: t.Optional[dict[re.Pattern, str]] = None
@@ -1985,6 +1986,7 @@ def _connection_kwargs_keys(self) -> t.Set[str]:
             "port",
             "catalog",
             "roles",
+            "source",
             "http_scheme",
             "http_headers",
             "session_properties",
@@ -2042,7 +2044,7 @@ def _static_connection_kwargs(self) -> t.Dict[str, t.Any]:
             "user": self.impersonation_user or self.user,
             "max_attempts": self.retries,
             "verify": self.cert if self.cert is not None else self.verify,
-            "source": "sqlmesh",
+            "source": self.source,
         }
 
     @property
@@ -2407,7 +2409,7 @@ def _connection_factory(self) -> t.Callable:
         for tpe in subclasses(
             __name__,
             ConnectionConfig,
-            exclude=(ConnectionConfig, BaseDuckDBConnectionConfig),
+            exclude={ConnectionConfig, BaseDuckDBConnectionConfig},
         )
     }
@@ -2416,7 +2418,7 @@ def _connection_factory(self) -> t.Callable:
         for tpe in subclasses(
             __name__,
             ConnectionConfig,
-            exclude=(ConnectionConfig, BaseDuckDBConnectionConfig),
+            exclude={ConnectionConfig, BaseDuckDBConnectionConfig},
         )
     }
@@ -2428,7 +2430,7 @@ def _connection_factory(self) -> t.Callable:
         for tpe in subclasses(
             __name__,
             ConnectionConfig,
-            exclude=(ConnectionConfig, BaseDuckDBConnectionConfig),
+            exclude={ConnectionConfig, BaseDuckDBConnectionConfig},
         )
     }
```
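The `exclude=(...)` to `exclude={...}` change swaps a tuple for a set literal, the idiomatic container when the argument is used for membership tests, since `cls in exclude` is O(1) for a set versus O(n) for a tuple. A hedged sketch of the pattern; this helper and the stub classes are illustrative, not SQLMesh's actual `subclasses` implementation:

```python
def subclasses_of(base, exclude=frozenset()):
    """Recursively collect subclasses of `base`, skipping any class in
    `exclude`. A set makes each `cls not in exclude` check constant-time."""
    found = []
    for cls in base.__subclasses__():
        if cls not in exclude:
            found.append(cls)
        found.extend(subclasses_of(cls, exclude))  # still recurse into excluded bases
    return found


# Stub hierarchy mirroring the names in the diff above:
class ConnectionConfig: ...
class BaseDuckDBConnectionConfig(ConnectionConfig): ...
class DuckDBConnectionConfig(BaseDuckDBConnectionConfig): ...
class TrinoConnectionConfig(ConnectionConfig): ...

concrete = subclasses_of(ConnectionConfig, exclude={BaseDuckDBConnectionConfig})
```

Note that the excluded abstract base is skipped but its own subclasses are still visited, so concrete engine configs beneath it remain discoverable.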
