[Experimental] SQL grammar definitions #516

Draft
rkistner wants to merge 14 commits into main from sql-grammar

@rkistner rkistner commented Feb 19, 2026

This adds a grammar for our supported SQL syntax, written in W3C EBNF notation. The idea is to have a formal reference for the SQL syntax we support.

The goal is to eventually have public documentation on the exact syntax we support, similar to the SQLite documentation. The grammar in the current format isn't quite ready to use as-is for that, but it's a step in that direction.

These grammar definitions are not used anywhere in the actual code - this is just for documentation and validation.

The grammar definitions are split between our three different parsers:

  1. Bucket definitions
  2. Sync streams alpha
  3. Sync streams new compiler

To confirm that our implementation matches the grammar, this adds tests that run queries through both our parser and a parser generated purely from the grammar. These tests do not check the behavior or output of the queries at all; they only check whether each query passes the parsing stage.
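
As a rough sketch of that kind of check (not the test code in this PR), the comparison could look like the following. The `Parser` type, `checkParity`, and the two toy parsers are assumptions made up for illustration; the only idea taken from the description above is that a query either passes or fails each parser.

```typescript
// Hypothetical signature: a parser returns normally on valid input and throws otherwise.
type Parser = (sql: string) => void;

// True if the parser accepts the query, false if it throws.
function accepts(parser: Parser, sql: string): boolean {
  try {
    parser(sql);
    return true;
  } catch {
    return false;
  }
}

interface ParityResult {
  sql: string;
  implementation: boolean;
  grammar: boolean;
  agree: boolean;
}

// Run each query through both parsers and record whether they agree.
// Only pass/fail is compared; parser output is ignored entirely.
function checkParity(implementationParser: Parser, grammarParser: Parser, queries: string[]): ParityResult[] {
  return queries.map((sql) => {
    const implementation = accepts(implementationParser, sql);
    const grammar = accepts(grammarParser, sql);
    return { sql, implementation, grammar, agree: implementation === grammar };
  });
}

// Toy stand-ins so the sketch runs on its own; neither is a real SQL parser.
const toyImplementation: Parser = (sql) => {
  if (!/^select\s/i.test(sql)) throw new Error('expected SELECT');
};
const toyGrammar: Parser = (sql) => {
  if (!/^select\s.+\sfrom\s/i.test(sql)) throw new Error('expected SELECT ... FROM');
};

const parity = checkParity(toyImplementation, toyGrammar, ['SELECT * FROM todos', 'SELECT 1', 'DROP TABLE todos']);
const mismatches = parity.filter((r) => !r.agree).map((r) => r.sql);
console.log(mismatches);
```

Any query listed in `mismatches` points at either a grammar gap or a parser bug, which is the kind of signal these tests are meant to give.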

The test queries fall into three different categories:

  1. Query is supported and passes both parsers - "accepted" in the fixtures.
  2. Query is not supported, and fails on both parsers - "rejected_syntax" in the fixtures.
  3. Query is valid according to the grammar, but fails further validation - "rejected_semantic" in the fixtures.

We don't currently distinguish between syntax and semantic errors in our parser, so we can't test that a "rejected_semantic" query actually fails with a semantic error rather than a syntax error, but it's good enough for now.
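
To make the three categories concrete, here is a minimal sketch of how each one maps to an expected pass/fail pair. The category names come from the fixtures; the `ParseOutcome` type and the helper functions are hypothetical and are not the test code in this PR.

```typescript
// Category names match the fixtures; everything else here is a hypothetical illustration.
type FixtureCategory = 'accepted' | 'rejected_syntax' | 'rejected_semantic';

interface ParseOutcome {
  implementationAccepts: boolean; // our parser, including its further validation
  grammarAccepts: boolean; // parser generated purely from the grammar
}

const EXPECTED_OUTCOME: Record<FixtureCategory, ParseOutcome> = {
  accepted: { implementationAccepts: true, grammarAccepts: true },
  rejected_syntax: { implementationAccepts: false, grammarAccepts: false },
  // Only "implementation rejects" is asserted here: the kind of error is not checked.
  rejected_semantic: { implementationAccepts: false, grammarAccepts: true }
};

// True if the observed outcome is what the fixture category expects.
function fixturePasses(category: FixtureCategory, actual: ParseOutcome): boolean {
  const expected = EXPECTED_OUTCOME[category];
  return (
    actual.implementationAccepts === expected.implementationAccepts &&
    actual.grammarAccepts === expected.grammarAccepts
  );
}

// Map an observed outcome back to a category. The remaining combination
// (implementation accepts, grammar rejects) fits no category, so it returns null.
function classifyOutcome(actual: ParseOutcome): FixtureCategory | null {
  const categories: FixtureCategory[] = ['accepted', 'rejected_syntax', 'rejected_semantic'];
  return categories.find((category) => fixturePasses(category, actual)) ?? null;
}

console.log(classifyOutcome({ implementationAccepts: false, grammarAccepts: true }));
console.log(classifyOutcome({ implementationAccepts: true, grammarAccepts: false }));
```

A `null` result would mean the grammar is stricter than the implementation for that query, which is a mismatch rather than a fixture category.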

Note: The majority of the grammar syntax and test fixtures were generated using Codex.

Fixes

The tests picked up errors in the parsing of BETWEEN expressions and != operators in the new compiler, both caused by the location not being set. This PR fixes both.

One issue is still pending: NOT IN [json string] works in the alpha sync streams parser, but not in the new compiler.

@rkistner rkistner requested a review from simolus3 February 19, 2026 11:39

changeset-bot bot commented Feb 19, 2026

⚠️ No Changeset found

Latest commit: 8d929d6

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

@simolus3 simolus3 left a comment

This is really neat, it's nice that we're able to find parser bugs this way.

I can take a look at remaining issues for the new compiler.

- This grammar is used when config.sync_config_compiler = true (and edition >= 2)
*/

SyncStreamsCompilerSql ::= CompilerStreamQuery | CompilerSubquery | CompilerCteSubquery

Should we remove CompilerSubquery here? It's not a top-level block that could be parsed, right?
