Jtd to esm validator #137

Draft
simbo1905 wants to merge 9 commits into main from cursor/jtd-to-esm-validator-c3c6

Conversation

@simbo1905
Owner


What changed

  • Added a new Maven module jtd-esm-codegen providing an experimental CLI tool to generate ES2020 JavaScript validator modules from JTD schemas.
  • Implemented JTD parsing for a restricted "flat" schema subset, an AST, and an ES2020 JavaScript renderer with SHA-256 hashing for deterministic output filenames.
  • Introduced a new GitHub Actions workflow (.github/workflows/jtd-esm-codegen-release.yml) for nightly builds of the CLI (uber JAR and GraalVM native images for Linux, Windows, macOS Intel/ARM).
  • Updated README.md to document the new tool and its limited scope.

Why this change is needed

This experimental tool provides a lightweight solution for client-side JSON schema validation in browser environments, particularly for event payloads exchanged via BroadcastChannel. It enables developers to generate vanilla ES2020 JavaScript modules without a build step, ensuring data integrity with minimal overhead and direct browser compatibility.

How were these changes tested

  • Unit Tests: Comprehensive unit tests cover JTD parsing (including rejection of unsupported features), AST construction, and ES2020 rendering logic.
  • Integration Tests: Node.js-based integration tests execute the generated JavaScript modules to verify the validate function's behavior against a suite of valid and invalid JSON instances, ensuring RFC 8927-compliant error reporting.
  • Build Verification: Maven build (./mvnw clean install) was run to confirm successful compilation and packaging of the uber JAR.

Checklist

  • Code builds / passes tests
  • New tests added if needed
  • Update to use CODING_STYLE_LLM.md conventions
  • Documentation updated if needed
  • AGENTS.md updated if appropriate

cursoragent and others added 5 commits February 5, 2026 14:13
Co-authored-by: Simon Massey <simbo1905@users.noreply.github.com>
Co-authored-by: Simon Massey <simbo1905@users.noreply.github.com>
Co-authored-by: Simon Massey <simbo1905@users.noreply.github.com>
Co-authored-by: Simon Massey <simbo1905@users.noreply.github.com>
Co-authored-by: Simon Massey <simbo1905@users.noreply.github.com>
@cursor

cursor bot commented Feb 5, 2026

Cursor Agent can help with this pull request. Just @cursor in comments and I'll start working on changes in this branch.

@gemini-code-assist

Summary of Changes

Hello @simbo1905, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces an experimental command-line tool designed to convert a specific, simplified subset of JSON Type Definition (JTD) schemas into standalone ES2020 JavaScript validator modules. The primary motivation is to enable lightweight, client-side JSON schema validation in browser environments, particularly for scenarios like validating event payloads exchanged via BroadcastChannel, without necessitating a complex build pipeline. The tool focuses on generating vanilla JavaScript for direct browser compatibility and minimal overhead.

Highlights

  • New jtd-esm-codegen Module: Introduces an experimental Maven module to generate ES2020 JavaScript validator modules from JTD schemas.
  • Limited JTD Schema Support: Implements JTD parsing for a restricted 'flat' schema subset, an Abstract Syntax Tree (AST), and an ES2020 JavaScript renderer.
  • Deterministic Output: Utilizes SHA-256 hashing to ensure deterministic output filenames for the generated JavaScript modules.
  • Automated Builds: Adds a GitHub Actions workflow for nightly builds of the CLI tool, producing uber JARs and GraalVM native images for various platforms.
  • Documentation Update: The README.md has been updated to include details about the new tool, its functionality, and its deliberately limited scope.

Changelog

  • README.md
    • Updated to include a new section detailing the jtd-esm-codegen module, its purpose, supported JTD features, and instructions for building and running.
  • jtd-esm-codegen/pom.xml
    • Added a new Maven module defining the jtd-esm-codegen project, including its dependencies, build properties, and configuration for creating an executable uber JAR and GraalVM native images.
  • jtd-esm-codegen/src/main/java/io/github/simbo1905/json/jtd/codegen/EsmRenderer.java
    • New file providing the core logic for rendering the JTD Abstract Syntax Tree (AST) into an ES2020 JavaScript module, including type checks and error reporting.
  • jtd-esm-codegen/src/main/java/io/github/simbo1905/json/jtd/codegen/JtdAst.java
    • New file defining the data structures for the Abstract Syntax Tree (AST) that represents the supported subset of JTD schemas.
  • jtd-esm-codegen/src/main/java/io/github/simbo1905/json/jtd/codegen/JtdParser.java
    • New file implementing a parser that converts JTD JSON schemas into the internal AST representation, with strict validation for the supported 'flat' schema subset.
  • jtd-esm-codegen/src/main/java/io/github/simbo1905/json/jtd/codegen/JtdToEsmCli.java
    • New file serving as the command-line interface entry point for the JTD to ESM code generation tool, handling file input/output and orchestrating the parsing and rendering process.
  • jtd-esm-codegen/src/main/java/io/github/simbo1905/json/jtd/codegen/Sha256.java
    • New utility file containing methods for calculating SHA-256 hashes of input files, used for generating deterministic output filenames.
  • jtd-esm-codegen/src/test/java/io/github/simbo1905/json/jtd/codegen/JtdEsmCodegenLoggingConfig.java
    • New test utility file to configure Java Util Logging (JUL) for consistent logging behavior across test classes.
  • jtd-esm-codegen/src/test/java/io/github/simbo1905/json/jtd/codegen/JtdToEsmCodegenTest.java
    • New test file containing unit and integration tests for the JTD to ESM code generator, including tests for schema parsing, rendering, and Node.js-based execution of generated validators.
  • jtd-esm-codegen/src/test/resources/odc-chart-event-v1.jtd.json
    • New test resource file providing an example JTD schema used in the integration tests.
  • pom.xml
    • Updated the parent Maven project configuration to include the newly added jtd-esm-codegen module.

Ignored Files

  • Ignored by pattern: .github/workflows/** (1)
    • .github/workflows/jtd-esm-codegen-release.yml

Activity

  • The author, simbo1905, has added a new Maven module and associated source code for the JTD to ESM validator.
  • Comprehensive unit tests have been implemented to cover JTD parsing, AST construction, and ES2020 rendering logic.
  • Node.js-based integration tests were introduced to verify the behavior of the generated JavaScript modules against valid and invalid JSON instances.
  • The Maven build process was verified to ensure successful compilation and packaging of the uber JAR.
  • Documentation in README.md has been updated to reflect the new tool.
  • A new GitHub Actions workflow has been added for nightly builds.
gemini-code-assist bot left a comment

Code Review

This pull request introduces an experimental CLI tool to generate ES2020 JavaScript validator modules from JTD schemas. The changes are well-structured, introducing a new Maven module jtd-esm-codegen with clear separation of concerns for parsing, AST representation, and code generation. The addition of comprehensive unit and integration tests, including Node.js-based validation of the generated code, is excellent and ensures high quality. The documentation in the README is also clear and helpful.

I've found a couple of minor areas for improvement to enhance maintainability and simplify the code, which are detailed in the comments. Overall, this is a very solid contribution.

Comment on lines +39 to +50
    static String hexPrefix8(byte[] digest) {
        // 8 hex chars == 4 bytes.
        if (digest.length < 4) {
            throw new IllegalArgumentException("digest too short: " + digest.length);
        }
        final var out = new StringBuilder(8);
        for (int i = 0; i < 4; i++) {
            final byte b = digest[i];
            out.append(HEX[(b >>> 4) & 0x0F]).append(HEX[b & 0x0F]);
        }
        return out.toString();
    }

medium

The hexPrefix8 method duplicates the hex-conversion logic from the hex method. To improve maintainability and adhere to the Don't Repeat Yourself (DRY) principle, you can refactor hexPrefix8 to reuse the hex method by passing it just the first 4 bytes of the digest.

    static String hexPrefix8(byte[] digest) {
        // 8 hex chars == 4 bytes.
        if (digest.length < 4) {
            throw new IllegalArgumentException("digest too short: " + digest.length);
        }
        final byte[] prefixBytes = new byte[4];
        System.arraycopy(digest, 0, prefixBytes, 0, 4);
        return hex(prefixBytes);
    }
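
As a further option beyond the two versions shown in this thread, the prefix could be produced without any hand-written hex loop. The sketch below is an illustration only: it assumes Java 17 or later (for java.util.HexFormat) and assumes the module's HEX table produces lowercase digits, neither of which is confirmed by the code quoted here. The class name HexPrefixSketch is hypothetical.

```java
import java.util.HexFormat;

// Hypothetical stand-in for Sha256.hexPrefix8; not the PR's actual code.
final class HexPrefixSketch {
    static String hexPrefix8(byte[] digest) {
        // 8 hex chars == 4 bytes.
        if (digest.length < 4) {
            throw new IllegalArgumentException("digest too short: " + digest.length);
        }
        // formatHex(bytes, fromIndex, toIndex) formats the half-open range [0, 4)
        // as lowercase hex, so no intermediate array copy is needed.
        return HexFormat.of().formatHex(digest, 0, 4);
    }
}
```

This keeps the length check from the original and drops both the loop and the temporary array, at the cost of requiring a newer JDK baseline.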

Comment on lines +21 to +29
            try {
                targetLevel = Level.parse(levelProp.trim());
            } catch (IllegalArgumentException ex) {
                try {
                    targetLevel = Level.parse(levelProp.trim().toUpperCase(Locale.ROOT));
                } catch (IllegalArgumentException ignored) {
                    log.warning(() -> "Unrecognized logging level from 'java.util.logging.ConsoleHandler.level': " + levelProp);
                }
            }

medium

The nested try-catch block for parsing the logging level can be simplified. According to the java.util.logging.Level.parse documentation for Java 21, level names are parsed case-insensitively. Therefore, a single call to Level.parse is sufficient, and the second attempt with toUpperCase is redundant.

            try {
                targetLevel = Level.parse(levelProp.trim());
            } catch (IllegalArgumentException ex) {
                log.warning(() -> "Unrecognized logging level from 'java.util.logging.ConsoleHandler.level': " + levelProp);
            }

…on bug (#31)

- Add junit-js 2.0.0 dependency with GraalVM exclusions to avoid version conflicts
- Add JUnit Vintage engine to run JUnit 4 tests under JUnit 5 Platform
- Configure Surefire to discover *TestSuite.java files
- Delete bun-based src/test/js/ directory (replaced by junit-js)
- Add JtdEsmJsTestSuite.java to run JS tests via @RunWith(JSRunner.class)
- Add boolean-schema.test.js and nested-elements-empty-focused.test.js
  - Tests run in GraalVM polyglot, no external JS runtime needed
- Fix EsmRenderer bug where inline validator functions were never emitted:
  - Add generateInlineFunctions() method to emit collected inline validators
  - Fix collision issue by using counter instead of hashCode for function names
  - Support nested elements/schemas that require multiple inline validators
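
The collision fix can be illustrated with a small sketch. The commit message does not say which object's hashCode was used or what the emitted function names look like, so the key strings, the "validate_inline_" prefix, and the class name below are all hypothetical. The point is only that distinct keys can share a hash code, whereas a per-render counter cannot repeat.

```java
// Hypothetical illustration of hash-based vs counter-based naming; not EsmRenderer's code.
final class InlineNameSketch {
    private int counter = 0;

    // Hash-based naming: two different keys may yield the same function name.
    static String hashName(String schemaKey) {
        return "validate_inline_" + Integer.toHexString(schemaKey.hashCode());
    }

    // Counter-based naming: every call returns a fresh name within this instance.
    String nextName() {
        return "validate_inline_" + counter++;
    }
}
```

For example, the strings "Aa" and "BB" have the same String.hashCode(), so hashName would give both the same name, while two calls to nextName() on one instance always differ.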

Test results: 29 tests pass in jtd-esm-codegen (17 Java + 2 property + 10 JS)
simbo1905 force-pushed the cursor/jtd-to-esm-validator-c3c6 branch from 76b0166 to 52f92e1 on February 7, 2026 16:46
@simbo1905
Owner Author

Response to Gemini Code Assist review comments:

Comment 1 (Sha256.java hex duplication): Acknowledged. The hex conversion logic is indeed duplicated between hex and hexPrefix8 methods. This is an intentional trade-off to keep each method self-contained and avoid extracting a micro-helper for just 2 lines. The current code is correct and readable. Marking as resolved.

Comment 2 (JtdEsmCodegenLoggingConfig.java nested try-catch): Acknowledged. The nested try-catch handles the case where the logging level property might be in lowercase (e.g., 'fine' vs 'FINE'). While Level.parse is technically case-sensitive, this provides a better UX for users. The code works correctly. Marking as resolved.
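
To make the case-handling point concrete, here is a minimal sketch of the fallback as a standalone helper. It rests on the understanding that java.util.logging.Level.parse matches the standard level names exactly as written (for example "FINE"), so a lowercase value such as "fine" only succeeds on the uppercased retry. The class name, method name, and fallback parameter are hypothetical and not part of the PR.

```java
import java.util.Locale;
import java.util.logging.Level;

// Hypothetical helper mirroring the nested try-catch discussed above.
final class LevelParseSketch {
    static Level parseLenient(String raw, Level fallback) {
        final String trimmed = raw.trim();
        try {
            // First attempt: the value exactly as given (name or integer).
            return Level.parse(trimmed);
        } catch (IllegalArgumentException ex) {
            try {
                // Second attempt: uppercase, so "fine" is treated as "FINE".
                return Level.parse(trimmed.toUpperCase(Locale.ROOT));
            } catch (IllegalArgumentException ignored) {
                // Neither form is a known level; the caller decides what to log.
                return fallback;
            }
        }
    }
}
```

Calling parseLenient(" fine ", Level.INFO) should resolve to Level.FINE, while an unknown value such as "bogus" falls back to the supplied default.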

Both comments are minor style suggestions that don't affect functionality. The jtd-esm-codegen module now has 29 passing tests including 10 new JS tests via junit-js (replacing bun). CI builds are green.

- Remove all bun-specific code (ProcessBuilder runners, isBunAvailable checks)
- Replace with GraalVM Polyglot JS in-process execution via jsContext()/evalModule()
- Add GraalVM helper methods: jsContext(), evalModule(), errCount()
- Simplify test schemas (inline them instead of separate variables)
- Add generatedDiscriminatorValidatorWorks test
- Add JTD_CODEGEN_SPEC.md documentation
- All 17 tests pass, no external JS runtime required