feat: Schematron validation using Saxon Home Edition (Java) #11
Pull request overview
This PR introduces Schematron business-rule validation for CDAR XML documents by running a precompiled Schematron XSLT with Saxon (Java) and parsing the resulting SVRL output into structured validation errors.
Changes:
- Add a Saxon (JAR) backed `SchematronValidatorInterface` implementation that executes `java -jar` and extracts `svrl:failed-assert` entries.
- Add PHPUnit coverage and test fixtures for successful and failing Schematron validation.
- Add operational support for CI/Docker (Saxon download + `SAXON_JAR` env) and add `symfony/process` as a dependency.
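The SVRL parsing step mentioned above could look roughly like the following. This is a minimal sketch, not the code from this PR: the function name `extractFailedAsserts` and the array shape it returns are hypothetical, and it assumes the SVRL report is available as a string (for example, captured from the Saxon process output). The namespace URI is the one that appears in the compiled XSLT.

```php
<?php

declare(strict_types=1);

/**
 * Collects svrl:failed-assert entries from an SVRL report (hypothetical helper).
 *
 * @return list<array{id: string, test: string, flag: string, location: string, text: ?string}>
 */
function extractFailedAsserts(string $svrl): array
{
    $document = new DOMDocument();
    if (false === $document->loadXML($svrl)) {
        throw new RuntimeException('The SVRL report is not well-formed XML.');
    }

    $xpath = new DOMXPath($document);
    $xpath->registerNamespace('svrl', 'http://purl.oclc.org/dsdl/svrl');

    $errors = [];
    foreach ($xpath->query('//svrl:failed-assert') as $node) {
        /** @var DOMElement $node */
        // The human-readable message lives in the svrl:text child, when present.
        $text = $xpath->query('svrl:text', $node)->item(0);

        $errors[] = [
            'id' => $node->getAttribute('id'),
            'test' => $node->getAttribute('test'),
            'flag' => $node->getAttribute('flag'),
            'location' => $node->getAttribute('location'),
            'text' => null !== $text ? trim($text->textContent) : null,
        ];
    }

    return $errors;
}
```

An empty result would mean no assertion failed; a non-empty one could be mapped to `ValidationError` objects and wrapped in `ValidationFailedException`.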
Reviewed changes
Copilot reviewed 11 out of 13 changed files in this pull request and generated 12 comments.
| File | Description |
|---|---|
| `xslt/20260216_BR-FR-CDV-Schematron-CDAR_V1.3.0.xsl` | Adds the compiled Schematron XSLT ruleset that Saxon will execute to produce SVRL. |
| `src/Schematron/SchematronValidatorInterface.php` | Defines the contract for Schematron validators. |
| `src/Schematron/SaxonJarSchematronValidator.php` | Implements Schematron validation via Saxon JAR and SVRL parsing. |
| `src/Schematron/ValidationError.php` | Introduces a value object for individual SVRL failed-assert details. |
| `src/Schematron/ValidationFailedException.php` | Adds a typed exception carrying collected validation errors. |
| `tests/Schematron/SaxonJarSchematronValidatorTest.php` | Adds unit tests for success and failure validation flows. |
| `tests/data/UC1_F202500003_01-CDV-200_Deposee.xml` | Updates a validation fixture to satisfy new/stricter rules. |
| `docs/SaxonJarSchematronValidator.md` | Documents requirements and usage for the new validator. |
| `composer.json` | Adds `symfony/process` to support running Saxon as a subprocess. |
| `Dockerfile` | Adds a container setup including Java + Saxon HE and `SAXON_JAR` env. |
| `.github/workflows/tests.yml` | Downloads/caches Saxon in CI and exports `SAXON_JAR` for tests. |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
    final readonly class ValidationError
    {
        private string $test;

        private string $id;

        private string $flag;

        private string $location;

        private ?string $text;

        public function __construct(
            string $test,
            string $id,
            string $flag,
            string $location,
            ?string $text,
        ) {
            $this->test = $test;
            $this->id = $id;
            $this->flag = $flag;
            $this->location = $location;
            $this->text = $text;
        }

The ValidationError class lacks PHPDoc comments for its properties and methods, which is inconsistent with the SchemaValidationError class in the same codebase that includes detailed PHPDoc for all properties. For consistency and better IDE support, add PHPDoc comments describing what each property represents (test, id, flag, location, text).
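A documented version might look like the sketch below. The property descriptions are assumptions inferred from the SVRL attribute names (`test`, `id`, `flag`, `location`) and the `svrl:text` child, not taken from the PR. The accessors are hypothetical, since the excerpt above only shows the constructor, and the namespace is omitted. A `readonly` class requires PHP 8.2 or later.

```php
<?php

declare(strict_types=1);

/**
 * One failed assertion reported in the SVRL output.
 * Sketch only: the descriptions below are assumptions based on SVRL attribute names.
 */
final readonly class ValidationError
{
    /** XPath expression of the assertion that failed (SVRL "test" attribute). */
    private string $test;

    /** Identifier of the business rule (SVRL "id" attribute). */
    private string $id;

    /** Flag carried by the assertion (SVRL "flag" attribute). */
    private string $flag;

    /** XPath location of the offending node in the validated document. */
    private string $location;

    /** Human-readable message from svrl:text, or null when absent. */
    private ?string $text;

    public function __construct(string $test, string $id, string $flag, string $location, ?string $text)
    {
        $this->test = $test;
        $this->id = $id;
        $this->flag = $flag;
        $this->location = $location;
        $this->text = $text;
    }

    // Hypothetical accessors; the real class may expose different names.
    public function getTest(): string { return $this->test; }
    public function getId(): string { return $this->id; }
    public function getFlag(): string { return $this->flag; }
    public function getLocation(): string { return $this->location; }
    public function getText(): ?string { return $this->text; }
}
```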
    if (false === extension_loaded('dom') || false === extension_loaded('libxml')) {
        throw new \LogicException('DOM and Libxml extensions are required to validate business rules.');
    }

The validator does not check if the Saxon JAR file exists and is readable before attempting to execute it. If the file doesn't exist or isn't accessible, the error will only surface when the Process runs, leading to a less clear error message. Consider adding a validation check in the constructor to verify the Saxon JAR file exists and is readable, throwing a clear \LogicException if not.
Suggested change:

    -}
    +}
    +if (false === is_file($this->saxonJar) || false === is_readable($this->saxonJar)) {
    +    throw new \LogicException(sprintf('Saxon JAR file "%s" does not exist or is not readable.', $this->saxonJar));
    +}

    public function validate(string $xmlFilepath, string $xsltFilepath): void
    {
        $process = new Process([
            'java',
            '-jar',
            $this->saxonJar,
            '-s:'.$xmlFilepath,
            '-xsl:'.$xsltFilepath,
        ]);

The validate method does not check if the input XML file and XSLT file exist and are readable before passing them to the Saxon process. If either file doesn't exist, Saxon will fail with potentially unclear error messages. Consider adding file existence checks at the beginning of the validate method and throwing a clear exception (e.g., \InvalidArgumentException) if files are missing or not readable.
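One way to add these checks is a small guard called at the top of `validate()`. The sketch below assumes a hypothetical helper named `assertReadableFile`; the exception type follows the suggestion above, and the message wording is illustrative only.

```php
<?php

declare(strict_types=1);

/**
 * Throws when the given path is not an existing, readable regular file.
 * Hypothetical helper: the name and message format are not part of this PR.
 */
function assertReadableFile(string $filepath, string $label): void
{
    if (false === is_file($filepath) || false === is_readable($filepath)) {
        throw new InvalidArgumentException(
            sprintf('%s file "%s" does not exist or is not readable.', $label, $filepath)
        );
    }
}

// Intended use at the start of validate(), before building the Process:
//     assertReadableFile($xmlFilepath, 'XML');
//     assertReadableFile($xsltFilepath, 'XSLT');
```

Failing before the subprocess starts keeps the error message under the library's control instead of depending on what Saxon prints for a missing input.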
    [BR-FR-CDV-CL-09/MDT-113_501] : Le code motif de statut (MDT-113) : "<xsl:text/>
    <xsl:value-of select="."/>
    <xsl:text/>", n'est pas dans la liste des codes autorisés pour le statut IRRECEVABLE (501) :
    "IRR_VIDE_F", "IRR_TYPE_F", "IRR_SYNTAX", "IRR_TAILLE_PJ", "IRR_NOM_PJ", "IRR_VID_PJ", "IRR_EXT_DOC, "IRR_TAILLE_F", "IRR_ANTIVIRUS". Veuillez corriger cette valeur si nécessaire.

The XSLT file contains a typo at line 1673: the closing quote after IRR_EXT_DOC is missing. The string reads `"IRR_EXT_DOC, "IRR_TAILLE_F"` but should read `"IRR_EXT_DOC", "IRR_TAILLE_F"`, with the closing quote placed before the comma.
Suggested change:

    -"IRR_VIDE_F", "IRR_TYPE_F", "IRR_SYNTAX", "IRR_TAILLE_PJ", "IRR_NOM_PJ", "IRR_VID_PJ", "IRR_EXT_DOC, "IRR_TAILLE_F", "IRR_ANTIVIRUS". Veuillez corriger cette valeur si nécessaire.
    +"IRR_VIDE_F", "IRR_TYPE_F", "IRR_SYNTAX", "IRR_TAILLE_PJ", "IRR_NOM_PJ", "IRR_VID_PJ", "IRR_EXT_DOC", "IRR_TAILLE_F", "IRR_ANTIVIRUS". Veuillez corriger cette valeur si nécessaire.

    <!--PATTERN BR-FR-CDV-15BR-FR-CDV-14 — Vérification des caractéristiques en cas de statut "Encaissé"-->
    <svrl:text xmlns:svrl="http://purl.oclc.org/dsdl/svrl">BR-FR-CDV-14 — Vérification des caractéristiques en cas de statut "Encaissé"</svrl:text>

The XSLT pattern name at line 1084 says "BR-FR-CDV-14" but the pattern ID is "BR-FR-CDV-15". This is inconsistent - the pattern name should match the pattern ID for clarity and maintainability. Based on the context (this is the 15th BR-FR-CDV pattern), it should be "BR-FR-CDV-15" in the name as well.
Suggested change:

    -<!--PATTERN BR-FR-CDV-15BR-FR-CDV-14 — Vérification des caractéristiques en cas de statut "Encaissé"-->
    -<svrl:text xmlns:svrl="http://purl.oclc.org/dsdl/svrl">BR-FR-CDV-14 — Vérification des caractéristiques en cas de statut "Encaissé"</svrl:text>
    +<!--PATTERN BR-FR-CDV-15 — Vérification des caractéristiques en cas de statut "Encaissé"-->
    +<svrl:text xmlns:svrl="http://purl.oclc.org/dsdl/svrl">BR-FR-CDV-15 — Vérification des caractéristiques en cas de statut "Encaissé"</svrl:text>