Skip to content

Conversation

@lschmierer
Copy link
Contributor

@lschmierer lschmierer commented Jan 3, 2026

Hi Steve,
this PR implements full XML support (R4, R4B, R5 and R6 finally roundtrip successfully after a lot of back and forth).

XML vs JSON model

While most of the mediation between the XML and JSON models is handled by dedicated XML serializer and deserializer, I had to make a few changes to the derive macros. This is mostly to reduce some complexity and the need for extensive buffering and reordering of primitive id and extension. Otherwise one would have to translate e.g. brithDate (XML) to birthDate, _birthDate (JSON) model, just to have the derived Deserialize impl convert it back to the nested birthDate (sturct) element.
The TmpXxx structs now use to helper enums in the Deserialize impls:

  • PrimitiveOrElement to be able to handle both the birthDate, _birthDate (JSON) and brithDate (XML) natively.
  • SingleOrVec for sequence fields. This is required because the XML deserializer does not have extensive knowledge of the FHIR model and can thus not know if it needs to call visit_seq or visit_xxx (for single-cardinality fields).

A potential drawback of this approach, is that the deserializer is more lenient for some invalid JSON inputs. E.g. ”birthDate”: { “value”: “..”, “id”: “..”, “extension”: {..}} although invalid FHIR JSON.
Similar for passing single values where a sequence would be expected.
In my opinion this trade off is acceptable, given the extensive discussion about worse tradeoffs we had in #20.
But, please let me know what you think!

JSON Benchmark (no regression)

Unfortunately the above mentioned changes resulted in some significant performance regressions.
I will try to investigate and improve on this, but I wanted to share the current state with you.

The current implementation shows almost no regression.

Performance Differences:
-------------------------------------------
Comparing xml vs main:

R4: 2.00% slower (xml:helios-serde vs main:helios-fhir) - REGRESSION
R4B: 2.00% slower (xml:helios-serde vs main:helios-fhir) - REGRESSION
R5: 3.00% slower (xml:helios-serde vs main:helios-fhir) - REGRESSION
R6: 1.00% slower (xml:helios-serde vs main:helios-fhir) - REGRESSION

Initially I removed a bunch of JSON skips, that actually seem to work properly.
One of the later commits re-added them to make the numbers across branches comparable. Do you want to keep these skips or shall we drop the commit that reintroduces them again?

serde-support

I added a new helper crate serde-support to hold the new PrimitiveOrElement and SingleOrVec helpers.
I also moved the preexisting IdAndExtensionHelpers there, which were previously generated as nested structs all over again for each individual FHIR struct.
This improves the compile time somewhat (almost negligible in my local tests) but the final binary size signficicantly (test_examples binary from > 2GB after adding PrimitiveOrElement and SingleOrVec to 1,23GB; with debug symbols though).

We can think about moving the helpers into a private mod within crate/fhir/src. This would require moving [test_json_serde.rs](http://test_json_serde.rs/) and [test_xml_serde.rs](http://test_xml_serde.rs/) into crate/fhir/src though as these need access to the helpers.
Interestingly even with having [test_json_serde.rs](http://test_json_serde.rs/) and [test_xml_serde.rs](http://test_xml_serde.rs/) in crate/fhir/test, they can still not access private identifiers from crate/fhir/src.

Let me know which approach you prefer:

  • Dedicated serde-support (current state, my 1st choice):
Separation of src and tests (in helios-serde); but additional crate not intended for public use
  • Public serde_support package in helios-fhir (my last choice):
Noisy public API for helios-fhir`
  • Private serde_support package in helios-fhir (my 2nd choice):
test_json_serde.rsandtest_xml_serde.rsneed to be moved into thecrates/fhir/src` tree; tests are less separated by concern

Left Todo

  • Discuss and incorporate feedback
  • Try to debug and resolve JSON performance regressions
  • If you agree with the approach, I will cleanup the commit history before merging this (got a little messy by trying different approaches etc.)

Potential future improvements

To the derive macro

  • Directly deserialize into primitives instead of going through serde_json::Value
  • Remove any sede_json references within xml package

@smunini
Copy link
Contributor

smunini commented Jan 4, 2026

Thank you Lukas! I'll study this and get back you you on these questions. Agreed that we need to address the performance regression for sure.

@lschmierer
Copy link
Contributor Author

lschmierer commented Jan 4, 2026

I was able to address the performance regressions by manually implementing serde::Deserialize on the helpers vs. using the serde derive macro with #[serde(untagged)]

Performance Differences:
-------------------------------------------
Comparing xml vs main:

R4: 2.00% slower (xml:helios-serde vs main:helios-fhir) - REGRESSION
R4B: 2.00% slower (xml:helios-serde vs main:helios-fhir) - REGRESSION
R5: 3.00% slower (xml:helios-serde vs main:helios-fhir) - REGRESSION
R6: 1.00% slower (xml:helios-serde vs main:helios-fhir) - REGRESSION

@smunini regarding reviewing this, it probably does not make much sense to go through each commits individually. You should focus on the final state.
I will try to refactor the commit history to be more streamlined and squash a lot of the small fix commits.

@smunini
Copy link
Contributor

smunini commented Jan 5, 2026

Thank you Lukas! I should be able to review this PR in the next day or so.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants