Skip to content

Conversation

@mcb5637
Copy link
Collaborator

@mcb5637 mcb5637 commented Nov 6, 2025

TODO:

  • ID reification
  • ID exclusivity
  • lang tag
  • xml literal
  • cleanup
  • check dependencies
  • integrate into IStreamQuadIterator

@mcb5637 mcb5637 linked an issue Nov 6, 2025 that may be closed by this pull request
@mcb5637 mcb5637 requested a review from liss-h November 27, 2025 15:10
Copy link
Collaborator

@liss-h liss-h left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

first review about architecture and design (not logic).

In general: please split the xml parse into multiple files. One file per state would be good. And please add comments to explain why stuff is happening, what each state is good for and so on. This is very hard to follow even if you know what rdf-xml roughly looks like.

I very much like the idea with the states. Just needs to be organized a little.

@liss-h
Copy link
Collaborator

liss-h commented Nov 28, 2025

If you split it into multiple files just keep the definitions inline in the structs that makes it easier to read I think

Copy link
Collaborator

@liss-h liss-h left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the changes.

As I said before, I do like the idea of the states. However, the concrete implementation is not optimal.

The main concern here is the circular nature of the relationships of the states and the parser. The states can just arbitrarily modify the parser through the given ImplXML &. This is not good for general understandability and correctness.

I suggest a more functional approach with state transitions and secondary actions (something like fn(CurrentState, Input) -> Transition).

The state transitions determine which state the parser will enter next (or that it should pop a state). The secondary actions are emitting triples or errors.

Something like this:

struct Transition {
    using Action = std::variant<StatePop, StatePush, EmitStatement, EmitError>;
    std::vector<Action> actions_;
};

So instead of directly mutating the parser, the states tell the parser what to do (functional/data oriented design). There is a closed set of possible StateTransition actions that each have a clear purpose. (Rust can model this much cleaner than C++, but it still works). Performance is mostly a secondary concern here; maintainability is more important for rdf/xml (due to the complexity of the specification and since there are basically no large rdf/xml datasets).

The whole goal is to get rid of the circular mutability that is currently present and limit what states can do to the parser. That should make it both safer and easier to follow.

For this it makes sense to move some of the functionality out of the ImplXML (e.g. the static functions).

Note: I have not fully thought this through (since it would require me to do the refactor myself), but if you have questions or concerns, feel free to ask.

@mcb5637 mcb5637 requested a review from liss-h December 9, 2025 16:41
Copy link
Collaborator

@liss-h liss-h left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove the uses of nested structs (you can keep the ones where ImplXYZ is nested in IStreamQuadIterator for now)

@mcb5637 mcb5637 requested a review from liss-h December 12, 2025 12:50
Copy link
Collaborator

@liss-h liss-h left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the changes. They really make the code much easier to read. Here is another review, I expect that I will write another (probably final one) after these comments are addressed.

This one is mostly a request for some more example driven documentation and some smaller issues.

…parse_xml

# Conflicts:
#	private/rdf4cpp/parser/XMLParser.cpp
#	private/rdf4cpp/parser/XMLParser.hpp
@mcb5637 mcb5637 requested a review from liss-h January 9, 2026 14:26
Copy link
Collaborator

@liss-h liss-h left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for all the changes. While this is still complex (because rdf-xml just is complex), I think it is very well structured.

I am just missing a few comments here an there to explain things.

Additionally, you use std::in_place_type_t<XYZ>{} very often, please just use std::in_place_type<XYZ> (without _t and curly braces).

}

StateTransition CollectionState::on_start_element(XMLOutputQueue &out, std::string_view const local_name, std::string_view const uri, std::span<XMLAttribute> const attributes, XMLStateInfo const &info) {
auto [transition, obj] = DescriptionState::enter(out, local_name, uri, attributes, info);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if I put something in the collection that is not a Description?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see added comment

@liss-h
Copy link
Collaborator

liss-h commented Jan 20, 2026

@nkaralis Do you still have the time to implement the tests for this or should the task be delegated?

@nkaralis
Copy link
Contributor

I will implement them.

@nkaralis
Copy link
Contributor

@mcb5637 Have a look at the commented out tests. In the negative tests, there are multiple that are failing

@mcb5637 mcb5637 marked this pull request as ready for review January 21, 2026 16:38
@mcb5637 mcb5637 requested a review from liss-h January 21, 2026 16:38
Copy link
Collaborator

@liss-h liss-h left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all the work you put in @mcb5637. I think this is ready to be merged.

@liss-h liss-h changed the title Feature/parse xml Feature: RDF/XML parser Jan 26, 2026
@liss-h liss-h merged commit af4d1a2 into develop Jan 26, 2026
18 checks passed
@liss-h liss-h mentioned this pull request Jan 26, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

XML parser

4 participants