Description
I tested the touzet_kr_set_tree_index TED algorithm on a basic example of two slightly different HTML trees. I converted each tree to a Python dict first, then to bracket notation using the json2bracket script. I checked the converted results by hand and they were correct. I introduced only 3 differences between the documents, which can be optimally applied using 3 rename operations. However, the TED result is 14 in this case, which is wrong. Therefore, I have two questions:
- Is it possible that there are edge cases where a non-optimal tree edit distance result is returned?
- Is there a way to get the sequence of operations leading to the computed tree edit distance?
My testing data in bracket notation:
{\{\}{"tag"{"[document]"}}{"attributes"{\{\}}}{"children"{[]{1{"html"}}{2{\{\}{"tag"{"html"}}{"attributes"{\{\}}}{"children"{[]{1{\{\}{"tag"{"head"}}{"attributes"{\{\}}}{"children"{[]{1{\{\}{"tag"{"title"}}{"attributes"{\{\}}}{"children"{[]{1{"Test Page"}}}}}}}}}}{2{\{\}{"tag"{"body"}}{"attributes"{\{\}}}{"children"{[]{1{\{\}{"tag"{"h1"}}{"attributes"{\{\}}}{"children"{[]{1{"Welcome to My Page"}}}}}}{2{\{\}{"tag"{"p"}}{"attributes"{\{\}}}{"children"{[]{1{"Hello, World!"}}}}}}}}}}}}}}}}}
{\{\}{"tag"{"[document]"}}{"attributes"{\{\}}}{"children"{[]{1{"html"}}{2{\{\}{"tag"{"html"}}{"attributes"{\{\}}}{"children"{[]{1{\{\}{"tag"{"head"}}{"attributes"{\{\}}}{"children"{[]{1{\{\}{"tag"{"title"}}{"attributes"{\{\}}}{"children"{[]{1{"Updated Test Page"}}}}}}}}}}{2{\{\}{"tag"{"body"}}{"attributes"{\{\}}}{"children"{[]{1{\{\}{"tag"{"h1"}}{"attributes"{\{\}}}{"children"{[]{1{"Welcome to the Updated Page"}}}}}}{2{\{\}{"tag"{"h2"}}{"attributes"{\{\}}}{"children"{[]{1{"Additional Section"}}}}}}{3{\{\}{"tag"{"p"}}{"attributes"{\{\}}}{"children"{[]{1{"Hello, Universe!"}}}}}}}}}}}}}}}}}
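For anyone who wants to reproduce the conversion, here is a minimal Python sketch of the dict-to-bracket mapping as it can be inferred from the two strings above. It is not the json2bracket script itself: the function names `to_bracket` and `escape_braces` are hypothetical, and the rules (a dict becomes a node labelled `\{\}`, a list becomes a node labelled `[]` with 1-based index nodes, and scalars become JSON-encoded leaf labels) are assumptions read off the data.

```python
import json


def escape_braces(label):
    # Assumption: only curly braces need escaping inside a label.
    return label.replace("{", "\\{").replace("}", "\\}")


def to_bracket(value):
    """Convert a JSON-like Python value to a bracket-notation string."""
    if isinstance(value, dict):
        # A dict becomes a node labelled \{\}; each key is a child node
        # whose single child is the converted value.
        children = "".join(
            "{" + escape_braces(json.dumps(key, ensure_ascii=False))
            + to_bracket(val) + "}"
            for key, val in value.items()
        )
        return "{\\{\\}" + children + "}"
    if isinstance(value, list):
        # A list becomes a node labelled []; each element hangs under
        # a 1-based index node.
        children = "".join(
            "{" + str(index) + to_bracket(item) + "}"
            for index, item in enumerate(value, start=1)
        )
        return "{[]" + children + "}"
    # Scalars become leaves labelled with their JSON encoding.
    return "{" + escape_braces(json.dumps(value, ensure_ascii=False)) + "}"


if __name__ == "__main__":
    element = {"tag": "p", "attributes": {}, "children": ["Hello, World!"]}
    print(to_bracket(element))
```

Note that under this mapping every list element gets its own index node, so inserting an element in the middle of a list also shifts the index labels of the elements after it.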
Result:
Incorrect TED result: 14 instead of 3
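One quick way to sanity-check an expected distance is to compare the node counts of the two inputs. Assuming a unit cost model (each insert, delete, and rename costs 1), the tree edit distance cannot be smaller than the difference in node counts, because renames never change the size of a tree. The sketch below counts nodes by counting unescaped opening braces. The helpers `count_nodes` and `size_lower_bound` are hypothetical and not part of the library. The second tree above has a third entry under the `body` children, so the two inputs may not be the same size.

```python
def count_nodes(bracket):
    """Count nodes in a bracket-notation string.

    Every unescaped '{' opens exactly one node; '\\{' and '\\}' are
    treated as escaped characters inside a label and are skipped.
    """
    count = 0
    i = 0
    while i < len(bracket):
        ch = bracket[i]
        if ch == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if ch == "{":
            count += 1
        i += 1
    return count


def size_lower_bound(tree_a, tree_b):
    """Lower bound on the unit-cost TED: the difference in node counts."""
    return abs(count_nodes(tree_a) - count_nodes(tree_b))


if __name__ == "__main__":
    small = '{\\{\\}{"tag"{"p"}}}'
    larger = '{\\{\\}{"tag"{"p"}}{"children"{[]}}}'
    print(count_nodes(small), count_nodes(larger))
    print(size_lower_bound(small, larger))
```

Applying `size_lower_bound` to the two strings from the testing data shows whether a result of 3 is reachable at all under unit costs.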