This repository was archived by the owner on Jan 8, 2026. It is now read-only.

Node values (text) broken by sub-nodes #38

@helmersl

Hi,

I'm using xml2json to parse publications and have the following problem:

If the text in the abstract node contains XML tags, the text following these sub-nodes is dropped from the JSON output. For example, this XML:

     <Abstract>
          <AbstractText>As photosynthetic prokaryotes, cyanobacteria can directly convert CO<sub>2</sub> to organic compounds and grow rapidly using sunlight as the sole source of energy. The direct biosynthesis of chemicals from CO<sub>2</sub> and sunlight in cyanobacteria is therefore theoretically more attractive than using glucose as carbon source in heterotrophic bacteria. To date, more than 20 different target chemicals have been synthesized from CO<sub>2</sub> in cyanobacteria. However, the yield and productivity of the constructed strains is about 100-fold lower than what can be obtained using heterotrophic bacteria, and only a few products reached the gram level. The main bottleneck in optimizing cyanobacterial cell factories is the relative complexity of the metabolism of photoautotrophic bacteria. In heterotrophic bacteria, energy metabolism is integrated with the carbon metabolism, so that glucose can provide both energy and carbon for the synthesis of target chemicals. By contrast, the energy and carbon metabolism of cyanobacteria are separated. First, solar energy is converted into chemical energy and reducing power via the light reactions of photosynthesis. Subsequently, CO<sub>2</sub> is reduced to organic compounds using this chemical energy and reducing power. Finally, the reduced CO<sub>2</sub> provides the carbon source and chemical energy for the synthesis of target chemicals and cell growth. Consequently, the unique nature of the cyanobacterial energy and carbon metabolism determines the specific metabolic engineering strategies required for these organisms. In this chapter, we will describe the specific characteristics of cyanobacteria regarding their metabolism of carbon and energy, summarize and analyze the specific strategies for the production of chemicals in cyanobacteria, and propose metabolic engineering strategies which may be most suitable for cyanobacteria.</AbstractText>
     </Abstract>

is converted to JSON as:

('Abstract',
 OrderedDict([('AbstractText',
               OrderedDict([('$',
                             'As photosynthetic prokaryotes, cyanobacteria can directly convert CO'),
                            ('sub',
                             [OrderedDict([('$',
                                            2)]),
                              OrderedDict([('$',
                                            2)]),
                              OrderedDict([('$',
                                            2)]),
                              OrderedDict([('$',
                                            2)]),
                              OrderedDict([('$',
                                            2)])])]))]))

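Because of the <sub> tags, only the text before the first <sub> and the <sub> values themselves appear in the output; all text that follows a <sub> tag is lost.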

Is there a way to fix this problem?

Thanks!
Lea
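
One possible workaround, sketched here with Python's standard `xml.etree.ElementTree` rather than with xml2json itself: flatten the mixed content of the node to plain text before converting. The shortened sample string and the helper name `flatten_text` are made up for illustration, and whether this fits the rest of a conversion pipeline is an assumption. In ElementTree, the text before the first child is stored in `.text`, and the text after each child is stored in that child's `.tail`, which is the part that goes missing above.

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical sample with the same shape as the abstract above.
SAMPLE = (
    "<Abstract>"
    "<AbstractText>cyanobacteria can directly convert CO<sub>2</sub> to organic "
    "compounds. The reduced CO<sub>2</sub> provides the carbon source.</AbstractText>"
    "</Abstract>"
)


def flatten_text(elem):
    """Return all text inside elem, including text that follows child tags.

    Hypothetical helper: itertext() walks the element and its children in
    document order and yields both .text and .tail strings.
    """
    return "".join(elem.itertext())


root = ET.fromstring(SAMPLE)
abstract_text = root.find("AbstractText")
first_sub = abstract_text.find("sub")

# Text before the first <sub> lives on the parent element.
print(repr(abstract_text.text))
# Text after the first <sub> lives on that child as its tail.
print(repr(first_sub.tail))

# Joined text with the <sub> markup removed.
flat = flatten_text(abstract_text)
print(flat)
```

The flattened string loses the subscript markup (CO<sub>2</sub> becomes CO2), which may or may not be acceptable depending on how the abstracts are used afterwards.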
