
Lxml.etree Iterparse() And Parsing Element Completely

I have an XML file with nodes that look like this: 41.3681107

Solution 1:

Are you sure that you don't call e.g. element.clear() after your conditional statement, like this?

for event, element in etree.iterparse(infile, events=("start", "end")):
    if element.tag == NAMESPACE + 'trkpt' and event == 'end':
        for child in list(element):
            print(child.text)
    # Runs on every event, so the children are cleared before trkpt's 'end' event.
    element.clear()

The problem is that the parser issues the events for the child elements before it issues the end event for trkpt, because it encounters the end tags of the nested elements first. If you modify the parsed elements in any way before the end event arrives for the outer element, the behaviour you describe may occur.
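To make the event order concrete, here is a minimal sketch that records what iterparse yields for a tiny document. The sample XML, its values, and the helper name trace_events are made up for illustration. The import falls back to the standard library's xml.etree.ElementTree if lxml is not installed, on the assumption that both yield start and end events in the same order.

```python
import io

try:
    from lxml import etree  # the library discussed in this thread
except ImportError:
    # Assumption: the stdlib parser yields events in the same order.
    import xml.etree.ElementTree as etree

# Hypothetical GPX-like sample, for illustration only.
SAMPLE = (
    b'<gpx xmlns="http://www.topografix.com/GPX/1/1">'
    b'<trk><trkseg>'
    b'<trkpt lat="41.3681107" lon="2.15">'
    b'<ele>12.0</ele><time>2011-01-01T00:00:00Z</time>'
    b'</trkpt>'
    b'</trkseg></trk></gpx>'
)


def trace_events(xml_bytes):
    """Return (event, local tag name) pairs in the order iterparse yields them."""
    trace = []
    source = io.BytesIO(xml_bytes)
    for event, element in etree.iterparse(source, events=("start", "end")):
        local_name = element.tag.split("}", 1)[-1]  # drop the "{namespace}" prefix
        trace.append((event, local_name))
    return trace


for event, name in trace_events(SAMPLE):
    print(event, name)
```

In the printed trace, the end events for ele and time appear before the end event for trkpt, so anything done to an element on every event also hits the children before the parent is complete.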

Consider the following alternative:

for event, element in etree.iterparse(infile, events=('end',),
                                      tag=NAMESPACE + 'trkpt'):
    for child in element:
        print(child.text)
    element.clear()
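The difference between the two loops can be checked on a small in-memory document. This is only a sketch: the sample data, the NAMESPACE value, and both function names are assumptions for illustration. The tag test inside the second loop stands in for lxml's tag= argument so that the code also runs with the standard library's xml.etree.ElementTree if lxml is missing.

```python
import io

try:
    from lxml import etree
except ImportError:
    import xml.etree.ElementTree as etree  # assumed fallback, no tag= filter

NAMESPACE = "{http://www.topografix.com/GPX/1/1}"  # assumed GPX 1.1 namespace
SAMPLE = (
    b'<gpx xmlns="http://www.topografix.com/GPX/1/1">'
    b'<trk><trkseg>'
    b'<trkpt lat="41.3681107" lon="2.15">'
    b'<ele>12.0</ele><time>2011-01-01T00:00:00Z</time>'
    b'</trkpt>'
    b'</trkseg></trk></gpx>'
)


def texts_clearing_every_event(xml_bytes):
    """Problematic pattern: clear() runs on every event, children included."""
    texts = []
    source = io.BytesIO(xml_bytes)
    for event, element in etree.iterparse(source, events=("start", "end")):
        if element.tag == NAMESPACE + "trkpt" and event == "end":
            texts.extend(child.text for child in list(element))
        element.clear()
    return texts


def texts_clearing_after_trkpt(xml_bytes):
    """Safer pattern: handle only 'end' events, clear a trkpt after reading it."""
    texts = []
    source = io.BytesIO(xml_bytes)
    for _event, element in etree.iterparse(source, events=("end",)):
        if element.tag == NAMESPACE + "trkpt":
            texts.extend(child.text for child in element)
            element.clear()  # free memory once the point has been processed
    return texts


print("clear on every event:", texts_clearing_every_event(SAMPLE))
print("clear after trkpt:   ", texts_clearing_after_trkpt(SAMPLE))
```

The first function loses the child text because the children were already cleared (or removed) by the time the trkpt end event is handled, while the second returns the text of ele and time.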

Solution 2:

Are you trying to use iterparse specifically, or can you use other methods?

For example:

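from lxml import etree

tree = etree.parse('/path/to/file')
root = tree.getroot()
# findall('trkpt') matches only direct, un-namespaced children of the root;
# for nested or namespaced elements, use e.g. root.iter(NAMESPACE + 'trkpt').
for element in root.findall('trkpt'):
    for child in element:
        print(child.text)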

lxml is pretty good at parsing without taking up too much memory. I'm not sure whether this solves your problem, or whether you need to use the specific method above.
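One caveat with the whole-tree approach is that a bare tag name in findall() matches only direct children of the root that have no namespace. The sketch below shows the difference on an assumed GPX-like sample, where the NAMESPACE value and the data are made up for illustration. It falls back to the standard library's xml.etree.ElementTree if lxml is not available.

```python
import io

try:
    from lxml import etree
except ImportError:
    import xml.etree.ElementTree as etree  # assumed fallback with the same API here

NAMESPACE = "{http://www.topografix.com/GPX/1/1}"  # assumed GPX 1.1 namespace
SAMPLE = (
    b'<gpx xmlns="http://www.topografix.com/GPX/1/1">'
    b'<trk><trkseg>'
    b'<trkpt lat="41.3681107" lon="2.15">'
    b'<ele>12.0</ele><time>2011-01-01T00:00:00Z</time>'
    b'</trkpt>'
    b'</trkseg></trk></gpx>'
)

tree = etree.parse(io.BytesIO(SAMPLE))
root = tree.getroot()

# A bare 'trkpt' finds nothing: the elements are nested and namespaced.
direct = root.findall("trkpt")

# iter() walks the whole tree; the tag must carry the namespace prefix.
points = [
    [child.text for child in trkpt]
    for trkpt in root.iter(NAMESPACE + "trkpt")
]

print(len(direct), points)
```

Because etree.parse() builds the complete tree before any of this runs, this variant is simpler than iterparse but keeps the whole document in memory.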
