Creating Dictionary From Xml File
Solution 1:
The first big problem I see is that you're only searching for the first Enzyme within any given Organism. If you wanted to find each incidence of Enzyme, you should use:
for enzyme in each_organism.findall('Enzyme'):
    # add to dictionary here
    ...
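To make the difference concrete, here is a minimal sketch comparing find() and findall(). The XML below is a made-up sample (the Organisms root, the enzyme names, and the motif strings are all assumptions), so adjust it to whatever your file really looks like.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the real file's layout and values may differ.
SAMPLE = """
<Organisms>
  <Organism>
    <Name>Organism A</Name>
    <Enzyme>EnzymeOne</Enzyme>
    <Motif>AAGCTT</Motif>
    <Enzyme>EnzymeTwo</Enzyme>
    <Motif>GAATTC</Motif>
  </Organism>
</Organisms>
"""

root = ET.fromstring(SAMPLE)
each_organism = root.find("Organism")

# find() returns only the first matching direct child (or None if there is none).
first_only = each_organism.find("Enzyme").text

# findall() returns every matching direct child, in document order.
all_enzymes = [enzyme.text for enzyme in each_organism.findall("Enzyme")]

print(first_only)   # EnzymeOne
print(all_enzymes)  # ['EnzymeOne', 'EnzymeTwo']
```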
The second problem is that the format of your XML doesn't match the data relations you seem to be building with your dictionary. Within the XML, Enzyme, Motif, and Name are all children of Organism, but you're assigning motif as a value associated with the enzyme key. When iterating through occurrences of Enzyme and Motif, you have no reliable way of knowing which one should be associated with the other, because they're all siblings, jammed together without any logical separation in the Organism object.
I could be misunderstanding your purpose here, but it seems like you'd be better served by constructing Organism and Enzyme class objects rather than forcing two (apparently) unrelated concepts into a key-value relationship.
That could look like this, encapsulating your fields:
class Organism:
    # where enzymes is an iterable of Enzyme
    def __init__(self, name, enzymes):
        self.name = name
        self.enzymes = enzymes
and your Enzyme object:
class Enzyme:
    # where motifs is an iterable of strings
    def __init__(self, motifs):
        self.motifs = motifs
All this would still require some sort of change in your XML file. Unless you just parse it by line (which is decidedly not the point of XML), I can't think of any easy ways you'd be able to figure out which Motifs belong to which Enzyme right now.
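As a rough sketch of what that change could buy you: if each Motif were nested inside its Enzyme, the ownership would be explicit and parsing becomes straightforward. Everything about this layout is an assumption on my part (the Organisms root, the name attribute on Enzyme, the sample values), and I've used dataclasses and added a name field to Enzyme purely for brevity.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List


@dataclass
class Enzyme:
    name: str            # assumed extra field, read from a hypothetical "name" attribute
    motifs: List[str]


@dataclass
class Organism:
    name: str
    enzymes: List[Enzyme]


# Hypothetical restructured XML: each Motif lives inside the Enzyme it belongs to.
NESTED = """
<Organisms>
  <Organism>
    <Name>Organism A</Name>
    <Enzyme name="EnzymeOne">
      <Motif>AAGCTT</Motif>
    </Enzyme>
    <Enzyme name="EnzymeTwo">
      <Motif>GAATTC</Motif>
      <Motif>GGATCC</Motif>
    </Enzyme>
  </Organism>
</Organisms>
"""


def parse_organisms(xml_text):
    """Build a list of Organism objects from the nested layout above."""
    root = ET.fromstring(xml_text)
    organisms = []
    for org in root.findall("Organism"):
        enzymes = [
            Enzyme(name=e.get("name"), motifs=[m.text for m in e.findall("Motif")])
            for e in org.findall("Enzyme")
        ]
        organisms.append(Organism(name=org.findtext("Name"), enzymes=enzymes))
    return organisms


organisms = parse_organisms(NESTED)
for enzyme in organisms[0].enzymes:
    print(enzyme.name, enzyme.motifs)
```

With this shape, an enzyme can carry any number of motifs and nothing depends on the order of sibling elements.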
Edit: since you're asking about ways to iterate fairly blindly through each Enzyme node: assuming that Name is always the single first child, that there is exactly one Motif for each Enzyme, and that every element after Name alternates Enzyme then Motif (E-M-E-M and so on), you should be able to do this:
enzymes = []
motifs = []
for i, element in enumerate(each_organism):
    if i == 0:
        # skip the first child, assumed to be Name
        continue
    if i % 2 == 1:
        # odd index: an enzyme
        enzymes.append(element.text)
    else:
        # even index: the motif related to the preceding enzyme
        motifs.append(element.text)
Then, presuming every assumption I laid out, and probably a couple more (I'm not even 100% sure etree always iterates elements top-down), hold true, any motif at any given index in motifs will belong to the enzyme at the same index in enzymes. In case I haven't already made it clear: this is incredibly brittle code.
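If you do go down this road, one way to take a little of the edge off is to check the tags as you pair things up, so a file that breaks the E-M-E-M pattern fails loudly instead of silently mismatching. This is only a sketch: pair_enzymes_with_motifs is a name I made up, and the inline XML is invented sample data. For what it's worth, iterating an Element does yield its direct children in document order.

```python
import xml.etree.ElementTree as ET


def pair_enzymes_with_motifs(organism):
    """Return {enzyme_text: motif_text}, assuming strict Enzyme/Motif alternation."""
    # Drop Name children; everything left is assumed to alternate Enzyme, Motif.
    children = [child for child in organism if child.tag != "Name"]
    if len(children) % 2 != 0:
        raise ValueError("expected an even number of Enzyme/Motif elements")

    pairs = {}
    # Walk the children two at a time: (0, 1), (2, 3), ...
    for enzyme, motif in zip(children[0::2], children[1::2]):
        if enzyme.tag != "Enzyme" or motif.tag != "Motif":
            raise ValueError(
                f"unexpected order: <{enzyme.tag}> followed by <{motif.tag}>"
            )
        # Note: a repeated enzyme name would overwrite the earlier entry.
        pairs[enzyme.text] = motif.text
    return pairs


# Invented sample data for illustration only.
organism = ET.fromstring(
    "<Organism>"
    "<Name>Organism A</Name>"
    "<Enzyme>EnzymeOne</Enzyme><Motif>AAGCTT</Motif>"
    "<Enzyme>EnzymeTwo</Enzyme><Motif>GAATTC</Motif>"
    "</Organism>"
)
pairs = pair_enzymes_with_motifs(organism)
print(pairs)  # {'EnzymeOne': 'AAGCTT', 'EnzymeTwo': 'GAATTC'}
```

It is still brittle, since it trusts the file's ordering, but at least a violated assumption raises an error rather than producing wrong pairs.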