Python script to extract phrases from an XML file


I am trying to parse an XML file that contains these tags:

    <?xml version="1.0" encoding="utf-8"?>
    <phrases>
      <phrase title="bacd_dd" version_id="10" version_string="lphaf"><![CDATA[bacd dsfbsd dfsd]]></phrase>
      <phrase title="bcvd_ff" version_id="10" version_string="lphaf"><![CDATA[ans fkdfjid dfdf]]></phrase>
      <phrase title="bdsd_fffd" version_id="17" version_string="lphaf 7"><![CDATA[jdhfd dsfodf wernksdlg ffguywer  <br> dsf sddsfdsfdsf ksdfj fdsf]]></phrase>
    </phrases>

Now I want to get the tag values. How can I parse the whole XML file?

Try xml.etree.ElementTree:

    import xml.etree.ElementTree as ET

    # Parse the XML directly from a string; root is the <phrases> element.
    root = ET.fromstring("""<?xml version="1.0" encoding="utf-8"?>
    <phrases>
      <phrase title="bacd_dd" version_id="1010010" version_string="1.1.0 alpha"><![CDATA[bacd dsfbsd dfsd]]></phrase>
      <phrase title="bcvd_ff" version_id="1010010" version_string="1.1.0 alpha"><![CDATA[ans fkdfjid dfdf]]></phrase>
      <phrase title="bdsd_fffd" version_id="1000017" version_string="1.0.0 alpha 7"><![CDATA[jdhfd dsfodf wernksdlg ffguywer  <br> dsf sddsfdsfdsf ksdfj fdsf]]></phrase>
    </phrases>""")

    print(root.tag)
    # phrases

    # Iterating over root yields each child <phrase>; .text is its CDATA content.
    for i in root:
        print(i.text)
    # bacd dsfbsd dfsd
    # ans fkdfjid dfdf
    # jdhfd dsfodf wernksdlg ffguywer  <br> dsf sddsfdsfdsf ksdfj fdsf

    # .attrib is a dict of the element's attributes (key order may vary by Python version).
    for i in root:
        print(i.attrib)
    # {'version_string': '1.1.0 alpha', 'version_id': '1010010', 'title': 'bacd_dd'}
    # {'version_string': '1.1.0 alpha', 'version_id': '1010010', 'title': 'bcvd_ff'}
    # {'version_string': '1.0.0 alpha 7', 'version_id': '1000017', 'title': 'bdsd_fffd'}
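If you want to look phrases up by name rather than just print them, one option is to collect them into a dict keyed by the title attribute. This is only a minimal sketch: the helper name `phrases_to_dict` is made up for illustration, and the sample is a shortened copy of the data above with the XML declaration left out.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample data from the answer (XML declaration omitted).
SAMPLE = """<phrases>
  <phrase title="bacd_dd" version_id="1010010" version_string="1.1.0 alpha"><![CDATA[bacd dsfbsd dfsd]]></phrase>
  <phrase title="bcvd_ff" version_id="1010010" version_string="1.1.0 alpha"><![CDATA[ans fkdfjid dfdf]]></phrase>
</phrases>"""


def phrases_to_dict(xml_text):
    """Hypothetical helper: map each phrase's title attribute to its text."""
    root = ET.fromstring(xml_text)
    # findall("phrase") returns only the direct <phrase> children of the root.
    return {p.get("title"): p.text for p in root.findall("phrase")}


phrases = phrases_to_dict(SAMPLE)
print(phrases["bacd_dd"])
```

Element.get() returns None when an attribute is missing, so a phrase without a title would end up under the key None; whether that is acceptable depends on your data.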

If you need to parse an XML file from disk instead:

    import xml.etree.ElementTree as ET

    # Parse the file and get its root element (<phrases> in the example above).
    tree = ET.parse('file.xml')
    root = tree.getroot()
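Putting the file-based version together with the loops from earlier might look roughly like the sketch below. The helper name `load_phrases` is hypothetical, and the script writes a small temporary file as a stand-in for file.xml so that it runs on its own.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def load_phrases(path):
    """Hypothetical helper: return (title, version_id, text) for each <phrase> in the file."""
    root = ET.parse(path).getroot()
    # iter("phrase") walks the whole tree, so nested <phrase> elements are found too.
    return [(p.get("title"), p.get("version_id"), p.text) for p in root.iter("phrase")]


# Stand-in for file.xml: one phrase taken from the sample data above.
xml_bytes = b"""<?xml version="1.0" encoding="utf-8"?>
<phrases>
  <phrase title="bacd_dd" version_id="1010010" version_string="1.1.0 alpha"><![CDATA[bacd dsfbsd dfsd]]></phrase>
</phrases>"""

with tempfile.NamedTemporaryFile(suffix=".xml", delete=False) as tmp:
    tmp.write(xml_bytes)
    tmp_path = tmp.name

try:
    rows = load_phrases(tmp_path)
finally:
    os.remove(tmp_path)  # clean up the temporary file

for title, version_id, text in rows:
    print(title, version_id, text)
```

With a real file you would skip the temporary-file part and call load_phrases('file.xml') directly.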

For more, refer to https://docs.python.org/2/library/xml.etree.elementtree.html

