Dec
13

Python ElementTree

ElementTree provides convenient XML document handling in Python. I have found it useful for working with my XHTML documents: I can easily insert a new element in any part of the document. When I generate an HTML file from LaTeX through tex4ht (see my previous post), the output is not valid XHTML. Since I want to use ElementTree to add and remove elements in my HTML document, I first use the libtidy package from Ubuntu to tidy up the invalid HTML, and then use ElementTree to handle the rest of the manipulation.
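As a rough illustration of inserting an element into an XHTML document, here is a minimal sketch using the `xml.etree.ElementTree` module from the standard library. The helper name `insert_paragraph` and the sample markup are made up for this example; it assumes the input is already well-formed XHTML (for instance after tidying).

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def insert_paragraph(xhtml_text, text, position=0):
    """Parse an XHTML string and insert a new <p> into <body>.

    Hypothetical helper for illustration only; assumes the document
    has a <body> element in the XHTML namespace.
    """
    root = ET.fromstring(xhtml_text)
    body = root.find("{%s}body" % XHTML_NS)
    para = ET.Element("{%s}p" % XHTML_NS)
    para.text = text
    # insert() places the new element at the given child index
    body.insert(position, para)
    return root


# Made-up sample document, already well-formed XHTML
SAMPLE = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>t</title></head>"
    "<body><h1>Heading</h1></body></html>"
)

root = insert_paragraph(SAMPLE, "Inserted paragraph", position=1)
body = root.find("{%s}body" % XHTML_NS)
```

Elements are addressed by their namespace-qualified names, which is why the tag is written as `{namespace-uri}p` rather than just `p`.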

Lighter ElementTree packages such as cElementTree can be used to handle simple XML documents. The issue with cElementTree is that it doesn’t support the namespace prefix convention the way ElementTree does. So when we generate an XML document through cElementTree with a namespace URI provided, it produces a valid but not very readable document, with an auto-generated prefix:

<ns0:mods xmlns:ns0="http://www.loc.gov/mods/v3">
   <ns0:titleInfo>
       <ns0:title>%s</ns0:title>
   </ns0:titleInfo>
   <ns0:author>name</ns0:author>
</ns0:mods>

ElementTree gives a cleaner XML result. With the prefix registered:

MODS_NS = "http://www.loc.gov/mods/v3"  # namespace URI used in the output
ElementTree._namespace_map[MODS_NS] = "mods"

It will generate:

<mods:mods xmlns:mods="http://www.loc.gov/mods/v3">
   <mods:titleInfo>
       <mods:title>%s</mods:title>
   </mods:titleInfo>
   <mods:author>name</mods:author>
</mods:mods>
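In current Python versions, the standard library's `xml.etree.ElementTree` offers a public `register_namespace()` function, so the private `_namespace_map` does not need to be touched. The following is a minimal sketch under that assumption; the `build_mods` helper and the sample title are hypothetical and only mirror the structure of the output above.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"

# Public way to bind a prefix to a namespace URI for serialization
ET.register_namespace("mods", MODS_NS)


def build_mods(title, author):
    """Build the small MODS-like tree shown above (illustrative helper)."""
    root = ET.Element("{%s}mods" % MODS_NS)
    title_info = ET.SubElement(root, "{%s}titleInfo" % MODS_NS)
    ET.SubElement(title_info, "{%s}title" % MODS_NS).text = title
    ET.SubElement(root, "{%s}author" % MODS_NS).text = author
    return root


# encoding="unicode" returns a str instead of bytes
xml_text = ET.tostring(build_mods("Example title", "name"), encoding="unicode")
print(xml_text)
```

The registration is global to the module, so it only needs to be done once before serializing.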

Permanent link to this article: http://lindaocta.com/?p=33
