Open Office vs Ms Word

ICE, MS Office, Open Office

There are lots of blog and forum posts about Open Office performance issues. ICE, the project I am working on now, uses Open Office to handle everything from rendition to PDF and HTML to combining multiple documents (MS Word or Neo Office) into a book. With 512 MB of RAM, testing book building or rendition is a real pain when I need to handle a book of more than 600 pages. I can sit in front of the computer for over an hour just to generate one book (not counting the rendition part) and cannot do anything else, because Open Office uses up all the memory and freezes my computer. After the book has been generated, it takes another 15 minutes to open the document, and once it is open, each page takes at least 5 to 10 seconds to load. MS Word wins on performance: opening the same document takes only 3 minutes and loading a page takes less than 2 seconds.

Why use Open Office?
It’s free!!! Performance aside, Open Office is open source. Paragraph and styling issues found in MS Word can be fixed with Open Office. ICE extends the templates and macros as an Open Office extension (I am not saying MS Word cannot provide such a service, but the implementation is easier in Open Office). Manipulating Open Office documents is easier because they are stored as zip files (although the XML inside the Open Office zip is not well structured). As Python 2.4 is the core language used to develop ICE, all documents are processed through the Python program provided by Open Office (Open Office should consider upgrading to at least Python 2.4; it currently uses Python 2.3, which caused some issues that I posted about in my previous blog entry).
Open Office generates a nice PDF file (apart from an issue with MathType objects, most of which get “squeezed” in the PDF document). Open Office doesn’t handle HTML rendition properly, so we use a state diagram to generate our own HTML file based on the styles.xml and content.xml provided by Open Office (Can we do the same in MS Word? I need to try MS Office 2007, as its files are zip packages as well). Open Office can open MS Office documents (and some other document formats) and convert them to ODT format, and it never complains the way MS Office does when opening an Open Office document. The inline image and caption support provided by Open Office is very useful for generating HTML.
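
Because an Open Office document is just a zip package, its parts can be inspected with a few lines of Python. The following is a minimal sketch and not ICE's actual code: it assumes Python 3 and its standard library (not the Python 2.3 bundled with Open Office), and the function names and the tiny in-memory sample package are made up for illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace URI used for text elements in OpenDocument files.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def list_members(odt_file):
    """Return the names of the files stored inside an .odt package."""
    with zipfile.ZipFile(odt_file) as package:
        return package.namelist()


def paragraph_texts(odt_file):
    """Return the plain text of every text:p element in content.xml."""
    with zipfile.ZipFile(odt_file) as package:
        root = ET.fromstring(package.read("content.xml"))
    return ["".join(p.itertext()) for p in root.iter("{%s}p" % TEXT_NS)]


def build_sample_package():
    """Hypothetical helper: a tiny stand-in for an .odt file, built in memory.

    It is not a complete ODF document, only enough to exercise the functions.
    """
    content = (
        '<office:document-content '
        'xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0" '
        'xmlns:text="%s"><office:body><office:text>'
        '<text:p>Chapter one</text:p>'
        '<text:p>Chapter <text:span>two</text:span></text:p>'
        '</office:text></office:body></office:document-content>' % TEXT_NS
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("content.xml", content)
        package.writestr("styles.xml", "<styles/>")
    buffer.seek(0)
    return buffer
```

Both functions also accept a path on disk, so something like `paragraph_texts("book.odt")` should work for a real file (the file name here is only an example).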

Bugs found in Open Office so far:

  • Memory leaks when using UNO to handle bookmarks and the table of contents automatically in Open Office
  • Nested lists must be built carefully, otherwise Open Office gets confused and messes up the nesting. Try this: type text on the first line and apply a first-level list to it, type text on the second line and apply a second-level list, then type text on the third line and apply a first-level list to it. Generate the HTML file for this document and view the source; you will find a lot of extra nesting in the list
  • Images in tables are not displayed properly in the generated PDF file

Exporting to PDF from Open Office Writer

MS Office, Open Office, xml

One of the main reasons I chose Open Office over Microsoft Word is that I can convert my documents to PDF without installing (or buying) other software. At my work at Uni, Open Office is the main tool our clients use to create their documents, and I have been hacking around with Open Office XML files to meet customer expectations.

While playing around with Open Office fonts today, I found a bug when I tried to export one of my client’s documents to PDF. The document exported successfully, but some of the characters were missing from the PDF file. I sat in front of my computer trying to figure out why the characters were gone, when the same characters appeared in the PDF if I typed them into a new document. I spent a whole two hours comparing the styles.xml and content.xml files (I crashed vim and Crimson Editor at least 20 times, because my box has only 512 MB of RAM and styles.xml has too many lines), and finally I managed to work out why the characters were not included in the PDF file.

You can replicate the problem with these steps: :D

  1. Create a Writer document
  2. Type some text that includes an “en dash” (–) character. You can insert this character in Open Office using <Alt> + <0150> on your number keypad or <Ctrl> + <-> (may not work in some dialog boxes).
  3. Select all the text (including the “en dash” character) and change the font to “Helvetica” (only).
  4. Save the document and export it to PDF
  5. Open the PDF file and your “en dash” characters are gone

The same error occurs for documents created in Word and then opened and exported in Open Office with the above font. I don’t know whether this is an Open Office bug, as “Helvetica” (the standard “Arial” font in Neo Office) is not in the font list (there is only “Helvetica-Narrow”); after I changed the style to Helvetica-Narrow, the “en dash” appeared in the PDF.
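
Rather than scrolling through huge XML files in an editor, a quick script can check whether a document is likely to be affected. This is only a rough sketch under a few assumptions: Python 3, a document that stores the en dash as a literal UTF-8 character and not as a numeric entity, and fonts referenced through `style:font-name` attributes. The function and helper names are invented for this example.

```python
import io
import re
import zipfile

EN_DASH = "\u2013"
# Matches attributes such as style:font-name="Helvetica".
FONT_NAME = re.compile(r'style:font-name="([^"]*)"')


def font_report(odt_file):
    """Return (sorted font names referenced, number of en dashes in content.xml)."""
    with zipfile.ZipFile(odt_file) as package:
        styles = package.read("styles.xml").decode("utf-8")
        content = package.read("content.xml").decode("utf-8")
    fonts = sorted(set(FONT_NAME.findall(styles) + FONT_NAME.findall(content)))
    return fonts, content.count(EN_DASH)


def build_sample_package():
    """Hypothetical helper: a small in-memory stand-in for an .odt file."""
    styles = (
        '<style:text-properties style:font-name="Helvetica"/>'
        '<style:text-properties style:font-name="Helvetica-Narrow"/>'
    )
    content = (
        '<style:text-properties style:font-name="Arial"/>'
        "<text:p>pages 10\u201320 and 30\u201340</text:p>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("styles.xml", styles)
        package.writestr("content.xml", content)
    buffer.seek(0)
    return buffer
```

A document that reports both “Helvetica” among its fonts and a non-zero en dash count is a candidate for the missing-character problem, although the script cannot tell whether the dashes are actually set in that font.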

The modification I made in the styles.xml file was changing:

<style:text-properties style:font-name="Helvetica" fo:font-size="8.5pt" fo:font-style="normal"/>

to:

<style:text-properties style:font-name="Helvetica-Narrow" fo:font-size="8.5pt" fo:font-style="normal"/>
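
The same change can be scripted instead of edited by hand. The sketch below is one possible way to do it, assuming Python 3; `replace_font` and the sample helper are hypothetical names, and the code mirrors only the attribute change shown above (it does not touch any font declarations elsewhere in the package). It copies every member of the package unchanged except styles.xml, and keeps the original member order and compression settings.

```python
import io
import zipfile


def replace_font(src, dst, old="Helvetica", new="Helvetica-Narrow"):
    """Copy an .odt package, swapping style:font-name="old" for "new" in styles.xml."""
    # Including the closing quote means "Helvetica-Narrow" is not matched by "Helvetica".
    old_attr = 'style:font-name="%s"' % old
    new_attr = 'style:font-name="%s"' % new
    with zipfile.ZipFile(src) as source, zipfile.ZipFile(dst, "w") as target:
        for item in source.infolist():
            data = source.read(item.filename)
            if item.filename == "styles.xml":
                text = data.decode("utf-8").replace(old_attr, new_attr)
                data = text.encode("utf-8")
            # Reusing the ZipInfo keeps each member's name and compression type.
            target.writestr(item, data)


def build_sample_package():
    """Hypothetical helper: a small in-memory stand-in for an .odt file."""
    styles = (
        '<style:text-properties style:font-name="Helvetica" fo:font-size="8.5pt"/>'
        '<style:text-properties style:font-name="Helvetica-Narrow"/>'
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("mimetype", "application/vnd.oasis.opendocument.text")
        package.writestr("styles.xml", styles)
    buffer.seek(0)
    return buffer
```

With real files this would be called as `replace_font("client.odt", "client-fixed.odt")`, where both file names are placeholders. Writing to a new file leaves the client's original untouched.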

I searched the Open Office issue tracker, and it seems a related problem with Helvetica in PDF export was reported previously as Issue 81970. I have reported my issue to Open Office (Issue 82540) and am still waiting for a response ;) . My client insists on using the Helvetica font (only), even though Arial looks the same, because changing the font name may break other systems that use the same document.

A few hours later….

As I predicted, Open Office closed Issue 82540 and suggested using another font that Open Office supports, like “Helvetica-Narrow”. Well, I think that will be the solution for my case for the time being. :(