Friend Me
 Follow Me
 Feed Me
a blog by ken pardue

Archive for January, 2008

On ODF vs. OOXML

Wednesday, January 30th, 2008

There is an important vote coming up in the International Organization for Standardization on whether to promote Microsoft’s in-house developed Office Open XML (OOXML) to the status of an accepted international standard. Should Microsoft’s format receive this blessing, the company will be free to throw its corporate weight at governments and other agencies that require the formats they save data in to be fully documented and open.

I oppose this move not as an anti-Microsoft or pro-underdog statement, but because it puts the long-term viability of very important data at terrible risk. I am an amateur genealogist. Preserving my research is extremely important to me. I need to know that in thirty years, or if I migrate to another platform, I will still be able to open the word processing documents containing my research. For government agencies, the issue is clear. The government already has hundreds of thousands of records dating to the 1970s that are inaccessible because they sit on tape in obsolete data storage formats. There simply no longer exists a device that will read them. It is inexcusable for this to happen again.

Microsoft claims that its own format is an open spec that meets the needs of government agencies. However, the same company has a record of monopolistic behavior that often results in vendor lock-in. It also retains the only chair on the technical committee overseeing the development of OOXML, and has reserved the right to make its own custom version of the specification (should it not agree with the recommendations of the technical committee). This has already happened with Microsoft Office 2007, which uses a variation of the OOXML spec that has already been made obsolete by the changes Microsoft has had to make to the format since it failed the fast-track vote last year, when it was considered dangerously flawed. The clearest reason that the body developing the OOXML spec cannot and should not be trusted is the cited evidence that Microsoft tried to buy votes for OOXML in Sweden and applied strong-arm pressure to other voting bodies.

For a specification to be considered “open,” in my opinion it must be vendor neutral, easy to understand, and openly implemented or implementable on a variety of platforms. To date, the Microsoft specification consists of over 6,000 pages, some of which are incomplete and some of which encompass older, binary specs from Microsoft’s former Office suite formats. It pushes techniques using the .NET framework and Visual Basic, neither of which is open or implemented on other platforms.

This next point arguably belongs in a separate post, because this one is primarily about the format. There is a separation between the format (OOXML) and the application implementing the format (Microsoft Office). Let’s digress and use the application as a case in point. Microsoft says that the complexity of OOXML, and its inclusion of binary blobs, are primarily there to facilitate backwards compatibility. In theory this is a good thing. This whole post is about being able to open documents saved thirty years in the past. In reality, though, Microsoft has demonstrated the exact opposite of this behavior by disabling the ability to open no fewer than twenty-four older file formats in Microsoft Office 2007, calling them inferior. This included Microsoft’s own default format of the then-current Microsoft Office for Mac release. It later apologized and said that what it really meant was that its way of opening the files was inferior, but the spin couldn’t hide the fact that Microsoft had willfully removed a function from a program that allowed it to work with competing formats and even some of its own. It has already been demonstrated that most of the functionality can be converted without the tie-down to proprietary technology. Virtually all functionality could be converted were Microsoft to open up the specification to its binary formats, as has been requested.

Alternatives are already in place today. The current format war puts Microsoft’s OOXML in competition with the OpenDocument Format, or ODF. ODF is a vendor-neutral specification based on an easily readable syntax, and it has already gone through the standards process. In fact, it was approved as an international standard two years ago. It is developed and maintained by OASIS, an organization composed of 600 member organizations, the largest of which are IBM, Sun, Novell, and Google. Until recently, Microsoft was also a member of the organization.

Tools to read and write ODF are already implemented in multi-platform solutions, primarily in OpenOffice.org, but the specification is evolving independently. A rapidly growing list of other applications is implementing the specification (including Microsoft Office), in large part due to the clarity and elegance of the specification. What OOXML accomplishes in 6,000+ pages, ODF accomplishes and exceeds in 2,500+. OpenDocument includes clear specifications for word processing documents, spreadsheets, presentations, graphics, and formulas. What’s more, it’s rapidly becoming a specification to build other specifications on. There is currently an effort to develop an OpenDocument-compatible raster graphics storage format called OpenRaster.
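To give a rough sense of what “easily readable” means in practice, here is a minimal Python sketch. It relies on two things about ODF that this post does not spell out: an ODF text document is an ordinary ZIP package, and the body text lives in a file inside it named content.xml, with paragraphs marked up as text:p elements. The helper names (extract_paragraphs, make_sample_odt) are my own, invented for illustration, and the sample package it builds is a bare-bones stand-in, not a complete, valid .odt file.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# XML namespaces defined by the OpenDocument specification.
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def extract_paragraphs(odt_bytes):
    """Return the plain text of every text:p paragraph in an ODF text package."""
    with zipfile.ZipFile(io.BytesIO(odt_bytes)) as package:
        root = ET.fromstring(package.read("content.xml"))
    # itertext() also collects text nested in child elements such as text:span.
    return ["".join(p.itertext()) for p in root.iter(f"{{{TEXT_NS}}}p")]


def make_sample_odt(paragraphs):
    """Build a tiny stand-in package in memory (hypothetical helper, demo only)."""
    body = "".join(f"<text:p>{escape(p)}</text:p>" for p in paragraphs)
    content = (
        f'<office:document-content xmlns:office="{OFFICE_NS}" '
        f'xmlns:text="{TEXT_NS}">'
        f"<office:body><office:text>{body}</office:text></office:body>"
        "</office:document-content>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("mimetype", "application/vnd.oasis.opendocument.text")
        package.writestr("content.xml", content)
    return buffer.getvalue()


if __name__ == "__main__":
    sample = make_sample_odt(["Born 1902 in Ohio", "Married 1925"])
    for line in extract_paragraphs(sample):
        print(line)
```

Nothing here depends on an office suite: an unzip routine and an XML parser, both of which ship with most programming languages, are enough to get the words back out. That is the property I care about when I think about opening my research thirty years from now.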

In closing, I would urge readers not to subscribe to the FUD that Microsoft is spreading and not to follow blindly along a path just because it’s easy to say, “Well, Microsoft and Office are big enough that there will always be a way to open the files.” Think about where we were just ten years ago, and about how much has changed. Look at the market share Microsoft is losing to rivals such as Apple and Firefox. Think about how important the long-term viability of your information is. And think about data portability. Support open formats with open and clear direction.

The Burton Group recently published a report that collects nearly all of the misconceptions about ODF that Microsoft is spreading; the OpenDocument Alliance has since completely debunked it. It’s worth reading simply to know the facts.