25 February 2007

OOXML hoax 5: Microsoft could have participated in ODF development

For those who think Microsoft was seriously asked to participate, I will show you the OASIS document asking other parties in OASIS to participate, just to show you how wrong that suggestion is:
http://lists.oasis-open.org/archives/tc-announce/200211/msg00001.html
Look at the name of the committee then in November 2002.
"Open Office XML TC"
Notice that the name was later changed to Open Document to lose the obvious connection to OpenOffice in the name of the format.

Also from that call to participate, so even before the TC started:
"Since the OpenOffice.org XML format specification meets these criteria and has proven its value in real life, this TC will use it as the basis for its work.
Sun Microsystems intends to contribute the OpenOffice.org XML Format to this TC at the first meeting of the TC, under reciprocal Royalty Free terms."

So we have a set of preset criteria matching the OOo format specs, and Sun announcing the contribution of that format to the OASIS committee 1 month before the start of the committee. Yes, that is an extremely open way to ensure that this TC was only ever about the OOo format.

The call to participate also already contains a full list of suggested members, with four Sun employees on the proposed committee but of course no one from market leader Microsoft.

Does this look like a committee that Microsoft could ever have joined, or one that anybody seriously wanted them to join ? No, of course not.
This was before governments, for instance in Massachusetts, suggested that open formats might be required for government use, and also before the EU suggested ISO standardisation.
This was at a time when OpenOffice was looking for interoperability amongst OSS. Interoperability with MS Office was even set aside at the first meeting of the TC:
"The TC agreed that transformability into potential Microsoft office XML formats could be sensible, but is not a formal requirement."

Not that I want to claim that Microsoft would have been easily persuaded to participate in such a document standardisation process. There was only limited reason for Microsoft to use a standardised format in MS Office, as nobody was really demanding it and as they were at the time already well on their way to making their own XML based format for MS Office. That format had already been shown two years earlier, in August 2000, in an Office XP beta that contained an early version of SpreadsheetML, which is currently still a part of Ecma Office Open XML.
So the OASIS effort by Sun and other OpenOffice supporters might in 2002 also be seen as an answer to the Microsoft XML formats that were starting to come out of MS Office and might become commonplace as they are now with the introduction of default XML formats in Office 2007.

For Microsoft to participate in a standardisation process, it would have required at least the following 3 things.

  • Firstly a need to standardize the format, like pressure from governments or their customers. I think that the transfer to the MS Office XML formats starting in 2000 already showed a slight move toward their customers in opening up data in their binary format to a more readable format but probably no need for formal standardization was considered or even asked for at that time.
  • Secondly the format would need good backwards compatibility with the billions of existing MS Office documents. It would be very hard to push through an Office product that did not have backwards compatibility written all over it.
  • Thirdly it would need to support MS Office functionality as much as possible without immediately extending the format.
The OASIS call for participation by the Sun/OpenOffice dominated TC would not have been relevant to Microsoft on any of those three issues. There was never a chance that Microsoft would have answered such a call and joined work based on an OpenOffice format, starting just a month later.

Seriously, Microsoft participation in the "Open Office XML TC" was never going to happen based on a short-notice participation call for a TC that had pre-decided on the OpenOffice format and was dominated by a competitor.
People suggesting otherwise, that Microsoft actually could have participated in such a venture, are just bullshitting their audience.

The Wraith

20 February 2007

OOXML hoax 4: The standard requires supporting of proprietary MS formats

Another issue raised by IBM and by standards and OSS lawyer Andy Updegrove is that the OOXML specifications would require implementers to also implement proprietary formats and methods of Microsoft that are not licensed under the OOXML licensing.

Andy mentions in his blog post "The Contradictory Nature of OOXML" several issues from the OOXML spec, like 6.2.3.17 "Embedded Object Alternate Image Requests Types" referencing WMF files and 6.4.3.1 "Clipboard Format Types". Further down he mentions for instance OLE embedding and macro/scripts. It is of course correct that the OOXML spec mentions these things. But does it mean that implementing these features is required if you implement OOXML ?
No of course not.

Firstly, as I have already mentioned in an earlier post, when implementing OOXML you are not required to implement any parts other than stated in the conformance paragraph. You can leave out anything you want.

Secondly, even if you create a FULL implementation of OOXML, you still do not need to implement these features, as these references are all references to external formats that are therefore not a part of OOXML. They are references to fully optional formats that you can embed, just like you can embed image formats or other media formats.

Thirdly, implementing external formats is something that is application specific and is not limited to MS formats. You can in fact embed any proprietary or open format into OOXML files. OOXML only uses some of the mentioned references to distinguish between format types, so you can differentiate between an embedded image file and an embedded Windows meta file or a clipboard file. A method to distinguish between foreign file types, however, is not the same as adding the external format to the spec. If it were, then would a simple line in the spec stating "filetype:picture" require support for all picture formats in existence ???

Fourthly, you can also embed the same formats in OpenDocument. ODF has about 6 methods of embedding external foreign file formats in ODF files. You can embed the same Windows Meta Files or clipboard file formats just as easily in ODF as you can in OOXML. You can use macro/scripts in ODF and, yes, even use OLE linking/embedding of external file formats in ODF. Maybe Andy should have complained about ODF supporting all of those formats as well during the ISO standardisation of ODF. Especially the fact that Sun, next to regular external object embedding, has managed to add specific embedded Java objects to ODF (§9.3.4) seems quite weird.

Basically the opponents of OOXML claim that embedding foreign files is fine for ODF but it isn't fine when files are embedded in OOXML because then Microsoft should have licensed those formats even though it might be about exactly the same files.
Is that highly hypocritical criticism or just ignorance ?
I think I know the answer to that...

19 February 2007

OOXML hoax 3: The standard requires cloning of old proprietary behaviour

This suggestion probably originates from IBM's blogger Rob Weir and his Guillaume Portes article, which is one of the most cited pieces in articles about OOXML.

The issue is about the OOXML elements in section 2.15 of the OOXML specs. This section describes compatibility items for documents created in the past with older office versions or even with WordPerfect. The tags should only exist in converted legacy documents and should not be used in any new documents.

The first thing to say is that OOXML does not require any implementation to support these tags. There is little reason to support these tags for virtually any application.

Secondly, IBM's Rob Weir argues that "these legacy tags are some of the most important ones in the specification". Of course he would, because he wants OOXML to look bad. The fact is that applications like OpenOffice currently manage fine in converting most .doc files without supporting these compatibility features. If implementing the compatibility tags is really so important, then people could interpret Rob's statement to mean that you should never use an implementation that doesn't support full compatibility features with legacy documents, and should stay with what you have.

Thirdly, the compatibility tags are about rendering Office documents. Rendering behaviour is not described in OOXML (or in ODF for that matter). It would be very strange to describe exactly how the compatibility tags have to be rendered while the normal rendering is not described. Office document rendering is always implementation specific.

Fourthly, these tags describe that converted documents may have rendered differently when they were originally created in their original application. Even for applications that are not interested in recreating that actual rendering behaviour, these tags can be valuable. When opening a file and parsing these tags, an implementing application can give the user a warning that the document looked different in its original form. This could be important for evaluating conversions of documents that may require very faithful representations, like notary documents or documents that archive printed works.
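To make that fourth point a bit more concrete, here is a rough Python sketch of such a warning. It is only an illustration under assumptions: the sample settings fragment is made up for this post, the function names are mine, and it simply treats every switched-on child of the w:compat element as a legacy flag. A real application would read the settings part from the document package and would decide for itself which flags matter.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Made-up sample of a document settings part (assumption, for illustration only).
SAMPLE_SETTINGS = """<w:settings xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:compat>
    <w:useWord97LineBreakRules/>
    <w:autoSpaceLikeWord95 w:val="off"/>
  </w:compat>
</w:settings>"""


def legacy_compat_flags(settings_xml):
    """Return the names of the compatibility flags that are switched on."""
    root = ET.fromstring(settings_xml)
    compat = root.find(f"{{{W_NS}}}compat")
    if compat is None:
        return []
    flags = []
    for child in compat:
        # A flag without a w:val attribute counts as switched on.
        val = child.get(f"{{{W_NS}}}val", "true")
        if val not in ("0", "false", "off"):
            flags.append(child.tag.split("}", 1)[1])
    return flags


def compat_warning(settings_xml):
    """Build a user warning without trying to reproduce the old rendering."""
    flags = legacy_compat_flags(settings_xml)
    if not flags:
        return None
    return ("This document may have looked different in its original "
            "application: " + ", ".join(flags))


print(compat_warning(SAMPLE_SETTINGS))
```

The point of the sketch is that the application only has to recognise the tag names, it does not have to implement any of the old behaviour to be able to warn the user.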

Ah, but ODF does not have this legacy compatibility shit. Why put it into a standard ? ODF has chosen a different approach for dealing with legacy compatibility. It just doesn't define it but leaves it implementation specific.
For instance OpenOffice uses custom application settings like:
UseFormerLineSpacing
UseFormerObjectPositioning
UseFormerTextWrapping
UseOldNumbering

and a lot more...
So OOXML has a lot of deprecated tags that are almost impossible to implement, but ODF has actually created a special standard tag for stuff that you can't possibly implement. ODF has actually standardized "Application setting" tagging that cannot be interoperable, because the settings stored under such 'standard' tags are not defined anywhere. And this stuff is not deprecated. You can make new documents with it.
If Microsoft were to use ODF, then they could use the "Application settings" to create fully standard compliant documents and still make them impossible to be interoperable. At least in OOXML the extent of the compatibility is known, and when a compatibility issue is found it can be recognized and the user warned accordingly. In ODF files that use the "Application settings" for compatibility, the only thing you can do is warn the user that the document can behave differently because it has separate undefined settings for different applications.
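As a rough illustration of how little a reading application can do with those settings, here is a small Python sketch. It assumes a made-up settings fragment and hypothetical function names; the only thing taken from ODF itself is the config namespace and the config-item element. All the code can do is list the setting names it does not understand, which is exactly the problem.

```python
import xml.etree.ElementTree as ET

# ODF configuration namespace.
CONFIG_NS = "urn:oasis:names:tc:opendocument:xmlns:config:1.0"

# Made-up sample of application settings (assumption, for illustration only).
SAMPLE_SETTINGS = """<config:config-item-set xmlns:config="urn:oasis:names:tc:opendocument:xmlns:config:1.0" config:name="ooo:configuration-settings">
  <config:config-item config:name="UseFormerLineSpacing" config:type="boolean">false</config:config-item>
  <config:config-item config:name="UseOldNumbering" config:type="boolean">true</config:config-item>
  <config:config-item config:name="PrinterName" config:type="string"/>
</config:config-item-set>"""


def application_settings(settings_xml):
    """Map every config-item name to its raw text value."""
    root = ET.fromstring(settings_xml)
    items = {}
    for item in root.iter(f"{{{CONFIG_NS}}}config-item"):
        name = item.get(f"{{{CONFIG_NS}}}name")
        items[name] = (item.text or "").strip()
    return items


def unrecognised_settings(settings_xml, understood):
    """Names of settings the reading application has no definition for."""
    found = application_settings(settings_xml)
    return sorted(name for name in found if name not in understood)


# A hypothetical reader that only knows about PrinterName.
for name in unrecognised_settings(SAMPLE_SETTINGS, {"PrinterName"}):
    print("Warning: undefined application setting:", name)
```

The reader gets a name and a value, but nothing in the standard tells it what "UseOldNumbering" is supposed to do.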

Two solutions for ugly old settings. Both can lead to very ugly documents. But at least the OOXML stuff is only for the past, the ODF "application settings" can keep haunting us for a long time to come...

18 February 2007

OOXML hoax 2: The standard is not really open

In many comments I have read, people, especially from the FOSS community, have stated that OOXML is not really an open standard and that Microsoft controls the format.

The basis of the openness of a standard lies in the Intellectual Property (IP) rights. These break down into copyrights, which are granted to the author of a work, and patent rights, which are granted to patent holders.
For OOXML the copyrights are now in the hands of Ecma International, which created and published the standard. Microsoft contributed to this but has no copyrights to the Ecma standard.
Ecma makes all its standards available for free:
"Ecma Standards are made available to all interested persons or organizations, free of charge and copyright, in printed form and, as files in Acrobat ® PDF format."

So on the copyright side everything is covered and the standard is as open as it can be. So how about the patent rights ? As this is a standard that originates from technology created by Microsoft, the party most likely to hold patents relevant to OOXML is Microsoft. To make sure that implementations of the standard are not burdened by conflicts with these possible patents, Microsoft added a covenant not to sue and later added the Ecma standard to the specifications covered by its Open Specification Promise.
"Microsoft irrevocably promises not to assert any Microsoft Necessary Claims against you for making, using, selling, offering for sale, importing or distributing any implementation to the extent it conforms to a Covered Specification"
In software patents in general, patent claims are the methods or systems that implement an invention. So basically Microsoft cannot use its patent claims against anyone who needs its patented methods or systems to implement the OOXML format, even for commercial use.
Opponents of OOXML have tried to raise the question of what it means for an implementation to conform to a specification, but OOXML actually contains a section which clearly states what a conforming implementation is. Describing conformance is of course a matter for Ecma, which controls the standard, and is not up to Microsoft.

Another point that is often made is that Microsoft's release of its patent claims is not given for future versions of the standard, whereas Sun, which has used a similar covenant not to sue on patent claims related to ODF, has stated that its covenant covers future versions as well. As the standard no longer belongs to Microsoft, it would be very hard for them to make a statement about future versions of the standard unless they could control exactly what goes into those versions.
But how about Sun then ?
Strangely enough, Sun has made sure that it controls what is in the next versions of the ODF standard. Sun's covenant not to sue holds a strange provision that makes sure that their control over OpenDocument development is secured for quite a while to come. Their CNS is limited to: "any subsequent version thereof ("OpenDocument Implementation") in which development Sun participates to the point of incurring an obligation". So Sun's covenant only applies to future versions if they participate in the development. So if Sun does not like the development of ODF, it can hold up the development of the standard until there is certainty that it does not violate any of Sun's patents.

This is quite a big deal. Let's say that Microsoft were to start using ODF (as unlikely as it seems atm) and joined the OASIS TC to help add an Office database format to ODF. But then Oracle buys up Sun and all its patents, and they decide that they don't like a database format being added to ODF. Then they could severely block any development and maybe halt the ODF development for years, if not altogether. This makes it extremely unlikely that Microsoft will give full support to ODF while Sun still holds control over the development.

Because MS has not released its claims for future versions, it has similar control over OOXML as Sun does over ODF. However, there is a big difference. Microsoft needs development of the Office format, as it is vital to its core business, unlike for instance Sun or IBM.
They need a certain amount of control over the Office format development, because without such control their competitors would find it easy enough to stifle further development of MS Office technologies and so make it easier for themselves to catch up.
For Microsoft to not develop newer versions of OOXML, or even to go back to a closed format in MS Office, would be like shooting themselves in the foot.

To conclude I would say that OOXML is an open standard in almost the same way that ODF is open.

The advantage of ODF might be that it will have a broader group of development support (incl. Sun of course), whereas OOXML has the advantage of the market leader supporting any new development. That development can therefore be fairly quick, and a new version can be rolled out very rapidly to a wide customer base, which makes it interesting for commercial support.

The Wraith

17 February 2007

OOXML hoax 1: The spec is too long.

The Office Open XML specification is around 6000 pages long compared to the OpenDocument specification which is only 800 pages long.

Issue is taken with the fact that it would be too long for other parties to implement and too long for the ISO fasttracking procedure to evaluate.

Firstly I will go into the claim that the spec would be too long to implement.
The OOXML specification is meant to support the features of the biggest Office application and be compatible with the billions of documents produced by that application in the last 15 years.
Ecma and Microsoft think that requires 6k pages. I think they are probably short.
No one can implement such a complex spec on just 6k pages.
Luckily they can use MS Office 2007 documents as a reference.
But the ODF spec then with its 800 pages ?
A big advantage of the ODF spec is that they do not need truly faithful support for their existing documents as much as OOXML does with probably 99% of Office documents being created with MS Office. OOXML spends at least 1000 pages on compatibility especially with adding the VML spec.
Also the ODF spec isn't really just 800 pages to implement, is it:
At the bottom of this article by Miguel de Icaza, http://tirania.org/blog/archive/2007/Jan-30.html, it is suggested that, with ODF reusing several W3C standards and with OOXML using a bigger line spacing in the specs, implementing the ODF specs would require almost the same amount of actual specification to be implemented as OOXML.

Let's face it. Building a complete Office application with all the trimmings is a very very big task no matter what. It is unlikely that more than a few original applications will ever exist that can fully use both ODF and OOXML formats, and it takes big projects to make that happen. And I bet that any such project would rather have double the spec size to describe things even more precisely than have less documentation to work with.

Secondly, several parties, notably IBM and Groklaw, have claimed that the 6000 page OOXML spec cannot be evaluated by ISO national bodies in the short time of a fasttrack procedure.
Let's first look at the fasttrack procedure. How long is it ? The shortest possible time for a fasttrack procedure seems to be about 8.5 months, with the average probably closer to a year if the standard is undisputed. Disputes can lengthen the time even more.
Still it would be quite a short time if everyone had to evaluate every page carefully for each little item on it. But is that necessary ? No, not really.
A fasttracking procedure is a procedure for a standard of existing technology. The standard has been developed and maintained by a third party. It is not a brand-new, unproven development, so it does not require the same amount of scrutiny. Also, in the period prior to the standardisation process ISO already advises the third party on how to make the standard acceptable for ISO standardization (ISO had a liaison in the Ecma technical committee for instance, and the OOXML draft documents were already provided to ISO from May 2006 onwards).
Fasttracking is more a decision-making process than a development process.
In fasttracking, ISO national bodies can look at the usefulness of and the need for the standard, at whether the standard contradicts an existing ISO standard, and at the overall quality of the standard. Sometimes amendments/alterations/clarifications might be suggested, and these can be made by the third party in charge of the standard.
Although there might be some minor issues, it seems clear that the only really important question in the fasttracking process of OOXML is whether this new standard contradicts the ODF ISO standard. For the minor issues Ecma can make alterations or give a road map stating in which versions and when the alterations will be made (similar to adding formulas to ODF, maybe).
For the contradiction with ODF you do not need 6000 pages. You need to look only at some small parts of the specs specifying the essentials of the OOXML spec. In fact you need only look at Part 1 (Fundamentals) and Part 3 (Primer) of the specs to evaluate them for a decision-making process. These parts (550 pages) describe the goals, the basic ideas of the specs and the conformance, in short everything needed to determine the need for and usefulness of the standard format.

I think that the basic idea of the OOXML spec lies in continuation, stability and compatibility, and in opening up the legacy office data. These are the powerful arguments that form the basis of the OOXML standardisation process. They should not be underestimated. Governments, organisations, companies and individuals have invested a lot in Office documents in the last 15 years.

The Wraith

Why this blog ?

The last few months I have become increasingly annoyed by the disinformation spread about the office formats Office Open XML (OOXML) and OpenDocument (ODF). Especially the enormous amount of poor information about Office Open XML is staggering so I guess a bit of the focus will be on providing information on OOXML.
In what seems to have become a format war between IBM, Sun and FOSS vs Microsoft, I will try to add what I think is a more moderate view on what is really going on with the formats.

In this blog I will try to write about some of the issues that I have come across in the information campaigns about the formats leading up to the standardisation of the Office document formats by ISO.

Is the information I provide biased ?
Yes, it probably will be a bit. At the moment OOXML in particular is receiving a lot of unjust criticism, and I will certainly try to give a more positive look at OOXML than what is written by the people from IBM and Sun.

The Wraith