02 December 2007
In his blog post Rob uses the Google search engine to determine that there are fewer than 2,000 Office Open XML files available on the internet compared to 160,000 ODF files. That is interesting because it is totally ridiculous. The numbers from Google do not add up: they show no increase in OOXML files at all, and just looking at them made me very suspicious. I tried a similar search for docx files using the Live Search engine:
An example of this Google blindness is this link: http://blogpictures.members.winisp.net/saas.pptx which at this time can be found through Live Search and through Yahoo search but not through Google search.
So where Rob, using Google, cannot find more than 2,000 actual Office Open XML files, I can easily find 124,000 pages that contain one or more Office Open XML files.
It becomes easy to manipulate the figures when Google is on your side ?
21 July 2007
Strangely enough it seems that Gary Edwards, who actually was a member of the OASIS TC developing ODF, has a totally different view on why ODF has no feature for being compatible with Microsoft Office documents:
"For the near five years that i have been a member of the OASIS ODF TC, Sun has opposed any and all efforts to improve interoperability with Microsoft applications, documents, and bound workgroup-workflow business processes.
This goes all the way back to the very first TC meeting on December 14th, 2002, when the enterprise publication, content and archive management systems contingent of the OASIS TC wanted the "proposed" ODF charter amended to include as one of the primary objectives, "compatibility with existing file formats and interoperability with existing applications".
And yes, that proposed charter change specifically included compatibility and interoperability with Microsoft applications, documents and processes!!
Sun opposed that change and has consistently opposed all interoperability enhancements since."
But luckily it is still all Microsoft's fault, at least according to Gary:
"Someone needs to go back to that 2004 agreement between Microsoft and Sun. You know, the one that saved Sun the company! There is clear evidence, stretching throughout the years of ODF discussions, that Sun has traded ODF universal interoperability for a sweet sweet hardware deal with Microsoft. Overwhelming evidence."
Which is a bit strange, because he also claims that Sun's blocking of interoperability with MS Office documents started on the day the charter was made in 2002, when Sun and Microsoft were still bitter rivals in court.
So what about ODF interoperability with ODF now:
"There are three characteristics Sun has steadfastly opposed. And now we finally have an explanation other than that the StarOffice Hamburg group was terminally "stuck in 1995".
These characteristics are important because the world is not a clean slate. Microsoft Office controls over 95% of the existing documents, applications and bound workgroup-workflow business processes.
Without these three bridge characteristics, ODf becomes impossible to implement given where the world today finds itself – 95% bound to MSOffice:
... Compatibility with existing file formats – including MS binary documents
... Interoperability with existing applications – including MSOffice applications
... Convergence :: the application-platform-vendor independent portable file format ability to fluidly and transparently transition desktop-server-device-web information systems.
Sun's opposition to and failure to support the interoperability enhancements to ODf that would have addressed these concerns is a matter of public record"
A sound of bitter disappointment in the lack of interoperability in ODF, but also and especially disappointment in Sun being responsible for it:
"But what about those of us who really believed that ODf could become that elusive universal file format, and spent years trying?"
And referring to Sun:
"They sold us out!"
01 May 2007
When reading through a lot of the issues raised by the national bodies, many of them seem to be directly related to stories written by IBM's Rob Weir and OASIS lawyer Andy Updegrove. It also seems that several of the responses by the national bodies raise almost identical issues in almost identical wording. It looks as if a lot of the issues were written by the same people or copied from them.
A strange development in this is that the author of the Kenyan response to ISO seems to be an IBM employee from Germany who also represents IBM in the German ISO national body (DIN): http://blogs.msdn.com/brian_jones/archive/2007/04/20/a-few-updates-on-the-openxml-formats.aspx . The 'Kenyan' response is the most extensive of the ISO national body responses and, maybe not surprisingly, contains a lot of the issues that were raised earlier by IBM's Rob Weir.
This is all the more surprising as it seems that the German IBM employees also actively tried to persuade the German DIN committee to write a negative response to ISO.
The question is then how the issues written by IBM Germany's DIN member, which were not raised in the German response, seem to have ended up in the Kenyan response.
Is the Kenyan ISO national body easier for IBM to 'influence'?
16 March 2007
So is binary data within XML a thing of the past?
No, not really, you can still find it here:
Yes, right in the middle of OpenDocument.
It is called the office:binary-data element.
It refers to a base64 binary encoded element within the XML.
So why does no one ever mention that ODF can contain binary data within the XML ?
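To make that concrete, here is a minimal sketch in Python of what reading such an element could look like. The namespace URIs are the ones used by the OpenDocument schema; the sample fragment, the payload bytes and the function name extract_binary_data are made up for illustration and are not taken from any real document.

```python
import base64
import xml.etree.ElementTree as ET

# Namespace URIs as defined by the OpenDocument schema.
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
DRAW_NS = "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"

# Hypothetical payload standing in for an embedded image.
payload = b"\x89PNG pretend image bytes"
encoded = base64.b64encode(payload).decode("ascii")

# Hypothetical fragment: an image whose bytes sit inline in the XML.
sample = (
    f'<draw:frame xmlns:draw="{DRAW_NS}" xmlns:office="{OFFICE_NS}">'
    f"<draw:image><office:binary-data>{encoded}</office:binary-data></draw:image>"
    "</draw:frame>"
)


def extract_binary_data(xml_text):
    """Return the decoded bytes of every office:binary-data element."""
    root = ET.fromstring(xml_text)
    tag = f"{{{OFFICE_NS}}}binary-data"
    return [base64.b64decode(el.text or "") for el in root.iter(tag)]


blobs = extract_binary_data(sample)
```

The point of the sketch is only that the binary content travels inside the XML text itself as base64, so an XML parser alone is not enough to make sense of it.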
11 March 2007
It seems like many people do not understand the purpose of ISO fasttracking standardization. The fasttracking method is a standardization process to easily guide existing industry standards into an ISO standard. It is not meant for creating a new standard from scratch but to get existing technology which has a broad basis into ISO.
ISO standards are meant to be used, and ISO has a pragmatic policy that it needs to provide standards that have a market requirement. See also this post by Rick Jelliffe on ISO standards: http://www.oreillynet.com/xml/blog/2007/02/what_is_a_standard_at_iso_1.html
OOXML is a standard originating in the MS Office 2003 XML formats, SpreadsheetML, WordprocessingML and PresentationML, formats which first showed up in the August 2000 Office XP beta. This means that the formats are already well established in describing the features of Office documents. Important differences between those old formats and OOXML are the use of the Open Packaging Conventions and Markup Compatibility and Extensibility, and the storing of embedded (binary) files as separate files in the package, which removes the binary content from the XML files. Also, the old markup languages in the OOXML specs have been augmented with tons of examples and a good structure for implementation, with parent and child elements defined everywhere. And finally the VML vector language format is being replaced by the new DrawingML format (although VML is kept in the spec for compatibility with the MS Office 2003 XML formats that used VML).
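The packaging point can be illustrated with a small Python sketch. It assumes only that an OOXML file is a ZIP package, as the Open Packaging Conventions describe. The part names and contents below are illustrative stand-ins and do not form a valid document, and build_package and split_parts are hypothetical helper names.

```python
import io
import zipfile


def build_package(parts):
    """Write a dict of {part name: bytes} into an in-memory ZIP package."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, data in parts.items():
            zf.writestr(name, data)
    return buf.getvalue()


def split_parts(package_bytes):
    """Separate the XML parts of a package from its binary parts by name."""
    xml_parts, binary_parts = [], []
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as zf:
        for name in zf.namelist():
            if name.endswith((".xml", ".rels")):
                xml_parts.append(name)
            else:
                binary_parts.append(name)
    return xml_parts, binary_parts


# Illustrative stand-in parts: the image is its own file, not base64 in XML.
package = build_package({
    "[Content_Types].xml": b"<Types/>",
    "word/document.xml": b"<document/>",
    "word/media/image1.png": b"\x89PNG pretend image bytes",
})
xml_parts, binary_parts = split_parts(package)
```

With this layout the XML parts stay pure markup, and an embedded picture can be pulled out of the package with any ZIP tool.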
This is an example of why the format is not perfect. It carries with it the burden of backwards compatibility and the extensive set of features used in MS Office. But it also combines this with a spec in which a lot of possible issues have already been dealt with. It would be very hard to stamp out a completely new format that could be used by an office suite like MS Office. Certainly a spec like ODF, which is pretty good, would not suffice as it leaves a lot of things still undefined or up to the implementation. ODF is addressing this by improving its specs and extending them in newer versions, and also by trying to create a set of reference documents to further define its implementations. Still, it has proved very hard to get an implementation that covers the entire ODF spec, even 20 months after it was standardized by OASIS and even without looking at formulas: http://testsuite.opendocumentfellowship.org/summary.html . This shows that implementing a complex Office spec, even when you already have a full Office suite as a basis, is a process that can take years.
This is basically what Microsoft has done with its Office format. It has taken more than 6 years to move its Office suite towards a full XML implementation that is also backwards compatible with its billions of legacy documents. Then it has opened up this format for everyone to use by standardizing it through Ecma. It does carry some scars from that development, but it is also a spec that is in full use, with many of its key markup elements, like the formulas, having proven themselves over a longer period of time. In ISO fasttracking that is important, as it shows that the format has pedigree in the real world and that it has use in the market of today.
OOXML is a format that has a foundation in the market and has had most of its features proven over the last 6 years. It is not perfect, but then again neither is its competitor, which is still being worked upon. For the ISO national bodies that market foundation and proven track record are important aspects for approving this format. The Ecma standard does have several newer, less proven elements, which have to be considered in this process, but as of now those new elements are the ones that seem to be least criticised, which also shows that the development of the format is improving toward the future.
The ISO fasttracking process does not need perfection, but it relies on a standard that is based on existing technology that will be massively used in the future. This is exactly what OOXML will prove to be, and I think that is why ISO will approve this standard as an ISO standard despite the protests from the open source community and MS competitors.
05 March 2007
I was banned from Groklaw because of excessive commenting and using unacceptable language. I do admit having posted quite a few comments on Groklaw, probably more than 50 or so in the last month. How should I know what would be excessive? I consider the quality of the information on Groklaw's objections pages to be very poor and one-sided, and the initiative of Groklaw to start a mailing campaign towards ISO very pathetic. That might have shown in lots of critical comments on Groklaw, where people often refer to the objections pages as if they were a bible. A few quick comments are easily made then.
I do have serious objections, though, against being accused of unacceptable language. The worst I remember is probably stating once or twice that something is complete nonsense or bull or something in that order. Nothing which would cause anyone a headache. I have posted on probably about 30 blogs or sites about OOXML in the last half year, and most bloggers seem very open to another opinion about OOXML. The 'not so open' blog that is Groklaw, however, isn't too pleased with pro-OOXML comments. I wish they themselves were a bit more open about that.
And if anyone from Groklaw reads this: yes, I use multiple IP addresses, as me, my girlfriend, my parents, my friend, my work and sometimes an available open wireless network all have different addresses. I mostly post anonymously, as registering everywhere is a lot of effort nowadays, and Groklaw also removed quite a few posts that were clearly signed with The Wraith and a link to this blog. But PJ, it is fine if you and your minions want to censor the answers on Groklaw to be just pro-Groklaw. I'll find other platforms that do appreciate reasonable two-way discussion on the OOXML debate.
02 March 2007
The Ecma response in PDF form can be found here:
It also contains the issues raised by the national bodies.
Astounding is that the comments of the national bodies sometimes seem to be directly copied from the Grokdoc objections pages. The suggestions on those Grokdoc pages are mostly dubious at best, with comments by, for instance, XML expert Rick Jelliffe on the discussion pages ignored.
25 February 2007
For those who think Microsoft was seriously asked to participate, I will show you the OASIS document asking other parties in OASIS to participate, just to show you how totally wrong that suggestion is:
Look at the name of the committee then in November 2002.
"Open Office XML TC"
Notice that the name was later changed to Open Document to lose the obvious connection to OpenOffice in the name of the format.
Also from that call to participate so even before the TC starts:
"Since the OpenOffice.org XML format specification meets these criteria and has proven its value in real life, this TC will use it as the basis for its work.
Sun Microsystems intends to contribute the OpenOffice.org XML Format to this TC at the first meeting of the TC, under reciprocal Royalty Free terms."
So a set of preset criteria matching the OOo format specs, and Sun contributing that format to the OASIS committee 1 month before the start of the committee. Yes, that is an extremely open way to ensure that this TC only ever was about the OOo format.
Also, the call to participate already contains a full suggested members list with four Sun employees in the proposed committee but of course no one from market leader Microsoft.
Does this look like a committee that Microsoft could ever have joined, or that anybody seriously wanted them to join? No, of course not.
This was before governments for instance in Massachusetts suggested open formats might be required for governments and also before the EU suggested ISO standardisation.
This was at a time when OpenOffice was looking for interoperability amongst OSS. Interoperability with MS Office was even left behind from the first meeting of the TC:
"The TC agreed that transformability into potential Microsoft office XML formats could be sensible, but is not a formal requirement."
Not that I want to state that Microsoft would have been easily persuaded to participate in such a document standardisation process. There was only limited reason for Microsoft to use a standardised format in MS Office, as nobody was really demanding it and as at the time they were already well on their way to making their own XML based format for MS Office. That format was already shown two years earlier, in August 2000, in an Office XP beta that contained an early version of SpreadsheetML, which is currently still a part of Ecma Office Open XML.
So the OASIS effort by Sun and other OpenOffice supporters might in 2002 also be seen as an answer to the Microsoft XML formats that were starting to come out of MS Office and might become commonplace as they are now with the introduction of default XML formats in Office 2007.
For Microsoft to participate in a standardisation process it would have required at least the following 3 things.
- Firstly a need to standardize the format, like pressure from governments or their customers. I think that the transfer to the MS Office XML formats starting in 2000 already showed a slight move toward their customers in opening up data in their binary format to a more readable format but probably no need for formal standardization was considered or even asked for at that time.
- Secondly the format would need good backwards compatibility with the billions of existing MS Office documents. It would be very hard to push through an Office product that did not have backwards compatibility written all over it.
- Thirdly it would need to support MS Office functionality as much as possible without immediately extending the format.
Seriously, Microsoft participation in the "Open Office XML TC" was never going to happen on the basis of a short-term participation call for a TC that had pre-decided on OpenOffice and was dominated by a competitor.
People suggesting otherwise, that Microsoft actually could have participated on such a venture are just bullshitting their audience.
20 February 2007
Andy mentions in his blog post "The Contradictory Nature of OOXML" several issues from the OOXML spec like 126.96.36.199 "Embedded Object Alternate Image Requests Types" referencing WMF files and 188.8.131.52 "Clipboard Format Types". Further down he mentions for instance OLE embedding and macros/scripts. He is of course correct that the OOXML spec mentions these things. But does it mean that implementing these features is required if you implement OOXML ?
No of course not.
Firstly, as I have already mentioned in an earlier post, when implementing OOXML you are not required to implement any parts other than stated in the conformance paragraph. You can leave out anything you want.
Secondly, even if you create a FULL implementation of OOXML you still do not need to implement these features, as these references are all references to external formats that are therefore not a part of OOXML. They are references to fully optional formats that you can embed just like you can embed image formats or other media formats.
Thirdly, implementing external formats is something that is application specific and is not limited to MS formats. You can in fact embed any proprietary or open format into OOXML files. OOXML only uses some of the mentioned references to distinguish between format types, so you can differentiate between an embedded image file and an embedded Windows Metafile or a clipboard file. A method to distinguish between foreign file types, however, is not the same as adding the external format to the spec. If it were, would a simple line in the spec stating "filetype:picture" then require support for all picture formats in existence ???
Fourthly, you can embed the same formats that are mentioned in OpenDocument also. ODF has about 6 methods of embedding external foreign file formats in ODF files. You can embed the same Windows Metafiles or clipboard file formats just as easily in ODF as you can in OOXML. You can use macros/scripts in ODF and, yes, even use OLE linking/embedding of external file formats in ODF. Maybe Andy should have complained about ODF supporting all of those formats as well during ISO standardisation of ODF. Especially the fact that Sun, next to regular external object embedding, has managed to add specific embedded Java objects to ODF (§9.3.4) seems quite weird.
Basically the opponents of OOXML claim that embedding foreign files is fine for ODF but it isn't fine when files are embedded in OOXML because then Microsoft should have licensed those formats even though it might be about exactly the same files.
Is that either highly hypocritical criticism or just ignorance ?
I think I know the answer to that...
19 February 2007
The issue is about the OOXML elements in section 2.15 of the OOXML specs. This section describes compatibility items for documents created in the past with older office versions or even with WordPerfect. The tags should only exist in converted legacy documents and should not be used in any new documents.
The first thing to say is that OOXML does not require any implementation to support these tags. There is little reason to support these tags for virtually any application.
Secondly, IBM's Rob Weir argues that "these legacy tags are some of the most important ones in the specification". Of course he would, because he wants OOXML to look bad. The fact is that applications like OpenOffice currently manage fine in converting most .doc files without supporting these compatibility features. If implementing the compatibility tags is really so important, then people could interpret Rob's statement as saying that you should never use an implementation that doesn't support full compatibility features with legacy documents, and should stay with what you have.
Thirdly, the compatibility tags are about rendering Office documents. Rendering behaviour is not described in OOXML (or ODF for that matter). It would be very strange to describe exactly how the rendering for the compatibility tags would have to be while the normal rendering is not described. Office document rendering is always implementation specific.
Fourthly, these tags describe that converted documents may have rendered differently when they were originally created in their original application. Even for applications that are not interested in recreating that actual rendering behaviour these tags can be valuable. When opening a file and parsing these tags, an implementing application can give the user a warning that the document looked different in its original form. This could be important for evaluating conversions of documents that may require very faithful representations, like notary documents or documents that archive printed works.
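A rough sketch in Python of that warning idea follows. It assumes the compatibility settings sit in a w:compat element inside the document settings part, using the WordprocessingML main namespace. The sample settings fragment and the function name legacy_notices are hypothetical, and treating every child of w:compat as a legacy hint is a simplification made for the example.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hypothetical settings fragment from a converted legacy document.
SAMPLE_SETTINGS = (
    f'<w:settings xmlns:w="{W_NS}">'
    "<w:compat>"
    "<w:useWord97LineBreakRules/>"
    "<w:truncateFontHeightsLikeWP6/>"
    "</w:compat>"
    "</w:settings>"
)


def legacy_notices(settings_xml):
    """Return one user-facing notice per compatibility tag found."""
    root = ET.fromstring(settings_xml)
    compat = root.find(f"{{{W_NS}}}compat")
    if compat is None:
        return []
    # Strip the namespace part of each tag to get the bare setting name.
    names = [child.tag.split("}", 1)[1] for child in compat]
    return [
        f"Legacy setting '{name}' found: the original rendering may have differed."
        for name in names
    ]


notices = legacy_notices(SAMPLE_SETTINGS)
```

Note that the application does not have to reproduce the old behaviour at all. It only has to recognise that the tag is there and tell the user.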
Ah, but ODF does not have this legacy compatibility shit. Why put it into a standard ? ODF has chosen a different approach for dealing with legacy compatibility. It just doesn't define it but leaves it implementation specific.
For instance OpenOffice uses custom application settings like:
If Microsoft were to use ODF then they could use the "Application settings" to create fully standard compliant documents and still make it impossible for them to be interoperable. At least in OOXML the extent of the compatibility is known, and when a compatibility issue is found it can be recognized and the user warned accordingly. In ODF files that use the "Application settings" for compatibility, the only thing you can do is warn the user that the document can behave differently because it has separate undefined settings for different applications.
Two solutions for ugly old settings. Both can lead to very ugly documents. But at least the OOXML stuff is only for the past, the ODF "application settings" can keep haunting us for a long time to come...
18 February 2007
The basis of the openness of a standard lies in the intellectual property (IP) rights. This breaks down into copyrights, which are granted to the author of a work, and patent rights, which are granted to patent holders.
For OOXML the copyrights are now in the hands of Ecma International, which created and published the standard. Microsoft contributed to this but has no copyrights to the Ecma standard.
Ecma makes all its standards available for free:
"Ecma Standards are made available to all interested persons or organizations, free of charge and copyright, in printed form and, as files in Acrobat ® PDF format."
So on the copyright side everything is covered and the standard is as open as it can be. So how about the patent rights? As a standard that originates from technology created by Microsoft, the most likely party to have patents that are relevant to OOXML will be Microsoft. To make sure that implementation of the standard is not hampered by conflicts with these possible patents, Microsoft has added a covenant not to sue and later added the Ecma standard to the covered formats of its Open Specification Promise.
"Microsoft irrevocably promises not to assert any Microsoft Necessary Claims against you for making, using, selling, offering for sale, importing or distributing any implementation to the extent it conforms to a Covered Specification"
In software patents in general, patent claims are the methods or systems that implement an invention. So basically Microsoft cannot use its patent claims against anyone who needs their patented methods or systems to implement the OOXML format, even for commercial use.
Opponents of OOXML have tried to raise the question of what it means that an implementation conforms to a specification, but OOXML actually contains a section which clearly states what a conforming implementation is. Describing conformance is of course a matter for Ecma, which controls the standard, and is not up to Microsoft.
Another point that is often made is that Microsoft's release of its patent claims is not given for future versions of the standard, whereas Sun, which has used a similar covenant not to sue on patent claims related to ODF, has stated that its covenant covers future versions as well. As the standard no longer belongs to Microsoft, it would be very hard for them to make a statement about future versions of the standard unless they could control what exactly would be in those future versions.
But how about Sun then ?
Strangely enough, Sun has made sure that it controls what goes into the next versions of the ODF standard. Sun's covenant not to sue holds a strange provision that makes sure that their control over OpenDocument development is secured for quite a while to come. Their CNS is limited to: "any subsequent version thereof ("OpenDocument Implementation") in which development Sun participates to the point of incurring an obligation". So Sun's covenant only applies to future versions if they participate in development. So if Sun does not like the direction of ODF development, it can hold up the development of the standard until there is certainty that it does not violate any of Sun's patents.
This is quite a big deal. Let's say that Microsoft were to start using ODF (as unlikely as that seems at the moment) and joined the OASIS TC to help add an Office database format to ODF. But then Oracle buys up Sun and all its patents and decides that it doesn't like a database format being added to ODF. Then it could severely block any development and maybe halt ODF development for years, if not altogether. This makes it extremely unlikely that Microsoft will give full support to ODF while Sun still holds control over the development.
Because MS has not released its claims for future versions, it has similar control over OOXML as Sun does over ODF. However there is a big difference. Microsoft needs development of the Office format as it is vital to its core business, unlike for instance Sun or IBM.
They need a certain amount of control over the Office format development, because without such control their competitors would find it easy enough to stifle further development of MS Office technologies and so make it easier to catch up.
For Microsoft, not developing newer versions of OOXML or even going back to a closed format in MS Office would be like shooting themselves in the foot.
To conclude I would say that OOXML is an open standard in almost the same way that ODF is open.
The advantage of ODF might be that it will have a broader group of development support (incl. Sun of course), whereas OOXML has the advantage of having the market leader support any new development, which can therefore be fairly quick. Implementation of a new version can be rolled out very rapidly to a wide customer base, which makes it interesting for commercial support.
17 February 2007
Issue is taken with the OOXML specification being too long for other parties to implement and too long for the ISO fasttracking procedure to evaluate.
Firstly I will go into the claim that the spec would be too long to implement.
The OOXML specification is meant to support the features of the biggest Office application and be compatible with the billions of documents produced by that application in the last 15 years.
Ecma and Microsoft think that requires 6k pages. I think they are probably short.
No one can implement such a complex spec on just 6k pages.
Luckily they can use MS Office 2007 documents as a reference.
But the ODF spec then with its 800 pages ?
A big advantage of the ODF spec is that they do not need truly faithful support for their existing documents as much as OOXML does with probably 99% of Office documents being created with MS Office. OOXML spends at least 1000 pages on compatibility especially with adding the VML spec.
Also the ODF spec isn't really 800 pages to implement, is it:
At the bottom of this article, http://tirania.org/blog/archive/2007/Jan-30.html , Miguel de Icaza suggests that, with ODF reusing several W3C standards and with OOXML using a bigger line spacing in the specs, implementing the ODF specs would involve almost the same amount of actual specification as implementing OOXML.
Let's face it. Building a complete Office application with all the trimmings is a very, very big task no matter what. It is unlikely that more than a few original applications will ever exist that can fully use both the ODF and OOXML formats, and it takes big projects to make that happen. And I bet that any such project would rather have double the spec size, describing the specs even more precisely, than have less documentation to work with.
Secondly, several parties, notably IBM and Groklaw, have claimed that the 6000 page OOXML spec cannot be evaluated by ISO national bodies in the short time of a fasttrack procedure.
Let's first look at the fasttrack procedure. How long is it ? The shortest possible time for a fasttrack procedure seems to be about 8.5 months, with the average probably closer to a year if the standard is undisputed. Disputes can lengthen the time even more.
Still it would be quite a short time if everyone had to evaluate every page carefully for each little item on it. But is that necessary ? No, not really.
A fasttracking procedure is a procedure for a standard of existing technology. The standard has been developed and is maintained by a third party. It is not a brand-new, unproven development and so does not require the same amount of scrutiny. Also, in the period prior to the standardisation process ISO already advises the third party on how to make the standard acceptable for ISO standardization (ISO had a liaison in the Ecma technical committee for instance, and the OOXML draft documents were already provided to ISO from May 2006 onwards).
Fasttracking is more a decision-making process than a development process.
In fasttracking, ISO national bodies can look at the usefulness of and the need for the standard, at whether the standard contradicts an ISO standard, and at the overall quality of the standard. Sometimes amendments, alterations or clarifications might be suggested, and these can be made by the third party in charge of the standard.
Although there might be some minor issues, it seems clear that the only really important question in the fasttracking process of OOXML is whether this new standard contradicts the ODF ISO standard. For the minor issues Ecma can make alterations or give a road map stating in which versions and when the alterations can be made (similar to adding formulas in ODF, maybe).
For the contradiction with ODF you do not need 6000 pages. You need to look only at some small parts of the specs specifying the essentials of the OOXML spec. In fact you need only look at Part 1 (Fundamentals) and Part 3 (Primer) of the specs to evaluate them for a decision-making process. These parts (550 pages) describe the goals, the basic ideas of the specs, the conformance, in short everything needed to determine the need for and usefulness of the standard format.
I think that the basic idea of the OOXML spec lies in continuation, stability and compatibility, and in opening up the legacy office data. These are the powerful arguments that form the basis of the OOXML standardisation process. They should not be underestimated. Governments, organisations, companies and individuals have invested a lot in Office documents in the last 15 years.
In what seems to have become a format war between IBM, Sun and FOSS on one side and Microsoft on the other, I will try to add what I think is a more moderate view on what is really going on with the formats.
In this blog I will try to write about some of the issues that I have come across in the information campaigns about the formats leading up to the standardisation of the Office document formats by ISO.
Is the information I provide biased ?
Yes, it probably will be a bit. At the moment OOXML in particular has received a lot of unjust criticism, and I will certainly try to give a more positive look at OOXML than what is written by the people from IBM and Sun.