02 December 2007

OOXML: Google search engine supports IBM FUD campaign ?

IBM's Rob Weir recently wrote a blog post on the use of Office Open XML files in the real world. Strangely enough, he uses the Google search engine to search the web and prove that OOXML is not really used in the world. Is this because he understands that ODF is really more of a semi-W3C web format than a serious Office format? A more respectable way to look at Office documents would be to ask companies that use Office software, but apparently IBM thinks differently. But it gets even Weirder.

In his blog post Rob uses the Google search engine to determine that there are fewer than 2,000 Office Open XML files available on the internet compared to 160,000 ODF files. That is interesting because it is totally ridiculous. The numbers from Google do not add up at all, as they show no increase in OOXML files at all, and just looking at them made me very suspicious. I tried a similar search for OOXML files (docx, pptx and xlsx) using the Live Search engine:
http://search.live.com/results.aspx?q=contains%3Adocx&mkt=us-us (69,000)
http://search.live.com/results.aspx?q=contains%3Apptx&mkt=us-us (41,000)
http://search.live.com/results.aspx?q=contains%3Axlsx&mkt=us-us (14,000)

An example of this Google blindness is this link: http://blogpictures.members.winisp.net/saas.pptx which at this time can be found through Live Search and through Yahoo search but not through Google search.

So where Rob, using Google, cannot find more than 2,000 actual Office Open XML files, I can easily find 124,000 pages (69,000 + 41,000 + 14,000) that contain one or more Office Open XML files.

It becomes easy to manipulate the figures when Google is on your side ?

21 July 2007

OOXML hoax 8: ODF interoperability with MS documents

According to many opponents of Office Open XML, the lack of interoperability between Microsoft document formats and the new ODF format is all because Microsoft did not cooperate with the OASIS TC when it was developing ODF.
Strangely enough, it seems that Gary Edwards, who actually was a member of the OASIS TC developing ODF, has a totally different view on why ODF has no features for compatibility with Microsoft Office documents:

"For the near five years that i have been a member of the OASIS ODF TC, Sun has opposed any and all efforts to improve interoperability with Microsoft applications, documents, and bound workgroup-workflow business processes.
This goes all the way back to the very first TC meeting on December 14th, 2002, when the enterprise publication, content and archive management systems contingent of the OASIS TC wanted the "proposed" ODF charter amended to include as one of the primary objectives, "compatibility with existing file formats and interoperability with existing applications".
And yes, that proposed charter change specifically included compatibility and interoperability with Microsoft applications, documents and processes!!
Sun opposed that change and has consistently opposed all interoperability enhancements since."


But luckily it is still all Microsoft's fault, at least according to Gary:
"Someone needs to go back to that 2004 agreement between Microsoft and Sun. You know, the one that saved Sun the company! There is clear evidence, stretching throughout the years of ODF discussions, that Sun has traded ODF universal interoperability for a sweet sweet hardware deal with Microsoft. Overwhelming evidence."
Which is a bit strange, because he also claims that Sun started blocking interoperability with MS Office documents on the day the charter was made in 2002, when Sun and Microsoft were still bitter rivals in court.

So what about ODF interoperability now:
"There are three characteristics Sun has steadfastly opposed. And now we finally have an explanation other than that the StarOffice Hamburg group was terminally "stuck in 1995".
These characteristics are important because the world is not a clean slate. Microsoft Office controls over 95% of the existing documents, applications and bound workgroup-workflow business processes.
...
Without these three bridge characteristics, ODf becomes impossible to implement given where the world today finds itself – 95% bound to MSOffice:
... Compatibility with existing file formats – including MS binary documents
... Interoperability with existing applications – including MSOffice applications
... Convergence :: the application-platform-vendor independent portable file format ability to fluidly and transparently transition desktop-server-device-web information systems.
Sun's opposition to and failure to support the interoperability enhancements to ODf that would have addressed these concerns is a matter of public record"


There is a sound of bitter disappointment in the lack of interoperability in ODF, but also, and especially, disappointment in Sun being responsible for it:
"But what about those of us who really believed that ODf could become that elusive universal file format, and spent years trying?"
And referring to Sun:
"They sold us out!"

01 May 2007

Has IBM annexed the Kenyan ISO National body ?

During the 30-day contradictions period in the ISO standardization process 20 countries responded. Several of the responses raised concerns about the OOXML fasttrack standardisation procedure, and a few already took a positive stance on the format. About 6 of the responses were also negative about the format itself, raising several specific issues with the format specifications.

When reading through the issues raised by the national bodies, a lot of them seem to be directly related to stories written by IBM's Rob Weir and OASIS lawyer Andy Updegrove. It also seems that several of the reactions by the national bodies raise almost identical issues in almost identical wording. It looks as if a lot of the issues were written by the same people or copied from them.

A strange development in this is that the author of the Kenyan response to ISO seems to be an IBM employee from Germany who also represents IBM in the German ISO national body (DIN): http://blogs.msdn.com/brian_jones/archive/2007/04/20/a-few-updates-on-the-openxml-formats.aspx . The 'Kenyan' response is the most extensive of the ISO national body responses and, maybe not surprisingly, contains a lot of the issues that were raised before by IBM's Rob Weir.

This is all the more surprising as it seems that the German IBM employees also actively tried to persuade the German DIN committee to write a negative response to ISO.
http://www.ictstandardization.com/news/200704/article20070406.html

The question then is how the issues written by IBM Germany's DIN member, which were not raised in the German response, seem to have ended up in the Kenyan response.
Is the Kenyan ISO national body easier for IBM to 'influence'?

16 March 2007

OOXML hoax 7: The binary blob

Regularly when OOXML is discussed, it is referred to as XML that can contain binary blobs. In the past, the Office 2003 XML formats did indeed contain embedded binary data within the file. This was of course due to the fact that this format consisted of a single XML file. A big change in OOXML is that the embedded binary formats are now put in the package as separate files.
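
To make the "separate files in the package" idea concrete, here is a minimal Python sketch. It assumes only that a package is a ZIP archive; the part names (word/document.xml, word/media/image1.png) are hand-picked for illustration and the sample package is built in memory rather than taken from a real document.

```python
import io
import zipfile


def build_sample_package() -> bytes:
    """Build a tiny ZIP that mimics a package layout (illustrative part names)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # The XML part holds markup only.
        zf.writestr("word/document.xml", "<document>text only</document>")
        # The binary content lives in its own part, not inside the XML.
        zf.writestr("word/media/image1.png", b"\x89PNG\r\n\x1a\n")
    return buf.getvalue()


def split_parts(package: bytes) -> tuple[list[str], list[str]]:
    """Return (xml_parts, other_parts), classified by file extension."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        names = zf.namelist()
    xml_parts = [n for n in names if n.endswith((".xml", ".rels"))]
    other_parts = [n for n in names if n not in xml_parts]
    return xml_parts, other_parts


xml_parts, other_parts = split_parts(build_sample_package())
print("XML parts:   ", xml_parts)
print("Binary parts:", other_parts)
```

Pointing split_parts at the bytes of a real .docx file should give a similar split, although the exact part names depend on the application that wrote the file.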

So is binary data within XML a thing of the past?
No, not really, you can still find it here:
http://docs.oasis-open.org/office/v1.1/OS/OpenDocument-v1.1.pdf
Yes, right in the middle of the OpenDocument specification.
It is called the office:binary-data element.
It holds base64-encoded binary data within the XML.
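
As a rough illustration of what that means in practice, the Python sketch below wraps some bytes in an office:binary-data element and reads them back. The XML fragment is hand-written for this example (a real ODF file would have the element nested inside a larger document), and the payload bytes are made up.

```python
import base64
import xml.etree.ElementTree as ET

# Namespace of the ODF "office" prefix.
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"

# Made-up binary payload standing in for an image or embedded object.
payload = b"\x00\x01binary\xff"
encoded = base64.b64encode(payload).decode("ascii")

# Hand-written fragment: the binary data sits inside the XML as base64 text.
xml_text = (
    f'<office:binary-data xmlns:office="{OFFICE_NS}">'
    f"{encoded}"
    "</office:binary-data>"
)

element = ET.fromstring(xml_text)
decoded = base64.b64decode(element.text)

print(element.tag)
print(decoded == payload)
```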

So why does no one ever mention that ODF can contain binary data within the XML ?

11 March 2007

OOXML hoax 6: ISO fasttracking requires a perfect format

In the Office document format war there is a lot of criticism of the OOXML format specifications, especially now that Ecma has submitted the format for ISO fasttracking.

It seems like many people do not understand the purpose of ISO fasttracking standardization. The fasttracking method is a standardization process to easily guide existing industry standards into an ISO standard. It is not meant for creating a new standard from scratch but to get existing technology which has a broad basis into ISO.

ISO standards are meant to be used, and ISO has a pragmatic policy that it needs to provide standards for which there is a market requirement. See also this post by Rick Jelliffe on ISO standards: http://www.oreillynet.com/xml/blog/2007/02/what_is_a_standard_at_iso_1.html
OOXML is a standard originating in the MS Office 2003 XML formats SpreadsheetML, WordprocessingML and PresentationML, formats which first showed up in the August 2000 Office XP beta. This means that the formats are already well established in describing the features of Office documents. Important differences between OOXML and those old formats are the use of the Open Packaging Conventions and Markup Compatibility and Extensibility, and the removal of binary content from the XML files by putting embedded (binary) files in the package as separate files. Also, the old markup languages have been augmented in the OOXML specs with tons of examples and a good structure for implementation, with parent and child elements defined everywhere. And finally the VML vector language format is being replaced by the new DrawingML format (although VML is kept in the spec for compatibility with the MS Office 2003 XML formats that used VML).

This is an example of why the format is not perfect. It carries with it the burden of backwards compatibility and the extensive set of features used in MS Office. But it also combines this with a spec in which a lot of possible issues have already been dealt with. It would be very hard to stamp out a completely new format that could be used by an Office suite like MS Office. Certainly a spec like ODF, which is pretty good, would not suffice, as it leaves a lot of things still undefined or up to the implementation. ODF is improving on this by improving its specs and extending them in newer versions, and also by trying to create a set of reference documents to further define its implementations. Still, it has proved very hard to get an implementation that implements the entire ODF spec, even 20 months after it was standardized by OASIS and even without looking at formulas. http://testsuite.opendocumentfellowship.org/summary.html . This shows that implementing a complex Office spec, even when you already have a full Office suite as a basis, is a process that can take years.

This is basically what Microsoft has done with its Office format. It has taken more than 6 years to move its Office suite towards a full XML implementation that is also backwards compatible with its billions of legacy documents. Then it opened up this format for everyone to use by standardizing it through Ecma. The format does carry some scars from that development, but it is also a spec that is in full use, with many of its key markup elements, like the formulas, having proven themselves over a longer period of time. In ISO fasttracking that is important, as it shows that the format has pedigree in the real world and that it has use in the market of today.

OOXML is a format that has a foundation in the market and has had most of its features proven in the last 6 years. It is not perfect, but then again neither is its competitor, which is still being worked on. For the ISO national bodies that market foundation and proven track record are important aspects for approving this format. The Ecma standard does have several newer, less proven elements for sure, which have to be considered in this process, but as of now those new elements seem to be the ones least criticised, which also shows that the development of the format is improving toward the future.

The ISO fasttracking process does not need perfection, but it relies on a standard that is based on existing technology that will be massively used in the future. This is exactly what OOXML will prove to be, and I think that is why ISO will approve it as an ISO standard despite the protests from the open source community and MS competitors.

The Wraith

05 March 2007

Banned from Groklaw

A first for me in 20 years of Internet. I got banned from a public discussion.
I was banned from Groklaw because of excessive commenting and using unacceptable language. I do admit having posted quite a few comments on Groklaw, probably more than 50 or so in the last month. How should I know what would be excessive? I consider the quality of the information on Groklaw's objections pages to be very poor and one-sided, and the initiative of Groklaw to start a mailing campaign towards ISO very pathetic. That might have shown in lots of critical comments on Groklaw, where people often refer to the objections pages as if they were a bible. A few quick comments are easily made then.

I do have serious objections, though, against being accused of unacceptable language. The worst I remember is probably stating once or twice that something is complete nonsense or bull or something in that order. Nothing which would cause anyone a headache. I have posted on probably about 30 blogs or sites about OOXML in the last half year, and most bloggers seem very open to another opinion about OOXML. The 'not so open' blog that is Groklaw, however, isn't too pleased with pro-OOXML comments. I wish they themselves were a bit more open about that.

And if anyone from Groklaw reads this: yes, I use multiple IP addresses, as my home, my girlfriend, my parents, my friend, my work and sometimes an available open wireless network all have different addresses. I mostly post anonymously, as registering everywhere is a lot of effort nowadays, and Groklaw also removed quite a few posts that were clearly signed with The Wraith and a link to this blog. But PJ, it is fine if you and your minions want to censor the answers on Groklaw to be just pro-Groklaw. I'll find other platforms that do appreciate reasonable two-way discussion on the OOXML debate.

The Wraith

02 March 2007

Ecma responded to OOXML contradictions

Ecma has responded to the issues raised by the national bodies during the 30-day contradictions period in the fasttracking process.

The Ecma response in PDF form can be found here:
http://www.computerworld.com/pdfs/Ecma.pdf
It also contains the issues raised by the national bodies.

Astounding is that the comments of the national bodies sometimes seem to be copied directly from the Grokdoc objections pages. The suggestions on those Grokdoc pages are mostly dubious at best, and comments on the discussion pages, for instance by XML expert Rick Jelliffe, are ignored.
http://www.grokdoc.net/index.php/Talk:EOOXML_objections#Bitmasks_cause_significant_validation_problems

25 February 2007

OOXML hoax 5: Microsoft could have participated in ODF development

For those who think Microsoft was seriously asked to participate, I will show you the OASIS document asking other parties in OASIS to participate, just to show you how totally wrong that suggestion is:
http://lists.oasis-open.org/archives/tc-announce/200211/msg00001.html
Look at the name of the committee then in November 2002.
"Open Office XML TC"
Notice that the name was later changed to Open Document to lose the obvious connection to OpenOffice in the name of the format.

Also from that call for participation, so from even before the TC started:
"Since the OpenOffice.org XML format specification meets these criteria and has proven its value in real life, this TC will use it as the basis for its work.
Sun Microsystems intends to contribute the OpenOffice.org XML Format to this TC at the first meeting of the TC, under reciprocal Royalty Free terms."

So there was a set of preset criteria matching the OOo format specs, and Sun committed to contributing that format to the OASIS committee one month before the start of the committee. Yes, that is an extremely open way to ensure that this TC only ever was about the OOo format.

Also, the call for participation already contains a full list of suggested members, with four Sun employees in the proposed committee but of course no one from market leader Microsoft.

Does this look like a committee that Microsoft could ever have joined, or that anybody seriously wanted them to join? No, of course not.
This was before governments, for instance in Massachusetts, suggested open formats might be required for governments, and also before the EU suggested ISO standardisation.
This was at a time when OpenOffice was looking for interoperability amongst OSS. Interoperability with MS Office was even set aside at the first meeting of the TC:
"The TC agreed that transformability into potential Microsoft office XML formats could be sensible, but is not a formal requirement."

Not that I want to state that Microsoft would have been easily persuaded to participate in such a document standardisation process. There was only limited reason for Microsoft to use a standardised format in MS Office, as nobody was really demanding it and as at the time they were already well on their way to making their own XML-based format for MS Office. That format was already shown two years earlier, in August 2000, in an Office XP beta that contained an early version of SpreadsheetML, which is currently still a part of Ecma Office Open XML.
So the OASIS effort by Sun and other OpenOffice supporters might in 2002 also be seen as an answer to the Microsoft XML formats that were starting to come out of MS Office and might become commonplace as they are now with the introduction of default XML formats in Office 2007.

For Microsoft to participate in a standardisation process, it would have required at least the following 3 things.

  • Firstly a need to standardize the format, like pressure from governments or their customers. I think that the transfer to the MS Office XML formats starting in 2000 already showed a slight move toward their customers in opening up data in their binary format to a more readable format but probably no need for formal standardization was considered or even asked for at that time.
  • Secondly the format would need good backwards compatibility with the billions of existing MS Office documents. It would be very hard to push through an Office product that did not have backwards compatibility written all over it.
  • Thirdly it would need to support MS Office functionality as much as possible without immediately extending the format.
The OASIS call for participation by the Sun/OpenOffice-dominated TC would not have been relevant to Microsoft on any of those three issues. There was never a chance that Microsoft would have responded to such a call by joining work, due to start a month later, that was based on an OpenOffice format.

Seriously, Microsoft participation in the "Open Office XML TC" was never going to happen on the basis of a short-term call for participation in a TC that had pre-decided on the OpenOffice format and was dominated by a competitor.
People suggesting otherwise, that Microsoft actually could have participated on such a venture are just bullshitting their audience.

The Wraith

20 February 2007

OOXML hoax 4: The standard requires supporting of proprietary MS formats

Another issue raised by IBM and by standards and OSS lawyer Andy Updegrove is that the OOXML specifications would require implementers to also implement proprietary formats and methods of Microsoft that are not licensed under the OOXML licensing.

Andy mentions in his blog post "The Contradictory Nature of OOXML" several items from the OOXML spec, like 6.2.3.17 "Embedded Object Alternate Image Requests Types" referencing WMF files and 6.4.3.1 "Clipboard Format Types". Further down he mentions, for instance, OLE embedding and macros/scripts. He is of course correct that the OOXML spec mentions these things. But does it mean that implementing these features is required if you implement OOXML?
No, of course not.

Firstly, as I have already mentioned in an earlier post, when implementing OOXML you are not required to implement any parts other than stated in the conformance paragraph. You can leave out anything you want.

Secondly, even if you create a FULL implementation of OOXML, you still do not need to implement these features, as these references are all references to external formats that are therefore not a part of OOXML. They are references to fully optional formats that you can embed, just like you can embed image formats or other media formats.

Thirdly, implementing external formats is something that is application specific and is not limited to MS formats. You can in fact embed any proprietary or open format into OOXML files. OOXML only uses some of the mentioned references to distinguish between format types, so you can differentiate between an embedded image file and an embedded Windows Metafile or a clipboard file. A method to distinguish between foreign file types, however, is not the same as adding the external format to the spec. If it were, would a simple line in the spec stating "filetype:picture" require support for all picture formats in existence?

Fourthly, you can also embed the same formats that are mentioned in OpenDocument. ODF has about 6 methods of embedding external foreign file formats in ODF files. You can embed the same Windows Metafiles or clipboard file formats just as easily in ODF as you can in OOXML. You can use macros/scripts in ODF and, yes, even use OLE linking/embedding of external file formats in ODF. Maybe Andy should have complained about ODF supporting all of those formats as well during the ISO standardisation of ODF. Especially the fact that Sun, next to regular external object embedding, has managed to add specific embedded Java objects to ODF (§9.3.4) seems quite weird.

Basically the opponents of OOXML claim that embedding foreign files is fine for ODF but it isn't fine when files are embedded in OOXML because then Microsoft should have licensed those formats even though it might be about exactly the same files.
Is that either highly hypocritical criticism or just ignorance ?
I think I know the answer to that...

19 February 2007

OOXML hoax 3: The standard requires cloning of old proprietary behaviour

This suggestion probably originates from IBM's blogger Rob Weir and his Guillaume Portes article, which is one of the most cited pieces in articles about OOXML.

The issue is about the OOXML elements in section 2.15 of the OOXML specs. This section describes compatibility items for documents created in the past with older office versions or even with WordPerfect. The tags should only exist in converted legacy documents and should not be used in any new documents.

The first thing to say is that OOXML does not require any implementation to support these tags. There is little reason to support these tags for virtually any application.

Secondly, IBM's Rob Weir argues that "these legacy tags are some of the most important ones in the specification". Of course he would, because he wants OOXML to look bad. The fact is that applications like OpenOffice currently manage fine in converting most .doc files without supporting these compatibility features. If implementing the compatibility tags is really so important, then people could interpret Rob's statement as saying that you should never use an implementation that doesn't fully support the compatibility features for legacy documents, and should stay with what you have.

Thirdly, the compatibility tags are about rendering Office documents. Rendering behaviour is not described in OOXML (or in ODF for that matter). It would be very strange to describe exactly how the compatibility tags have to be rendered while the normal rendering is not described. Office document rendering is always implementation specific.

Fourthly, these tags describe that converted documents may have rendered differently when they were originally created in their original application. Even for applications that are not interested in recreating that actual rendering behaviour, these tags can be valuable. When opening a file and parsing these tags, an implementing application can give the user a warning that the document looked different in its original form. This could be important for evaluating conversions of documents that may require very faithful representations, like notary documents or documents that archive printed works.
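
A minimal Python sketch of such a warning could look like the following. It assumes the compatibility tags appear as children of a w:compat element in the WordprocessingML settings part; the sample XML is hand-written for this example, the two tag names are used purely as illustrations, and the warning text is made up.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hand-written sample of a settings part carrying two compatibility tags.
SAMPLE_SETTINGS = f"""<w:settings xmlns:w="{W_NS}">
  <w:compat>
    <w:autoSpaceLikeWord95/>
    <w:footnoteLayoutLikeWW8/>
  </w:compat>
</w:settings>"""


def compatibility_warnings(settings_xml: str) -> list[str]:
    """Return one warning per compatibility tag; nothing is rendered."""
    root = ET.fromstring(settings_xml)
    compat = root.find(f"{{{W_NS}}}compat")
    if compat is None:
        return []
    warnings = []
    for child in compat:
        # Strip the "{namespace}" prefix that ElementTree puts on tag names.
        local_name = child.tag.rsplit("}", 1)[-1]
        warnings.append(
            "This document may have looked different in its original "
            f"application ({local_name})."
        )
    return warnings


warnings = compatibility_warnings(SAMPLE_SETTINGS)
for message in warnings:
    print(message)
```

The point of the sketch is that the application only has to recognise the tags, not reproduce the old behaviour behind them.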

Ah, but ODF does not have this legacy compatibility shit. Why put it into a standard? ODF has chosen a different approach for dealing with legacy compatibility. It just doesn't define it but leaves it implementation specific.
For instance OpenOffice uses custom application settings like:
UseFormerLineSpacing
UseFormerObjectPositioning
UseFormerTextWrapping
UseOldNumbering

and a lot more...
So OOXML has a lot of deprecated tags that are almost impossible to implement, but ODF has actually created a special standard tag for stuff that you can't possibly implement. ODF has actually standardized "Application settings" tagging that cannot be interoperable, because the settings inside such 'standard' tags are not defined. And this stuff is not deprecated. You can make new documents with it.
If Microsoft were to use ODF, they could use the "Application settings" to create fully standard-compliant documents and still make them impossible to interoperate with. At least in OOXML the extent of the compatibility settings is known, and when a compatibility issue is found it can be recognized and the user warned accordingly. In ODF files that use the "Application settings" for compatibility, the only thing you can do is warn the user that the document can behave differently because it has separate undefined settings for different applications.
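
To show what an application can and cannot do with these settings, here is a small Python sketch. It assumes the settings are stored as config:config-item elements in the ODF config namespace; the sample XML, the set name and the values are hand-written for this example and not taken from a real OpenOffice file.

```python
import xml.etree.ElementTree as ET

# Namespace of the ODF "config" prefix.
CONFIG_NS = "urn:oasis:names:tc:opendocument:xmlns:config:1.0"

# Hand-written sample with two of the setting names listed above.
SAMPLE = f"""<config:config-item-set xmlns:config="{CONFIG_NS}"
    config:name="ooo:configuration-settings">
  <config:config-item config:name="UseFormerLineSpacing"
      config:type="boolean">false</config:config-item>
  <config:config-item config:name="UseOldNumbering"
      config:type="boolean">true</config:config-item>
</config:config-item-set>"""


def application_settings(xml_text: str) -> dict[str, str]:
    """Collect setting names and raw values; their meaning stays undefined."""
    root = ET.fromstring(xml_text)
    name_attr = f"{{{CONFIG_NS}}}name"
    return {
        item.get(name_attr): (item.text or "")
        for item in root.iter(f"{{{CONFIG_NS}}}config-item")
    }


settings = application_settings(SAMPLE)
for name, value in settings.items():
    print(f"{name} = {value} (meaning is up to the application that wrote it)")
```

A reader can list the names and values, but nothing in the file says what another application should do with them.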

Two solutions for ugly old settings. Both can lead to very ugly documents. But at least the OOXML stuff is only for the past, the ODF "application settings" can keep haunting us for a long time to come...