This is the mail archive of the xsl-list@mulberrytech.com mailing list.
RE: Re: Any Doc to XML converter ?
- To: xsl-list at lists dot mulberrytech dot com
- Subject: RE: [xsl] Re: Any Doc to XML converter ?
- From: sara dot mitchell at ps dot ge dot com
- Date: Tue, 19 Jun 2001 13:35:03 -0400
- Reply-To: xsl-list at lists dot mulberrytech dot com
Well, I understand why Microsoft thinks this (although I violently
disagree):
> From a recent MSDN article "Export a Word Document to XML" by
> Kevin McDowell
> (http://msdn.microsoft.com/library/techart/odc_expwordtoxml.htm)
>
> "The XML output by this application is very straightforward
> and very similar to the
> HTML output by Word itself, but it fully accounts for all
> styled text, tables, and
> lists. "
>
> and
>
> "Conclusion
> This solution provides a starting point to build an XML
> parser for Word documents.
> In addition to the XML functionality, it discusses how to
> build custom objects to
> handle sequential instances of all styles and graphics and
> how to loop through
> tables and lists. Remember, documents shouldn't be converted
> to XML merely for the
> sake putting them in XML. The best document to convert to XML
> is one that makes use
> of styles and will be reused in other ways."
But from my perspective that is grossly misleading. The HTML that
Word exports is trash, full of Microsoft-specific extensions that
most web sites don't want. So if their XML is similar, that's
not saying much.
And it --completely-- ignores something even more fundamental, which
is that most people using Word to create documents couldn't care less
about good structure, consistency, and much of the modelling that
makes information truly reusable. If you start with trash, guess what
you end up with? So the XML from Word isn't going to get people the
benefit they think (and that this article implies).
Sara Mitchell
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list