This is the mail archive of the xsl-list@mulberrytech.com mailing list .



RE: Re: Any Doc to XML converter ?


Well, I understand why Microsoft thinks this (although I violently
disagree): 

> From a recent MSDN article "Export a Word Document to XML" by 
> Kevin McDowell
> (http://msdn.microsoft.com/library/techart/odc_expwordtoxml.htm)
> 
> "The XML output by this application is very straightforward 
> and very similar to the
> HTML output by Word itself, but it fully accounts for all 
> styled text, tables, and
> lists. "
> 
> and
> 
> "Conclusion
> This solution provides a starting point to build an XML 
> parser for Word documents.
> In addition to the XML functionality, it discusses how to 
> build custom objects to
> handle sequential instances of all styles and graphics and 
> how to loop through
> tables and lists. Remember, documents shouldn't be converted 
> to XML merely for the
> sake putting them in XML. The best document to convert to XML 
> is one that makes use
> of styles and will be reused in other ways."

But from my perspective that is grossly misleading. The HTML that
Word exports is trash, full of Microsoft-specific extensions that
most web sites don't want. So if their XML is similar, that's
not saying much. 

And it --completely-- ignores something even more fundamental, which 
is that most people using Word to create documents couldn't care less 
about good structure, consistency, and much of the modelling that 
makes information truly reusable. If you start with trash, guess what
you end up with? So the XML from Word isn't going to get people the
benefit they expect (and that this article implies). 

Sara Mitchell

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

