This is the mail archive of the docbook@lists.oasis-open.org mailing list for the DocBook project.



Re: MS files included with elements?


Galen Boyer wrote:

> I guess it might look that way, but I don't think I am.  I know
> what the authors say in the book as well, and I see the point of
> giving up control of formatting and style and why this is such a
> pain point.  But that seems to apply when one is starting the
> document in docbook.  It doesn't seem to apply when one is
> converting existing wysiwyg documentation.

Well, I think I can add a bit here.
I live in Spain, and for the last two years I worked at the Larousse
publishing company, converting an existing 16-volume encyclopedia into
SGML.
I can say that converting an existing text into SGML is not an easy
task.
SGML is structured and formal. Humans aren't structured and formal all
the time.
That means you'll find that perhaps 10% of the document is not
straightforwardly convertible into SGML. It will have to be modified.

> Most of the time, in a wysiwyg editor, it is pretty clear why
> someone has used a certain emphasis.  Looking at a wysiwyg page,
> one should be able to make the translation to docbook because
> there is meaning to the original author's wysiwyg organization and
> style.  I just am lost sometimes, because I know the meaning the
> author of the word doc is trying to get across but can't come up
> with the corresponding tag which would support that meaning.

This is just a matter of a) knowing the meaning of all the tags, and
b) adapting the DTD to your needs.

> I will get it in time, but I was hoping there was some reference
> point for making those translations easier.  It is probably very
> similar to trying to translate between languages when there isn't
> a direct one-to-one translation.

I don't think there is any help for that.
Just one question. You say "those translations". Are you perhaps
translating many documents?
If that is the case, I would consider investing a bit of time in some
automation.
In my company we used Omnimark, a very good program for working with
SGML. It was free at the time, but now it has to be bought.
Anyway, if you have many documents of the same kind, you can convert
them into ASCII or RTF or any format you know, and use a pattern-matching
method: find text and add markup around it.
If the documents are well written and the patterns are correct, you will
get perhaps 75% of the work done.
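
To make the pattern-matching idea concrete, here is a rough sketch in
Python rather than Omnimark. It assumes a plain-text export that follows
some conventions I am making up for the illustration: chapter headings
look like "Chapter N. Title", section headings like "N.N Title", and
emphasis is marked with *asterisks*. The function name to_docbook and
the three patterns are hypothetical; your own documents will need their
own patterns.

```python
import re
from xml.sax.saxutils import escape

# Assumed (hypothetical) conventions of the exported text:
#   "Chapter N. Title" -> <chapter>, "N.N Title" -> <sect1>, "*word*" -> <emphasis>
CHAPTER = re.compile(r"^Chapter\s+\d+\.\s+(.*)$")
SECTION = re.compile(r"^\d+\.\d+\s+(.*)$")
EMPHASIS = re.compile(r"\*([^*]+)\*")


def to_docbook(text):
    """Wrap blank-line-separated blocks of plain text in DocBook markup."""
    out = []
    stack = []  # (depth, tag) pairs for the structural elements still open

    def close_from(depth):
        # Close every open element at this depth or deeper.
        while stack and stack[-1][0] >= depth:
            out.append(f"</{stack.pop()[1]}>")

    def open_element(tag, depth, title):
        close_from(depth)
        stack.append((depth, tag))
        out.append(f"<{tag}><title>{escape(title)}</title>")

    for block in re.split(r"\n\s*\n", text.strip()):
        line = " ".join(block.split())  # join wrapped lines of one block
        if m := CHAPTER.match(line):
            open_element("chapter", 0, m.group(1))
        elif m := SECTION.match(line):
            open_element("sect1", 1, m.group(1))
        else:
            # Escape &, <, > first, then add the emphasis markup.
            para = EMPHASIS.sub(r"<emphasis>\1</emphasis>", escape(line))
            out.append(f"<para>{para}</para>")
    close_from(0)
    return "\n".join(out)


if __name__ == "__main__":
    sample = (
        "Chapter 1. Getting Started\n\n"
        "1.1 Installation\n\n"
        "Run the *setup* program & follow the prompts.\n\n"
        "1.2 Configuration\n\n"
        "Edit the settings file."
    )
    print(to_docbook(sample))
```

A paragraph that happens to begin with something like "3.5 times" would
be taken for a section heading here, and emphasis alone cannot tell a
filename from a foreign phrase. That kind of case is the part of the
work that still has to be checked and fixed by hand.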

> Yes, I am currently trying to work with catdoc and catdoc.el for
> this purpose, but I end up having to look at the word doc for the
> formatting to get the "meaning" :-(

You will always have to go back to the source to get the meaning of it,
of course.

> Sort of like, if I gave you the list,
>
> ...
>
> How would you, without the common body of knowledge, know that
> the list should have had the indentation
>
> ...
>
> Without the wysiwyg, I get the first list.  The wysiwyg gives me
> the second list.  It is then my job to convert that to chapters
> and sections in docbook.

Without the wysiwyg, you would need a fortune-teller.



