If I had that problem, I would convert as many of them as I could to HTML, run 'tidy' to clean up the HTML,
and then run the DocParse tool from www.commandprompt.com to convert them to DocBook.
That sounds like the easiest option, since the majority of them are
in HTML format. Tidy can get everything to the same version of HTML.
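
For the tidy pass over a whole tree of files, something like the following rough sketch might do. It assumes the tidy binary is on the PATH and that rewriting the files in place is acceptable; the function names tidy_command and tidy_tree are made up for illustration. The flags used (-q, -m, -ashtml) are standard tidy options, but check them against your installed version.

```python
import shutil
import subprocess
from pathlib import Path


def tidy_command(path):
    """Build the tidy command line for one file (hypothetical helper)."""
    # -q: suppress non-essential output
    # -m: modify the file in place
    # -ashtml: write the result as HTML rather than XHTML
    return ["tidy", "-q", "-m", "-ashtml", str(path)]


def tidy_tree(root):
    """Run tidy on every .htm/.html file under root; return files with errors."""
    if shutil.which("tidy") is None:
        raise RuntimeError("tidy is not on PATH")
    failed = []
    for path in sorted(Path(root).rglob("*.htm*")):
        # tidy exits with 0 on success, 1 on warnings, 2 on errors
        result = subprocess.run(tidy_command(path))
        if result.returncode > 1:
            failed.append(path)
    return failed
```

Calling tidy_tree("docs") would then return the list of files tidy could not repair, which are the ones worth fixing by hand before the DocBook step.
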
Somewhere I have a Perl script that converts plain ASCII text to HTML
2.0. Tidy can then clean up and upgrade the results to HTML 4.01.
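
In case the script does not turn up, here is a minimal stand-in in Python. It is not the Perl script mentioned above, just a sketch under the assumption that paragraphs in the text files are separated by blank lines. The function name text_to_html and the default title are invented for the example. It produces bare-bones HTML with no DOCTYPE, on the expectation that tidy adds one and cleans up the rest.

```python
import html


def text_to_html(text, title="Untitled"):
    """Wrap plain text in minimal HTML, one <p> per blank-line-separated block."""
    paragraphs = []
    for block in text.replace("\r\n", "\n").split("\n\n"):
        block = block.strip()
        if block:
            # Escape &, < and > so the text cannot be mistaken for markup
            paragraphs.append("<p>" + html.escape(block, quote=False) + "</p>")
    body = "\n".join(paragraphs)
    return (
        "<html>\n<head><title>" + html.escape(title, quote=False) + "</title></head>\n"
        "<body>\n" + body + "\n</body>\n</html>\n"
    )
```

Preformatted material such as ASCII tables or code listings would need a <pre> wrapper instead, which this sketch does not attempt to detect.
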
Most of the formatted documents that are neither HTML nor plain ASCII can be
exported to HTML by whatever application created them.
There is some irony in converting to HTML, then to DocBook, then back
to HTML so that it can be seen on the web.