Hi Graham:
Yes. It's a tool called "Word" :-)
Save the document as "Web Page" (better: as "XML").
Web Page saves as HTML with Word-specific markup embedded, which is somewhat
easier for humans to read. XML
is easier for machines to read. Either of them will get you the entire
content of the document.
Alternatively, you can save the document as RTF. RTF is very close to
Word's native format. It's huge and very convoluted, but if you open RTF in
a text editor, you can see exactly what's in there.
Note that a Word document can include a large number of binary objects such
as graphics.
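
If reading a large RTF file by eye is too much, a short script can give you a
first overview. The following is only a rough sketch in Python (my choice of
language; the function name summarize_rtf is made up for illustration). It
assumes the file has been read in as text, counts the control words, and
reports how deeply the { } groups nest. It does not interpret the RTF. A
count of \pict control words gives a hint of how many pictures are embedded.

```python
import re
from collections import Counter

# An RTF control word is a backslash, some letters, and an optional
# (possibly negative) numeric parameter, e.g. \par, \fs24, \pict.
CONTROL_WORD = re.compile(r"\\([a-zA-Z]+)(-?\d+)?")

def summarize_rtf(text):
    """Return (maximum group nesting depth, Counter of control words)."""
    depth = 0
    max_depth = 0
    words = Counter()
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            # \\, \{ and \} are escaped literal characters, not structure.
            if i + 1 < len(text) and text[i + 1] in "\\{}":
                i += 2
                continue
            m = CONTROL_WORD.match(text, i)
            if m:
                words[m.group(1)] += 1
                i = m.end()
                continue
        elif ch == "{":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == "}":
            depth -= 1
        i += 1
    return max_depth, words

# A tiny hand-written sample, not output from Word.
sample = r"{\rtf1\ansi{\fonttbl{\f0 Arial;}}\pard\f0\fs24 Hello \{world\}\par}"
depth, words = summarize_rtf(sample)
print("deepest nesting:", depth)
print("most common control words:", words.most_common(5))
```

On a real document, an unusually deep nesting level or an enormous count for
one control word at least tells you where to start looking.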
A couple of caveats:
1) If there's anything much wrong with the document, Automation cannot read
it either. Automation depends upon the internal collections in the document
object model being present and intact. If they're not, the object model
collapses.
2) Even if you get the data out, it becomes a huge job to try to analyse
what's wrong. Many of the problems you get in a Word document are due to
excessive levels of abstraction overflowing internal buffers. The code may
be "legal", but it becomes so complex that Word runs out of memory trying to
read it.
XML gives you your best shot: if you get the document out to XML, you can
read and correct most things if you know WordML very well. Regrettably, if
there's much wrong with the document, the XML output filter will fail to
complete the save.
Sorry!
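
If the save to XML does complete, pulling the text back out is the easy part.
Here is a minimal sketch in Python (again my choice; anything with an XML
parser would do). It assumes the Word 2003 "XML Document" format, where
paragraphs are w:p elements and text sits in w:t elements. The function name
paragraphs and the sample document are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by the Word 2003 XML format (WordML).
W = "{http://schemas.microsoft.com/office/word/2003/wordml}"

def paragraphs(xml_text):
    """Return the text of each w:p paragraph, in document order."""
    root = ET.fromstring(xml_text)
    result = []
    for p in root.iter(W + "p"):
        # A paragraph's text is split across runs (w:r), each holding w:t.
        result.append("".join(t.text or "" for t in p.iter(W + "t")))
    return result

# A tiny hand-written sample, far simpler than what Word really writes.
sample = """<w:wordDocument
    xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">
  <w:body>
    <w:p><w:r><w:t>First </w:t></w:r><w:r><w:t>paragraph</w:t></w:r></w:p>
    <w:p><w:r><w:t>Second</w:t></w:r></w:p>
  </w:body>
</w:wordDocument>"""

for text in paragraphs(sample):
    print(text)
```

This only recovers plain text. Formatting, fields, and embedded objects need
a proper knowledge of WordML, as noted above.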
On 17/8/06 8:56 AM, in article #yBXlcYwGHA.2204@TK2MSFTNGP03.phx.gbl,
> Folks:
>
[quoted text clipped - 7 lines]
>
> Graham

Please reply to the newsgroup to maintain the thread. Please do not email
me unless I ask you to.
John McGhie <john@mcghie.name>
Microsoft MVP, Word and Word for Macintosh. Consultant Technical Writer
Sydney, Australia +61 (0) 4 1209 1410