MS Office Forum / Word / Long Documents / August 2006

Word doc analyzer tool?

Graham Wideman [Visio MVP] - 16 Aug 2006 23:56 GMT
Folks:

Are there any useful tools for analyzing and troubleshooting Word docs?

I'm thinking along the lines of a tool that might read (via Automation) all
the data in the Word document model, and present all the data in some
intelligent fashion -- suitable for troubleshooting oddball problems.

Anything in that neighborhood?

Graham

John McGhie [MVP - Word and Word Macintosh] - 17 Aug 2006 13:34 GMT
Hi Graham:

Yes.  It's a tool called "Word" :-)

Save the document as "Web Page" (better: as "XML").

Web Page saves as HTML (with Word-specific markup), which is somewhat easier
for humans to read.  XML is easier for machines to read.  Either of them will
get you the entire content of the document.

Alternatively, you can save the document as RTF.  RTF is very close to
Word's native format.  It's huge and very convoluted, but if you open RTF in
a Text editor, you can see exactly what's in there.
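As a rough illustration of what "seeing what's in there" can look like, here is a minimal Python sketch that tallies the control words in RTF text and reports how deeply its groups nest. The function name summarize_rtf is made up for this example, and the scan assumes plain RTF with no embedded \bin data, so treat it as a starting point rather than a real RTF parser.

```python
import re
from collections import Counter

# One token per match: a control word (\par, \f0), a control symbol
# (\{, \}, \\, \'), or a bare group brace.
TOKEN = re.compile(r"\\([a-zA-Z]+)(-?\d+)?|\\(.)|([{}])", re.DOTALL)


def summarize_rtf(rtf_text):
    """Return (control-word counts, deepest group nesting) for RTF text.

    Hypothetical helper for illustration; it ignores \\bin payloads.
    """
    words = Counter()
    depth = deepest = 0
    for match in TOKEN.finditer(rtf_text):
        word, _param, _symbol, brace = match.groups()
        if word:
            words[word] += 1
        elif brace == "{":
            depth += 1
            deepest = max(deepest, depth)
        elif brace == "}":
            depth -= 1
    return words, deepest


if __name__ == "__main__":
    sample = r"{\rtf1\ansi{\fonttbl{\f0 Arial;}}\pard\f0 Hello\par}"
    counts, deepest = summarize_rtf(sample)
    print(counts.most_common(5), "max depth:", deepest)
```

An unusually high count for one control word, or very deep nesting, is only a hint about where to look in the file, not a diagnosis.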

Note that a Word document can include a large number of binary objects such
as graphics.

A couple of caveats:

1)  If there's anything much wrong with the document, Automation cannot read
it either.  Automation depends upon the internal collections in the document
object model being present and intact.  If they're not, the object model
collapses.

2)  Even if you get the data out, it becomes a huge job to try to analyse
what's wrong.  Many of the problems you get in a Word document are due to
excessive levels of abstraction overflowing internal buffers.  The code may
be "legal", but it becomes so complex that Word runs out of memory trying to
read it.
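Caveat 1 suggests that any Automation-based analyzer should expect individual collections to fail. A minimal Python sketch of that defensive approach follows; probe_collections is a hypothetical helper, and the collection names are ordinary members of Word's Document object, each of which exposes a Count property. The document object itself is passed in by the caller.

```python
# Collections on Word's Document object that expose a Count property.
DEFAULT_COLLECTIONS = (
    "Paragraphs", "Tables", "Fields", "Bookmarks", "Sections", "Styles",
)


def probe_collections(doc, names=DEFAULT_COLLECTIONS):
    """Try to read each collection's Count; record failures instead of raising.

    Hypothetical helper for illustration. `doc` is any object shaped like a
    Word Document, for example one obtained through COM Automation.
    """
    report = {}
    for name in names:
        try:
            report[name] = getattr(doc, name).Count
        except Exception as exc:  # a damaged collection may raise anything
            report[name] = f"unreadable: {exc}"
    return report
```

On Windows with the pywin32 package installed, doc could come from win32com.client.Dispatch("Word.Application").Documents.Open(path). A collection that reports "unreadable" narrows down where the damage is, although a badly damaged document may not open at all.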

XML gives you your best shot: if you get the document out to XML, you can
read and correct most things if you know WordML very well.  Regrettably, if
there's much wrong with the document, the XML output filter will fail to
complete the save.
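If the XML save does complete, the result can be inspected with any XML library. The sketch below, in Python, counts paragraphs by style name. It assumes the Word 2003 XML format (WordML) and its w: namespace; the function name paragraph_styles is invented for this example, and later Word versions use a different namespace.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace of the w: prefix in Word 2003 XML (WordML).
W_NS = "http://schemas.microsoft.com/office/word/2003/wordml"


def paragraph_styles(xml_text):
    """Count paragraphs per style in a WordML document given as a string.

    Hypothetical helper for illustration; paragraphs without a w:pStyle
    element are grouped under "(no explicit style)".
    """
    root = ET.fromstring(xml_text)
    styles = Counter()
    for para in root.iter(f"{{{W_NS}}}p"):
        style = para.find(f"{{{W_NS}}}pPr/{{{W_NS}}}pStyle")
        if style is not None:
            name = style.get(f"{{{W_NS}}}val")
        else:
            name = "(no explicit style)"
        styles[name] += 1
    return styles
```

A style that turns up far more or far less often than expected is one place to start reading the markup by hand.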

Sorry!

On 17/8/06 8:56 AM, in article #yBXlcYwGHA.2204@TK2MSFTNGP03.phx.gbl,

> Folks:
>
[quoted text clipped - 7 lines]
>
> Graham

Please reply to the newsgroup to maintain the thread.  Please do not email
me unless I ask you to.

John McGhie <john@mcghie.name>
Microsoft MVP, Word and Word for Macintosh.  Consultant Technical Writer
Sydney, Australia +61 (0) 4 1209 1410

