MS Office Forum / Word / Programming / November 2007

What is the best way to read Word text looking for patterns?

Chrisso - 27 Nov 2007 19:39 GMT
Hi All

I have a file system full of Word documents. All the documents have
identifiers that follow a prescribed form:
               CJS/<number>/<number>
               e.g. CJS/10/1023 and CJS/10/2023

I need to find which documents reference other documents in their
text. I am an intermediate Excel VB programmer and hope to do this in
Word VB with the following pseudo-code for a single document:
   * open the document
   * read the text in chunks, one chunk per iteration
   * scan each chunk for text that matches the document number
     pattern described above
   * compile the list as I go and output it to Excel

Is this all possible in Word VB? It seems to me the trickiest part
would be to work out how to read the text in.

Does anyone have any hints or ideas to get me started on this one? I
have the code to open a Word file but that is it at this time. What is
the best way to read the text from a Word document into a VB program?

Thanks for any ideas.
Chrisso
fumei - 27 Nov 2007 20:48 GMT
CJS/10/1023 - is this a document filename?  Is that what you mean by
"identifiers"?  I'm not quite following.

As for searching the text content of a Word document, there are a number of
options.  Once the document is open:

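Dim r As Range
Set r = ActiveDocument.Range
With r.Find
    .ClearFormatting
    ' wdFindStop ends the search at the end of the document instead of wrapping
    Do While .Execute(FindText:="CJS", Forward:=True, Wrap:=wdFindStop) = True
        ' r now covers "CJS"; extend it 8 characters to take in "/10/1023"
        r.MoveEnd Unit:=wdCharacter, Count:=8
        ' do whatever (r.Text holds the identifier)
        r.Collapse Direction:=wdCollapseEnd
    Loop
End With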

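This starts at the beginning of the document, finds "CJS", and expands the
range 8 characters to the right,

e.g. it finds "CJS", then makes it CJS + 8 more = "CJS/10/1023".

Do whatever you want to do with that instance, then collapse the range to the
end of "CJS/10/1023" and go on to the next "CJS".

Note that the fixed count of 8 assumes every identifier has a 2-digit number
followed by a 4-digit number, as in your examples.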
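
A minimal sketch of an alternative that does not depend on how many digits each number has, assuming Word's wildcard search is acceptable: let Find match the whole identifier in one go. The Sub name ListCjsReferences is made up for illustration, and under some regional settings the separator inside {1,} may need to be a semicolon instead of a comma, so it is worth trying on a sample document first.

```vba
Sub ListCjsReferences()
    ' Hypothetical helper: prints every CJS/<number>/<number> found in the
    ' active document to the Immediate window.
    Dim r As Range
    Set r = ActiveDocument.Range

    With r.Find
        .ClearFormatting
        ' Word wildcard pattern: "CJS/", one or more digits, "/", one or more digits
        .Text = "CJS/[0-9]{1,}/[0-9]{1,}"
        .MatchWildcards = True
        .Forward = True
        .Wrap = wdFindStop          ' stop at the end of the document

        Do While .Execute
            ' After a successful Execute, r covers the matched identifier
            Debug.Print r.Text
            ' Continue searching from just after this match
            r.Collapse Direction:=wdCollapseEnd
        Loop
    End With
End Sub
```

Because the range already covers the full identifier after each hit, no MoveEnd call is needed.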

Other than that, you will need to ask more specific questions.
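
On the "read the text in" part of the question: there is no need to read in chunks, since the main body of a document is available as one string through doc.Content.Text. A rough sketch of the whole job under some assumptions: the folder path, the Sub name and the variable names are made up, Excel is driven through late binding, and the VBScript.RegExp object (a regular-expression engine, not Word's own Find) does the scanning.

```vba
Sub ScanFolderForCjsReferences()
    ' Hypothetical folder holding the documents; change to suit
    Const DOC_FOLDER As String = "C:\Docs\"

    Dim re As Object, m As Object       ' VBScript.RegExp and one match
    Dim xl As Object, ws As Object      ' late-bound Excel objects
    Dim doc As Document
    Dim docName As String
    Dim rowNum As Long

    ' Regular expression for CJS/<number>/<number>
    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "CJS/\d+/\d+"
    re.Global = True                    ' return all matches, not just the first

    ' New workbook to receive the results
    Set xl = CreateObject("Excel.Application")
    xl.Visible = True
    Set ws = xl.Workbooks.Add.Worksheets(1)
    rowNum = 1

    docName = Dir(DOC_FOLDER & "*.doc")
    Do While docName <> ""
        Set doc = Documents.Open(FileName:=DOC_FOLDER & docName, _
                                 ReadOnly:=True, AddToRecentFiles:=False)

        ' doc.Content.Text is the main body only (no headers, footers or footnotes)
        For Each m In re.Execute(doc.Content.Text)
            ws.Cells(rowNum, 1).Value = docName   ' document that contains the reference
            ws.Cells(rowNum, 2).Value = m.Value   ' identifier that was found
            rowNum = rowNum + 1
        Next m

        doc.Close SaveChanges:=wdDoNotSaveChanges
        docName = Dir                   ' next file in the folder
    Loop
End Sub
```

Each document will probably list its own identifier as well, so that row would need filtering out afterwards, and Dir only looks at one folder, not its subfolders.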

fumei - 28 Nov 2007 19:24 GMT
By the way, please do not double post.  Thanks.
