MS Office Forum / Word / Programming / August 2006

RegEx Range.Text object corrupting embedded footnotes

jalanford - 08 Aug 2006 14:03 GMT
I have a macro which processes text using regular expressions (to find run-in
heads in paragraphs). It seems that regular expressions will only work on the
Range.Text property and not on Range.FormattedText. I thought I had most of the
kinks worked out of this but found a new wrinkle yesterday: if the text of a
paragraph is altered by a RegEx, then the embedded footnotes in that paragraph
are broken. I have a sense that this is because Range.Text carries only plain
text, so the footnote information isn't being passed back to the document when
a replacement is made (duh). Before I sit down and write a macro that cuts out
the embedded footnotes and replaces them with a marker before the RegEx is run,
and then reverses the replacement after the RegEx is run, I thought I would
check with the experts and see if there are any better (and easier) approaches
out there.
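
For illustration only, here is a rough sketch of the marker swap described above, written in Python rather than VBA so the string logic can be run on its own. Everything in it is an assumption made for the example: the Footnote class, the flatten and restore helpers, the private-use marker characters, and the <i> and <head> tags are hypothetical and are not taken from the actual macros. A real Word macro would still have to cut the footnotes out of the document and re-insert them.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Footnote:
    """Hypothetical stand-in for an embedded footnote object."""
    body: str


# Private Use Area characters, assumed never to occur in the real text.
OPEN, CLOSE = "\uE000", "\uE001"
MARKER_RE = re.compile(f"{OPEN}(\\d+){CLOSE}")


def flatten(parts):
    """Replace each footnote with a numbered marker; return text and stash."""
    stash, out = [], []
    for part in parts:
        if isinstance(part, Footnote):
            out.append(f"{OPEN}{len(stash)}{CLOSE}")
            stash.append(part)
        else:
            out.append(part)
    return "".join(out), stash


def restore(text, stash):
    """Split the text at the markers and put the stashed footnotes back."""
    parts, pos = [], 0
    for match in MARKER_RE.finditer(text):
        if match.start() > pos:
            parts.append(text[pos:match.start()])
        parts.append(stash[int(match.group(1))])
        pos = match.end()
    if pos < len(text):
        parts.append(text[pos:])
    return parts


paragraph = ["<i>Run-in head.</i>", Footnote("See ch. 2."), " Body text follows."]
flat, stash = flatten(paragraph)
# The regex pass only ever sees plain text plus opaque markers.
flat = re.sub(r"^<i>([^<]+)</i>", r"<head>\1</head>", flat)
result = restore(flat, stash)
```

The one thing the regex has to respect is the marker itself: a pattern that could match or delete the marker characters would lose the footnote, which is why the sketch uses characters assumed to be absent from the documents.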

I have tried replacing the RegEx searches with S/R using Word's wildcard
search options but can't achieve the desired search complexity with them. I
always run into the "expression is too complex" error. As a side question, are
the limitations of Word's wildcard search listed anywhere? The one major one
I've encountered is the limitation to 7 groups instead of the regular 9.
There seem to be others but I can't recall them right now.

Anyway, any suggestions would be appreciated.

Thanks,

Jeff
Russ - 09 Aug 2006 09:33 GMT
Jeff,
Could you give an example of your too complex Word find and replace
expressions?
Word's Find and Replace can also search for formatting at the same time as
literal text and/or wildcards. You may be trying to jam all your criteria
into one search when multiple separate searches would work as well within
the limits of Word. I agree that Regular Expressions are generally more
powerful than Word's wildcards.
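
As a rough illustration of the "multiple separate searches" idea, here is a small Python sketch (not Word wildcard syntax) in which two simple passes normalize the text so that a later search only has to handle one form. The tag <i>, the sample strings, and the run_passes helper are assumptions made for the example and do not come from the actual documents.

```python
import re

DASH = "\u2014"  # the single dash form every variant is normalized to

# Each pass is deliberately simple; the order matters.
PASSES = [
    # Pass 1: collapse any run of en dash, em dash, or horizontal bar.
    (re.compile("[\u2013\u2014\u2015]+"), DASH),
    # Pass 2: move a dash that sits inside a closing tag to just after it.
    (re.compile(DASH + r"(</[a-z]+>)"), r"\1" + DASH),
]


def run_passes(text):
    """Apply every (pattern, replacement) pair in sequence."""
    for pattern, replacement in PASSES:
        text = pattern.sub(replacement, text)
    return text


variants = [
    "<i>Head.</i>\u2013Body",
    "<i>Head.\u2015</i>Body",
    "<i>Head.</i>\u2014\u2014Body",
]
normalized = {run_passes(v) for v in variants}
```

All three variants end up as the same string, so whatever search runs next needs one alternative instead of three. Whether the same split is practical within the limits of Word's wildcards is a separate question.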


Signature

Russ

drsmN0SPAMikleAThotmailD0Tcom.INVALID

jalanford - 09 Aug 2006 14:11 GMT
Russ,

Because of other issues with formatting being lost or changed (we're trying
to apply a new paragraph-level style template to pre-existing documents--yes,
I know that is rife with problems, but we don't create the documents here),
the formatting is converted to tagging (text) near the beginning of
processing and then removed. This pretty much gives us a flat document to
work with. So, since the formatting is present in the document only as tags,
Word's wildcard search falls short because of the tag-capturing
possibilities it has to handle and the need to use those tags as decision
points. The macro which performs the particular RegEx search in question is
just one in a suite of over seventy macros that get run as a part of this
process.

Here's an example of the kind of RegEx I'm talking about:

re.Pattern = "^([<][-italcbodsunerp/<>3]+[>]) ?([^.<]+)([.])([^.<]+)([.]?[<][-italcbodsunerp/<>]+[>][\u2013\u2014\u2015]+|[.]?[\u2013\u2014\u2015]+[<][-italcbodsunerp/<>]+[>]|[<][-italcbodsunerp/<>]+[>][.]?[\u2013\u2014\u2015]+|[.]?[<][-italcbodsunerp/<>]+[>]—|[.]?—[<][-italcbodsunerp/<>]+[>]|[<][-italcbodsunerp/<>]+[>][.]?—|[.]?[<][-italcbodsunerp/<>]+[>] [-] |[.]? [<][-italcbodsunerp/<>]+[>][-] |[.]? [-][<][-italcbodsunerp/<>]+[>] )([<a-zA-Z0-9])"

Actually, this is a fairly simple one as it is not using nested,
non-capturing groups, look ahead or look back (none of which are possible
using Word's Wildcard S/R). This is only one of several RegEx that are run to
do the same thing. So, I'm not really trying to cram everything into one
expression ;), just into 5 to 10 :)
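
For anyone trying to read the pattern above, here is a hedged Python sketch that rebuilds a reduced version of it from named pieces. It covers only the first three separator alternatives, and the tag name <ital> in the sample is a guess made for the example. The character classes are copied from the pattern as posted. The same composition could be done with string concatenation in VBA, but that is not shown here.

```python
import re

# Pieces copied from the posted pattern.
OPEN_TAG = r"[<][-italcbodsunerp/<>3]+[>]"   # leading tag (class includes 3)
TAG = r"[<][-italcbodsunerp/<>]+[>]"         # tag next to the separator
DASHES = "[\u2013\u2014\u2015]+"             # en dash, em dash, horizontal bar

# First three alternatives only: the dash after, before, or split by the tag.
SEPARATORS = [
    rf"[.]?{TAG}{DASHES}",
    rf"[.]?{DASHES}{TAG}",
    rf"{TAG}[.]?{DASHES}",
]
SEPARATOR = "|".join(SEPARATORS)

pattern = re.compile(
    rf"^({OPEN_TAG}) ?([^.<]+)([.])([^.<]+)({SEPARATOR})([<a-zA-Z0-9])"
)

sample = "<ital>Sec. 2.</ital>\u2014Text of the paragraph"  # hypothetical input
match = pattern.match(sample)
```

Naming the repeated tag sub-pattern once makes it easier to see that the alternatives differ only in where the period and the dash sit relative to the closing tag.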

Converting the above to multiple Word wildcard searches (which I use
whenever possible--I guess "possible" is what is in question here ;)  ) just
doesn't seem feasible to me. While it could probably be done in this case,
it seems to me that it would be faster to write macros which convert the
footnotes to flat text and then re-embed them after the RegEx has run. (I've
already solved the symbol conversion problem.)

Thanks,

Jeff

jalanford - 10 Aug 2006 13:28 GMT
Never mind. Got the footnote macros up and running. Extracting the footnotes
with markup is tricky. :)

Jeff

Russ - 11 Aug 2006 16:30 GMT
Jeff,
A better specialty filtering tool for pre-existing markup-language files
might be http://www.powergrep.com/, a Windows grep-style tool that brings
the power of regular expressions to searching and replacing across files.



 