MS Office Forum / Word / Programming / January 2005

Regex & Wildcards

Vince - 06 Jan 2005 04:30 GMT
Hey,

I need to find the following by matching Wild Cards.

1.1 mol/L
1 mol/L
1mol/L
1.1 mol /L
1 mol /L
1mol /L
1.1 mol / L
1 mol / L
1mol / L
1.1 mol/ L
1 mol/ L
1mol/ L

A sentence could contain any one of these. For instance, "James drank a solution
of Nitrogen Peroxide with a concentration of 5.15 mol/L".
This is what I could come up with:

([0-9.]@)( @)(mol/L)

Takes care of any numerals / decimals but does not account for:
a) The space between the number and mol/L (It looks for one space or more
but there is a possibility that a space might not exist like 1.1mol/L)
b) It strictly looks for mol/L and can't account for mol / L, mol/ L or mol
/L. In order to use this, I would have to repeat each instance with
appropriate spaces!

Questions:
1) How do I write a single Wildcard match for all the possibilities listed
above?
2) How can I say "Optional" in Regex? E.g., Di[peg] could be any one of "Dig",
"Dip" or "Die". But I need to say that "Di" may or may not be followed by
"p", "e" or "g". In Perl, I would say "(Di)([epg])*". How do I say that in
VBA?

Thanks a lot for your time / any response.

Vince
Helmut Weber - 06 Jan 2005 11:19 GMT
Hi Vince,
before putting much effort into something
that is hardly possible (wildcard search
cannot match zero or more occurrences),
why not adjust the text beforehand, like this:

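Sub Test777()
' Normalizes every spacing variant of "mol/L" to " mol/L".
ResetSearch
Dim rDcm As Range
Set rDcm = ActiveDocument.Range
With rDcm.Find
  ' 1. Remove any spaces between "mol" and "/".
  .Text = "mol[ ]{1,}/"
  .Replacement.Text = "mol/"
  .MatchWildcards = True
  .Execute Replace:=wdReplaceAll
  ' 2. Remove any spaces between "/" and "L".
  .Text = "mol/[ ]{1,}L"
  .Replacement.Text = "mol/L"
  .MatchWildcards = True
  .Execute Replace:=wdReplaceAll
  ' 3. Put a space before every "mol/L" (plain search), so that "1mol/L"
  '    is covered. Searching for "mol/L" rather than "mol" leaves words
  '    such as "molecule" untouched.
  .Text = "mol/L"
  .Replacement.Text = " mol/L"
  .MatchWildcards = False
  .Execute Replace:=wdReplaceAll
  ' 4. Collapse one or more spaces before "mol/L" into a single space.
  .Text = "[ ]{1,}mol/L"
  .Replacement.Text = " mol/L"
  .MatchWildcards = True
  .Execute Replace:=wdReplaceAll
End With
End Sub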
'---
Public Sub ResetSearch()
With Selection.Find
  .ClearFormatting
  .Replacement.ClearFormatting
  .Text = ""
  .Replacement.Text = ""
  .Forward = True
  .Wrap = wdFindContinue
  .Format = False
  .MatchCase = False
  .MatchWholeWord = False
  .MatchWildcards = False
  .MatchSoundsLike = False
  .MatchAllWordForms = False
  .Execute
End With
End Sub

HTH

Greetings from Bavaria, Germany
Helmut Weber, MVP
"red.sys" & chr(64) & "t-online.de"
Word XP, Win 98
http://word.mvps.org/
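
Where wildcard search cannot express "zero or more", one alternative is a true regular expression through the VBScript.RegExp object, in which " *" means zero or more spaces and "?" marks an optional part. The following is only a sketch, assuming that VBScript.RegExp is available on the machine and that listing the matches in the Immediate window is enough; the procedure name is made up for illustration. It reads the plain text of the document and does not replace anything, so formatting is not touched.

```vba
' Sketch only: uses the VBScript.RegExp COM object (late bound),
' not Word's own wildcard search. ListMolPerLitre is a hypothetical name.
Sub ListMolPerLitre()
    Dim re As Object
    Dim m As Object

    Set re = CreateObject("VBScript.RegExp")
    re.Global = True          ' find every match, not just the first
    re.IgnoreCase = False     ' "mol/L" is case sensitive

    ' Number with an optional decimal part, then "mol", "/" and "L",
    ' each separated by zero or more spaces.
    re.Pattern = "(\d+(?:\.\d+)?) *mol */ *L"

    For Each m In re.Execute(ActiveDocument.Range.Text)
        ' m.Value is the text as found; SubMatches(0) is the number.
        Debug.Print m.Value & " -> " & m.SubMatches(0) & " mol/L"
    Next m
End Sub
```

For the "Di" question, a pattern such as "Di[epg]?" would match "Di" followed by at most one of "e", "p" or "g".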
Vince - 07 Jan 2005 02:13 GMT
Hey Helmut,

Thanks for your response.

I wanted to save effort by coming up with a text file that contains all the
find and replace conditions. At the risk of boring you, please allow me to
explain.

Problem: I am trying to copy-edit Word files, and part of the long list of
copy-editing rules involves separating numerals and units in the format
"numeral thin space unit". So, I copied a huge list of units from the
internet and wrote a function that reads from a text file and does the find
and replace automatically. For instance, the text file could be:

([0-9.]@)( @)(mol/L)SPLIT\1^s\3SPLITTRUESPLITTRUE
([0-9.]@)( @)(m/s)SPLIT\1^s\3SPLITTRUESPLITTRUE

Each line tells the program to find the part before the first SPLIT, replace
it with the part before the second SPLIT, match wildcards, and be case
sensitive.

Basically, I wanted this text file to be edited by the user so that they can
add their own units that I missed. But, the problem or rather, the
inconvenience is that they need to type all possibilities into the file. For
instance, the above would be:

([0-9.]@)( @)(mol/L)SPLIT\1^s\3SPLITTRUESPLITTRUE
([0-9.]@)( @)(m/s)SPLIT\1^s\3SPLITTRUESPLITTRUE
([0-9.]@)( @)(mol / L)SPLIT\1^smol/LSPLITTRUESPLITTRUE
([0-9.]@)( @)(m / s)SPLIT\1^sm/sSPLITTRUESPLITTRUE
([0-9.]@)( @)(mol /L)SPLIT\1^smol/LSPLITTRUESPLITTRUE
([0-9.]@)( @)(m /s)SPLIT\1^sm/sSPLITTRUESPLITTRUE
([0-9.]@)( @)(mol/ L)SPLIT\1^smol/LSPLITTRUESPLITTRUE
([0-9.]@)( @)(m/ s)SPLIT\1^sm/sSPLITTRUESPLITTRUE

These two units alone multiply to 8 lines! This could slow down the program
(I don't really mind that...), but the main problem is that the text file could
become too big in the long run. This is why I was wondering if I
could somehow accommodate the possibilities in the text file to begin with
(using some wildcard search).

What I could do, however, is use your method so that the program (when
reading from the file) also makes room for the possibilities listed above.
If you have a better idea, please let me know.

Thank you for your time.

Vince
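
The rule-file format described in this post could be applied along the following lines. This is a minimal sketch, not the function actually used: ApplyRuleLine is a hypothetical name, and it assumes each line holds exactly four fields separated by the literal text SPLIT, with no trailing comment.

```vba
' Hypothetical helper: applies one rule line of the form
'   findText SPLIT replaceText SPLIT matchWildcards SPLIT matchCase
' Assumes the line carries no trailing comment.
Sub ApplyRuleLine(ByVal ruleLine As String)
    Dim parts() As String
    parts = Split(ruleLine, "SPLIT")

    ' Skip malformed lines instead of raising an error.
    If UBound(parts) < 3 Then Exit Sub

    With ActiveDocument.Range.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Text = parts(0)
        .Replacement.Text = parts(1)
        .MatchCase = (UCase$(Trim$(parts(3))) = "TRUE")
        .MatchWildcards = (UCase$(Trim$(parts(2))) = "TRUE")
        .Execute Replace:=wdReplaceAll
    End With
End Sub
```

A caller would read the text file line by line and pass each non-empty line to this procedure.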

Helmut Weber - 07 Jan 2005 09:52 GMT
Hi Vince,
not that I understand all, but for things like:

"mol /L", "mol/ L", "mol  / L", "mol /  L"
"m /s", "m / s", "m/ s", "m /s", "m  /  s"

a possible workaround would be
to first replace "/" with " / ", in order to
overcome the limitation that there is no search for
zero or more occurrences of a character.
So we add additional characters first!
After that, each "/" is surrounded by spaces.
Then a wildcard search finds all
occurrences of [ ]{1,}/[ ]{1,}, which
can be replaced by "/", resulting in "mol/L", "m/s".

And there may be more such simple tricks.

HTH
Greetings from Bavaria, Germany
Helmut Weber, MVP
"red.sys" & chr(64) & "t-online.de"
Word 2002, Windows 2000
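
The two-step workaround described above might look roughly like this in a macro. It is a sketch under stated assumptions: NormalizeSlashes is a made-up name, and the macro touches every slash in the document, including ones in dates or paths, which may not be wanted.

```vba
' Sketch of the pad-then-collapse workaround. NormalizeSlashes is a
' hypothetical name; the macro changes EVERY "/" in the document.
Sub NormalizeSlashes()
    Dim rDcm As Range
    Set rDcm = ActiveDocument.Range

    With rDcm.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Forward = True
        .Wrap = wdFindStop
        .Format = False

        ' Step 1 (plain search): surround every slash with spaces,
        ' so that at least one space exists on each side.
        .MatchWildcards = False
        .Text = "/"
        .Replacement.Text = " / "
        .Execute Replace:=wdReplaceAll

        ' Step 2 (wildcard search): collapse one or more spaces
        ' on both sides of the slash to nothing.
        .MatchWildcards = True
        .Text = "[ ]{1,}/[ ]{1,}"
        .Replacement.Text = "/"
        .Execute Replace:=wdReplaceAll
    End With
End Sub
```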
Vince - 07 Jan 2005 10:07 GMT
Hey Helmut,

Thanks, that's a great idea! I just have to find out if adding a space
before and after every slash in the document is acceptable (what if there's
some text that has a '/' and is not a unit). But, I don't think it should be
a problem.....

Thanks, again!

Vince

Helmut Weber - 07 Jan 2005 10:50 GMT
Hi Vince,

just one more word:
depending on how big and how complex your docs
are, and on how much effort is justified,
one could even create a macro that, after
removing all spaces around slashes, highlights all
units as they are defined in a list, and then locates
any "/" that is not highlighted. And there are many more
variations.

Cheers

Helmut Weber
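
One possible shape for such a macro is sketched below. The unit list, the highlight color, and both procedure names are assumptions made for illustration; a real list would presumably come from the text file discussed earlier in the thread.

```vba
' Sketch only: highlights every unit from an assumed list.
Sub HighlightUnits()
    Dim units As Variant
    Dim u As Variant

    units = Array("mol/L", "m/s")                  ' assumed sample list
    Options.DefaultHighlightColorIndex = wdBrightGreen

    For Each u In units
        With ActiveDocument.Range.Find
            .ClearFormatting
            .Replacement.ClearFormatting
            .Text = u
            .Replacement.Text = "^&"               ' keep the found text
            .Replacement.Highlight = True          ' add highlighting only
            .MatchCase = True
            .MatchWildcards = False
            .Format = True
            .Execute Replace:=wdReplaceAll
        End With
    Next u
End Sub

' Sketch only: selects the next "/" that carries no highlighting.
Sub SelectNextPlainSlash()
    With Selection.Find
        .ClearFormatting
        .Text = "/"
        .Highlight = False                         ' only unhighlighted text
        .Format = True
        .Forward = True
        .Wrap = wdFindContinue
        .MatchWildcards = False
        .Execute
    End With
End Sub
```

Running HighlightUnits once and then SelectNextPlainSlash repeatedly would step through the slashes that do not belong to a listed unit.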
Vince - 07 Jan 2005 11:02 GMT
Thanks, Helmut!

Excellent idea. I am coloring everything that comes from the text file green,
which makes the odd ones out easy to detect, as you mentioned.

Thanks, again!

Vince

 