MS Office Forum / Word / Programming / July 2008
Parsing multiple lines (or rows) of text
RogerM - 21 Jul 2008 19:00 GMT
I'm having trouble coming to grips with how to parse a multiple-line string of text from a bookmark and am hopeful that someone can provide me with assistance. I need to capture as string variables two separate text strings from the first line of text and process those two variables. Then I need to check whether the bookmark contains another line of text and, if it does, perform the parsing and processing routine again, continuing until all the lines in the bookmark have been processed.
I can parse the first line of text in the bookmark with the following code, and that provides me with the two variables (strAgency and strNumber) that I need for the next step in the routine.
' Locate the four marker characters in the bookmark text
strKSOR = ActiveDocument.Bookmarks("bKSOR").Range.Text
strAgencyStart = InStr(strKSOR, "<")
strAgencyEnd = InStr(strKSOR, ">")
strNumberStart = InStr(strKSOR, "#")
strNumberEnd = InStr(strKSOR, "!")
' Extract the text between each pair of markers
strAgency = Mid(strKSOR, strAgencyStart + 1, strAgencyEnd - 1 - strAgencyStart)
strNumber = Mid(strKSOR, strNumberStart + 1, strNumberEnd - 1 - strNumberStart)
I'm guessing that maybe the str...Start and str...End variables should be declared as Integers, rather than strings, but I don't know if that makes a difference or not. I've tried using vbCrLf as the End of string label for the first line, but that always produces an error in the procedure.
I could set up the ActiveDocument so that the bookmark ("bKSOR") contains a single one-row, two-column table for each pair of data instead of simply a line of text for each pair. I experimented with that approach but couldn't figure out how to parse text out of a table at all, let alone multiple tables within the same bookmark.
Thanks, Roger
Doug Robbins - Word MVP - 21 Jul 2008 20:47 GMT
What do you want to do with the strAgency and strNumber when you get them? Also, I am assuming that there are multiple agencies and numbers within the range of the bookmark.
I would take a different approach, using the Find function.
Hope this helps.
Please reply to the newsgroup unless you wish to avail yourself of my services on a paid consulting basis.
Doug Robbins - Word MVP
RogerM - 21 Jul 2008 23:17 GMT
Yes, there is the potential for multiple agencies and numbers within the range of the bookmark, but there would only be one agency and one number on each line. The "<" ">" "#" and "!" are just characters that I chose to use within the bookmark to have a consistent label to search for while parsing the text. The agencies and numbers are pulled from a database and merged into a Word .rtf document, and I have some control over how the data is presented in the Word document. The bookmark might also be empty except for the <>#! characters. If the bookmark range contains multiple agencies, it would look like:
<Lansing Police Department> #08-1001!
<Brentwood County Sheriff Office> #08-278!
"What do you want to do with the strAgency and strNumber when you get them? "
If we have a copy of the report stored on our network as a pdf file, then I want to copy that report from one location to a different location. So I run strAgency through a series of "If Then" statements comparing it with the potentially valid strings, and when the code finds a match, I use the matching Const to build a path to the report file. For example:
If strAgency = "Lansing Police Department" Then
    strKSORPath = LNPD & strNumber & "\" & strNumber & ".pdf"
    strKSOR_OldPath = LNPD & strNumber & ".pdf"
End If
Where LNPD is declared as Const LNPD = "s:\Scan\LNPD\". I have to build two path string variables because we used to store all the reports in the agency folder but then changed to storing them in a subfolder, with the report number as the name of the subfolder.
Then I check whether the bookmark contained an agency name. If it's not blank, I check whether we have the pdf file in the source folder, and if it does exist, I check whether the pdf file already exists in the destination folder. If the pdf file doesn't exist in the destination folder, I copy the report from the source folder to the destination folder.
If Not strAgency = "" Then
    If objFSO.FileExists(strKSORPath) Then
        If Not objFSO.FileExists(strDCFKSORPath) Then
            FileCopy strKSORPath, strDCFKSORPath
        End If
    Else
        If objFSO.FileExists(strKSOR_OldPath) Then
            If Not objFSO.FileExists(strDCFKSORPath) Then
                FileCopy strKSOR_OldPath, strDCFKSORPath
            End If
        End If
    End If
End If
I'm sure there is probably a cleaner way to code this, but the way I've approached it seems to get the job done, provided my bookmark only contains one agency and its associated case number.
Roger
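On the "cleaner way" remark: one option is to replace the chain of If ... End If blocks with a single Select Case that returns the agency folder. The following is only a sketch. AgencyFolder and BuildPathsDemo are hypothetical names, and the BCSO folder path is an assumption modeled on the LNPD constant quoted above.

```vba
' Hypothetical helper: maps an agency name to its scan folder.
Function AgencyFolder(ByVal strAgency As String) As String
    Select Case strAgency
        Case "Lansing Police Department"
            AgencyFolder = "s:\Scan\LNPD\"
        Case "Brentwood County Sheriff Office"
            AgencyFolder = "s:\Scan\BCSO\"   ' assumed path
        Case Else
            AgencyFolder = ""                ' unknown or blank agency
    End Select
End Function

' Hypothetical usage: build both candidate source paths from one lookup.
Sub BuildPathsDemo()
    Dim strFolder As String
    Dim strNumber As String
    Dim strKSORPath As String
    Dim strKSOR_OldPath As String

    strNumber = "08-1001"
    strFolder = AgencyFolder("Lansing Police Department")
    If strFolder <> "" Then
        strKSORPath = strFolder & strNumber & "\" & strNumber & ".pdf"
        strKSOR_OldPath = strFolder & strNumber & ".pdf"
        Debug.Print strKSORPath
        Debug.Print strKSOR_OldPath
    End If
End Sub
```

Adding an agency then means adding one Case, and an empty return value doubles as the "no agency" check.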
Doug Robbins - Word MVP - 22 Jul 2008 11:21 GMT
The following code uses the Find function to get each instance of the agency name and the corresponding number located within the bookmark bKSOR. You should be able to insert the code for the rest of your processing before the End If, making use of Agency.Text and Number.Text (and deleting the MsgBox statements, which are only there to demonstrate what the code does).
Dim myrange As Range
Dim Agency As Range
Dim Number As Range
Set myrange = ActiveDocument.Bookmarks("bKSOR").Range
Selection.HomeKey wdStory
Selection.Find.ClearFormatting
With Selection.Find
    Do While .Execute(Findtext:="\<*\>", Forward:=True, _
            MatchWildcards:=True, Wrap:=wdFindStop) = True
        If Selection.Range.Start >= myrange.Start And Selection.Range.End <= myrange.End Then
            Set Agency = Selection.Range
            Set Number = Selection.Range.Duplicate
            Agency.Start = Agency.Start + 1
            Agency.End = Agency.End - 1
            MsgBox Agency.Text
            With Number
                .MoveEndUntil Cset:="!", Count:=wdForward
                .MoveStartUntil Cset:="#", Count:=wdForward
                .Start = .Start + 1
            End With
            MsgBox Number.Text
        End If
    Loop
End With
Having said that, I am not sure why you are not working directly with the data from the database rather than going through the .RTF document/Word route.
RogerM - 22 Jul 2008 14:39 GMT
Thank you Doug, that does get me most of the way to my goal. I can now check the contents of Agency.Text against the list of possible matches and successfully build the file path string to the location of the report.pdf file. However, when I try to execute the actual FileCopy portion of the routine, I keep getting a compile error indicating that I'm trying to Loop without Do. Here's the code I'm trying to use:
Dim myrange As Range
Dim Agency As Range
Dim Number As Range
Dim strKSORPath As String
Dim strKSOR_OldPath As String

Const LNPD = "S:\Scan\LNPD\"
Const BCSO = "S:\Scan\BCSO\"
Const HPD = "S:\Scan\HPD\"

Const ForReading = 1
Const ForWriting = 2

Set objFSO = CreateObject("Scripting.FileSystemObject")

strCaseNumber = ActiveDocument.Bookmarks("bCaseFileNumber").Range.Text

Set myrange = ActiveDocument.Bookmarks("bKSOR").Range
Selection.HomeKey wdStory
Selection.Find.ClearFormatting
With Selection.Find
    Do While .Execute(Findtext:="\<*\>", Forward:=True, _
            MatchWildcards:=True, Wrap:=wdFindStop) = True
        If Selection.Range.Start >= myrange.Start And Selection.Range.End <= myrange.End Then
            Set Agency = Selection.Range
            Set Number = Selection.Range.Duplicate
            Agency.Start = Agency.Start + 1
            Agency.End = Agency.End - 1

            With Number
                .MoveEndUntil Cset:="!", Count:=wdForward
                .MoveStartUntil Cset:="#", Count:=wdForward
                .Start = .Start + 1
            End With

            strDCFKSORPath = "y:\" & strCaseNumber & "\KSOR\" & Number.Text & ".pdf"

            If Agency.Text = "Lansing Police Department" Then
                strKSORPath = LNPD & Number.Text & "\" & Number.Text & ".pdf"
                strKSOR_OldPath = LNPD & Number.Text & ".pdf"
            End If
            If Agency.Text = "Brentwood County Sheriff Office" Then
                strKSORPath = BCSO & Number.Text & "\" & Number.Text & ".pdf"
                strKSOR_OldPath = BCSO & Number.Text & ".pdf"
            End If
            If Agency.Text = "Hometown Police Department" Then
                strKSORPath = HPD & Number.Text & "\" & Number.Text & ".pdf"
                strKSOR_OldPath = HPD & Number.Text & ".pdf"
            End If

            If Not objFSO.FileExists(strDCFKSORPath) Then
                If objFSO.FileExists(strKSORPath) Then
                    FileCopy strKSORPath, strDCFKSORPath
                Else
                    If objFSO.FileExists(strKSOR_OldPath) Then
                        FileCopy strKSOR_OldPath, strDCFKSORPath
                    End If
                End If
            MsgBox strKSORPath
            MsgBox strKSOR_OldPath

        End If
    Loop
End With
I know it's this section of the code that is causing problems:
If Not objFSO.FileExists(strDCFKSORPath) Then
    If objFSO.FileExists(strKSORPath) Then
        FileCopy strKSORPath, strDCFKSORPath
    Else
        If objFSO.FileExists(strKSOR_OldPath) Then
            FileCopy strKSOR_OldPath, strDCFKSORPath
        End If
    End If
It seems like that procedure should be the last thing the code does before looping, but I don't know why it's causing a compile error.
You asked: "I am not sure why you are not working directly with the data from the database rather than going through the .RTF document/Word route."
The honest answer is, I don't know how to do so.
Doug Robbins - Word MVP - 22 Jul 2008 22:51 GMT
You had a missing End If in the following:
If Not objFSO.FileExists(strDCFKSORPath) Then
    If objFSO.FileExists(strKSORPath) Then
        FileCopy strKSORPath, strDCFKSORPath
    End If 'This End If was missing
Else
    If objFSO.FileExists(strKSOR_OldPath) Then
        FileCopy strKSOR_OldPath, strDCFKSORPath
    End If
End If
I am assuming that the Else belongs to the If Not objFSO.FileExists(strDCFKSORPath) Then .. End If
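With the Else attached to the outer If, the old-path copy runs only when the destination file already exists. Going by Roger's earlier description (copy only when the destination file is absent, trying the subfolder location first and then the old location), the Else may instead have been meant for the inner If, with the missing End If closing the outer one. The sketch below shows that reading; it is an interpretation, not something confirmed in the thread, and CopyReportIfMissing is a hypothetical name.

```vba
' Sketch only: copies the report to the destination unless it is already there,
' preferring the newer subfolder location and falling back to the old one.
Sub CopyReportIfMissing(ByVal strKSORPath As String, _
                        ByVal strKSOR_OldPath As String, _
                        ByVal strDCFKSORPath As String)
    Dim objFSO As Object
    Set objFSO = CreateObject("Scripting.FileSystemObject")

    If Not objFSO.FileExists(strDCFKSORPath) Then
        If objFSO.FileExists(strKSORPath) Then
            FileCopy strKSORPath, strDCFKSORPath
        ElseIf objFSO.FileExists(strKSOR_OldPath) Then
            FileCopy strKSOR_OldPath, strDCFKSORPath
        End If
    End If   ' closes the outer If
End Sub
```

Either version compiles; they differ only in what happens when the destination file already exists.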
RogerM - 22 Jul 2008 23:23 GMT
I looked that over numerous times and never could see it, even with the code indented in the editor. Thank you.
RogerM - 29 Jul 2008 16:46 GMT
Doug, I've run into a problem with the code you suggested. The first Agency Name/Number pair isn't processed, but if the bookmark contains multiple Agency/Number pairs, then all the agencies after the first one are processed. So, if "bKSOR" contains the following:
<Hometown Police Department> #08-7333!
The code won't pick up the agency/number at all. If "bKSOR" contains:
<Hometown Police Department> #08-7333!
<Hometown Police Department> #08-7344!
Then the code will process the second line and copy the .pdf file for 08-7344.
I don't believe I changed anything of significance to the code sample you supplied, but here is what I'm using to obtain the Agency.Text and Number.Text values.
Set myrange = ActiveDocument.Bookmarks("bKSOR").Range
Selection.HomeKey wdStory
Selection.Find.ClearFormatting
With Selection.Find
    Do While .Execute(Findtext:="\<*\>", Forward:=True, _
            MatchWildcards:=True, Wrap:=wdFindStop) = True
        If Selection.Range.Start >= myrange.Start And Selection.Range.End <= myrange.End Then
            Set Agency = Selection.Range
            Set Number = Selection.Range.Duplicate
            Agency.Start = Agency.Start + 1
            Agency.End = Agency.End - 1
            With Number
                .MoveEndUntil Cset:="!", Count:=wdForward
                .MoveStartUntil Cset:="#", Count:=wdForward
                .Start = .Start + 1
            End With
Doug Robbins - Word MVP - 30 Jul 2008 20:30 GMT
I cannot replicate that using the following code:
Dim myrange As Range
Dim Agency As Range
Dim Number As Range
Set myrange = ActiveDocument.Bookmarks("bKSOR").Range
Selection.HomeKey wdStory
Selection.Find.ClearFormatting
With Selection.Find
    Do While .Execute(Findtext:="\<*\>", Forward:=True, _
            MatchWildcards:=True, Wrap:=wdFindStop) = True
        If Selection.Range.Start >= myrange.Start And Selection.Range.End <= myrange.End Then
            Set Agency = Selection.Range
            Set Number = Selection.Range.Duplicate
            Agency.Start = Agency.Start + 1
            Agency.End = Agency.End - 1
            MsgBox Agency.Text
            With Number
                .MoveEndUntil Cset:="!", Count:=wdForward
                .MoveStartUntil Cset:="#", Count:=wdForward
                .Start = .Start + 1
            End With
            MsgBox Number.Text
        End If
    Loop
End With
In a document containing among other things a bookmark "bKSOR" that contains the following:
<Hometown Police Department> #08-7333!
The message box first displays Hometown Police Department and then it displays 08-7333
If the bookmark contains
<Hometown Police Department> #08-7333!¶ <Hometown Police Department> #08-7344!
the code selects the first <Hometown Police Department> and the message box first displays Hometown Police Department and then it displays 08-7333. Then it selects the second <Hometown Police Department> and the message box first displays Hometown Police Department and then it displays 08-7344
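The thread does not settle why the first pair was skipped in Roger's document. One guess, offered tentatively: because the search starts at the top of the document, a "<" that occurs before the bookmark may begin a match that starts outside the bookmark and therefore fails the Start check. A variant that starts the search inside the bookmark, using Range.Find instead of Selection.Find, is sketched below. It is untested against Roger's document, and ListPairsInBookmark, rngBookmark and rngFound are hypothetical names.

```vba
Sub ListPairsInBookmark()
    Dim rngBookmark As Range   ' fixed copy of the bookmark's extent
    Dim rngFound As Range      ' redefined to each match by Find
    Dim Agency As Range
    Dim Number As Range

    Set rngBookmark = ActiveDocument.Bookmarks("bKSOR").Range
    Set rngFound = rngBookmark.Duplicate

    With rngFound.Find
        .ClearFormatting
        .Text = "\<*\>"
        .MatchWildcards = True
        .Forward = True
        .Wrap = wdFindStop
        Do While .Execute
            ' Stop as soon as a match falls outside the bookmark.
            If Not rngFound.InRange(rngBookmark) Then Exit Do

            ' Agency: the match without its angle brackets
            Set Agency = rngFound.Duplicate
            Agency.MoveStart wdCharacter, 1
            Agency.MoveEnd wdCharacter, -1

            ' Number: the text between the next "#" and "!"
            Set Number = rngFound.Duplicate
            Number.Collapse wdCollapseEnd
            Number.MoveEndUntil Cset:="!", Count:=wdForward
            Number.MoveStartUntil Cset:="#", Count:=wdForward
            Number.MoveStart wdCharacter, 1

            Debug.Print Agency.Text, Number.Text

            ' Continue searching after the current match.
            rngFound.Collapse wdCollapseEnd
        Loop
    End With
End Sub
```

Working with a Range also leaves the user's selection where it was, which Selection.HomeKey does not.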
StevenM - 21 Jul 2008 21:12 GMT
To: RogerM,
There are many different ways of doing this depending on your particular needs.
For example, if you're parsing a string such as: "xyz<Mr. X>#345!uiv" where there is a name inside <> and a number inside #! you can use the following method.
Sub Testthis()
    Dim strKSOR As String
    Dim strAgency As String
    Dim strNumber As String
    strKSOR = "xyz<Mr. X>#345!uiv"
    strAgency = Right(strKSOR, Len(strKSOR) - InStr(strKSOR, "<"))
    strAgency = Left(strAgency, InStr(strAgency, ">") - 1)
    strNumber = Right(strKSOR, Len(strKSOR) - InStr(strKSOR, "#"))
    strNumber = Left(strNumber, InStr(strNumber, "!") - 1)
    MsgBox strAgency & vbCr & strNumber
End Sub
I'm fairly sure that if you gave us a better idea of what the string strKSOR might look like, you would find you would receive better answers to your question. Also, in the above example, I assumed that there would be something in the strKSOR string and so didn't do any checking to make sure that the string contained the characters <>!#. And in your example, your End & Start variables should be integers or (better yet) longs, as you assumed. But using the Right and Left functions as I did, I didn't need to use a Start or End variable.
Steven Craig Miller
RogerM - 21 Jul 2008 23:20 GMT
Steven, thanks for the suggestion. I'll work with that some, but I did post a more detailed explanation of what my data looks like and what I want to do with it in my reply to Doug's post.
Roger
StevenM - 22 Jul 2008 12:39 GMT
To: RogerM,
First, let me say that you might have been able to simplify things if you had used only an octothorpe/pound sign to separate the name from the number. So instead of:
<Lansing Police Department> #08-1001!
You could have had just: Lansing Police Department #08-1001
The other markers seem redundant, but then I don't have the details of the rest of your project. Nonetheless, I have two examples for you. In the first example, I simply parse the string: "<Name> #Number!"
Sub Testthis001()
    Dim strKSOR As String
    Dim strAgency As String
    Dim strNumber As String
    strKSOR = "<Name> #Number!"
    If Len(strKSOR) > 4 Then
        strAgency = Mid(strKSOR, 2, InStr(strKSOR, ">") - 2)
        strNumber = Right(strKSOR, Len(strKSOR) - InStr(strKSOR, "#"))
        strNumber = Left(strNumber, Len(strNumber) - 1)
    End If
    MsgBox strAgency & vbCr & strNumber
End Sub
In the second example, I assume that you have a series of Name/Number pairs, each on a separate line. In Word, each line is normally separated by a paragraph mark, represented below as "vbCr". I only present two Name/Number pairs, but the same code would work with any number of such pairs.
Sub Testthis002()
    Dim strKSOR As String
    Dim vKSOR As Variant
    Dim strAgency As String
    Dim strNumber As String
    Dim i As Long
    strKSOR = "<Name1> #Number1!" & vbCr & "<Name2> #Number2!"
    vKSOR = Split(strKSOR, vbCr)
    For i = LBound(vKSOR) To UBound(vKSOR)
        If Len(vKSOR(i)) > 4 Then
            strAgency = Mid(vKSOR(i), 2, InStr(vKSOR(i), ">") - 2)
            strNumber = Right(vKSOR(i), Len(vKSOR(i)) - InStr(vKSOR(i), "#"))
            strNumber = Left(strNumber, Len(strNumber) - 1)
        End If
        MsgBox strAgency & vbCr & strNumber
    Next i
End Sub
Steven Craig Miller
RogerM - 22 Jul 2008 15:22 GMT
Steven, when I run this code against the actual contents of my bookmark text, I get two MsgBox prompts for the last Agency Name/Number pair. I haven't had a chance to incorporate the rest of my routine to see how it works. I don't think it will cause a problem, but why does the code process the last Agency Name/Number pair twice?
I can set up the document without the ! sign at the end. I was having trouble locating the end of each line so I used the ! sign, however, I guess before I was using vbCrLf instead of just vbCr and as a result kept getting runtime errors when testing my previous code.
StevenM - 22 Jul 2008 16:15 GMT
To: RogerM,
Re: why does the code process the last Agency Name/Number pair twice.
Without being able to see your code, I cannot say for sure, but I'll make a guess.
With an extra paragraph mark at the end of your strKSOR string, the split function would assume that there is an element after the paragraph mark.
vKSOR = Split(strKSOR, vbCr)
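To make the guess above concrete: when the string ends with the delimiter, Split returns one extra, empty element. In Testthis002 the MsgBox sits outside the If, so for that empty element the If is skipped and the MsgBox shows the values left over from the previous pass. A small sketch of the Split behavior (SplitTrailingDemo is a hypothetical name):

```vba
Sub SplitTrailingDemo()
    Dim v As Variant
    Dim i As Long

    ' Two pairs followed by a trailing paragraph mark
    v = Split("<Name1> #Number1!" & vbCr & "<Name2> #Number2!" & vbCr, vbCr)

    ' The array has three elements; the last one is an empty string.
    For i = LBound(v) To UBound(v)
        Debug.Print "Element " & i & " has length " & Len(v(i))
    Next i
End Sub
```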
There are a number of ways of correcting this problem. For example, in the example I gave you, I should have put the line:
MsgBox strAgency & vbCr & strNumber
inside the If ... End If block. That would be one way of correcting the problem.
But in addition, it also makes sense to remove any paragraph mark at the end of the string. Thus:
Sub Testthis002()
    Dim strKSOR As String
    Dim vKSOR As Variant
    Dim strAgency As String
    Dim strNumber As String
    Dim i As Long
    strKSOR = "<Name1> #Number1!" & vbCr & "<Name2> #Number2!" & vbCr
    If Right(strKSOR, 1) = vbCr Then strKSOR = Left(strKSOR, Len(strKSOR) - 1)
    vKSOR = Split(strKSOR, vbCr)
    For i = LBound(vKSOR) To UBound(vKSOR)
        If Len(vKSOR(i)) > 4 Then
            strAgency = Mid(vKSOR(i), 2, InStr(vKSOR(i), ">") - 2)
            strNumber = Right(vKSOR(i), Len(vKSOR(i)) - InStr(vKSOR(i), "#"))
            strNumber = Left(strNumber, Len(strNumber) - 1)
            MsgBox strAgency & vbCr & strNumber
        End If
    Next i
End Sub
The line:
If Right(strKSOR, 1) = vbCr Then strKSOR = Left(strKSOR, Len(strKSOR) - 1)
removes a paragraph mark at the end of strKSOR, and this means you won't have one element too many in vKSOR after the split.
Also, the line:
If Len(vKSOR(i)) > 4 Then
allows you to have strings which contain nothing more than "<>#!", and you can avoid processing them since they are empty of any real value.
Steven Craig Miller
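Putting the pieces together with the bookmark from the original question, a minimal sketch might read the bookmark text, split it on paragraph marks, and check each line for all four markers with InStr before extracting anything. This is an illustration rather than code from the thread: ParseBookmarkLines is a hypothetical name, and it assumes the lines in "bKSOR" are separated by paragraph marks (vbCr) rather than manual line breaks.

```vba
Sub ParseBookmarkLines()
    Dim strKSOR As String
    Dim vKSOR As Variant
    Dim strLine As String
    Dim lngLt As Long, lngGt As Long, lngHash As Long, lngBang As Long
    Dim i As Long

    strKSOR = ActiveDocument.Bookmarks("bKSOR").Range.Text
    vKSOR = Split(strKSOR, vbCr)

    For i = LBound(vKSOR) To UBound(vKSOR)
        strLine = vKSOR(i)
        lngLt = InStr(strLine, "<")
        lngGt = InStr(strLine, ">")
        lngHash = InStr(strLine, "#")
        lngBang = InStr(strLine, "!")
        ' Process the line only if all four markers are present, in order,
        ' with at least one character inside each pair. Empty lines and
        ' bare "<> #!" placeholders are skipped.
        If lngLt > 0 And lngGt > lngLt + 1 And _
           lngHash > lngGt And lngBang > lngHash + 1 Then
            Debug.Print Mid(strLine, lngLt + 1, lngGt - lngLt - 1), _
                        Mid(strLine, lngHash + 1, lngBang - lngHash - 1)
        End If
    Next i
End Sub
```

Checking marker positions instead of the line length also copes with stray text before the "<", which the Mid(vKSOR(i), 2, ...) form does not.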