You will get close with this:
http://gregmaxey.mvps.org/Word_Frequency.htm
I haven't tried it, but you might be able to revise the code that deals with
"singleword" to extend the scope of singleword past the "_" character.

Signature
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Greg Maxey - Word MVP
My web site http://gregmaxey.mvps.org
Word MVP web site http://word.mvps.org
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Playing with the scripting dictionary recently, I've written such a macro...
You need to check "Microsoft Scripting Runtime" under "Tools > References" in the VBA editor.
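If setting the reference is not an option, a late-bound dictionary is a common alternative. The following is only a minimal sketch: the Sub name is made up for illustration, and it assumes Windows, where the Scripting runtime is normally available.

```vba
Sub LateBoundDictionaryDemo()
    ' Late binding: no "Microsoft Scripting Runtime" reference is needed,
    ' at the cost of losing IntelliSense and compile-time checks.
    Dim dicWords As Object
    Set dicWords = CreateObject("Scripting.Dictionary")

    dicWords.Add "King", 1
    Debug.Print dicWords.Exists("King"), dicWords.Count
End Sub
```

With this variant, the line "Dim dicWords As New Scripting.Dictionary" in the macro would be replaced by the Dim/Set pair shown here.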
Regards,
Klaus
Sub CountWords()
    Dim sText As String
    sText = ActiveDocument.Content.Text
    Dim dicWords As New Scripting.Dictionary
    Dim vPlaces(0)
    Dim vItem As Variant
    Dim i As Long
    Dim sWord As String, sChar As String
    sWord = ""
    For i = 1 To Len(sText)
        sChar = Mid(sText, i, 1)
        Select Case sChar
            ' delimiters:
            Case ChrW(9), ChrW(10), ChrW(13), ChrW(32), ChrW(34), ".", ",", _
                 "<", ">", "/", ";", _
                 "(", ")", "[", "]", _
                 ":", "#", "+", "*", _
                 "?", "!"
                If Len(sWord) <> 0 Then
                    If dicWords.Exists(key:=sWord) Then
                        ' known word: append this position to its array
                        vItem = dicWords(key:=sWord)
                        ReDim Preserve vItem(UBound(vItem) + 1)
                        vItem(UBound(vItem)) = i
                        dicWords(key:=sWord) = vItem
                    Else
                        ' new word: start a one-element position array
                        vPlaces(0) = i
                        dicWords.Add sWord, vPlaces
                    End If
                    sWord = ""
                End If
            Case Else
            ' Or, if you want to define what characters can appear in words explicitly
            ' (and ignore the rest):
            ' Case "a" To "z", "@", "~", _
            '      "$", "%", "§", "&", "A" To "Z", _
            '      "0" To "9", "'", "-", "_"
                sWord = sWord & sChar
        End Select
    Next i
    ' output: each word and how many times it occurs
    Dim vWord As Variant
    For Each vWord In dicWords.Keys
        Debug.Print vWord, UBound(dicWords.Item(key:=vWord)) + 1
    Next vWord
End Sub
Klaus Linke - 06 Mar 2008 20:21 GMT
BTW, the output does not show the real power of using the scripting
dictionary.
I wrote it that way to store a position with each occurrence of the word
(strictly, the index of the delimiter that follows the word, so subtracting
the word's length gives its start).
The plan was to speed up look-up of words in a large file.
This is demonstrated if you replace the output section at the end with:
Dim vWord As Variant
For Each vWord In dicWords.Keys
    Debug.Print vWord, UBound(dicWords.Item(key:=vWord)) + 1,
    For i = LBound(dicWords.Item(key:=vWord)) To UBound(dicWords.Item(key:=vWord))
        Debug.Print dicWords.Item(vWord)(i) - Len(vWord);
        Debug.Print "/";
    Next i
    Debug.Print
Next vWord
The output:
Word_this_and 1 1 /
that 1 15 /
KK857 1 20 /
9269875 1 26 /
King 2 35 / 46 /
king 1 40 /
Klaus
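The look-up idea mentioned above could be sketched roughly as follows. The helper name PrintPositions and its parameters are assumptions made for illustration; it expects a dictionary filled the way the macro fills it (each item an array of delimiter positions) and a reference to "Microsoft Scripting Runtime".

```vba
' Hypothetical helper: list the start positions recorded for one word
' without rescanning the document text.
Sub PrintPositions(dicWords As Scripting.Dictionary, sWord As String)
    Dim vPos As Variant
    Dim i As Long

    If Not dicWords.Exists(sWord) Then
        Debug.Print sWord & ": not found"
        Exit Sub
    End If

    vPos = dicWords.Item(sWord)
    For i = LBound(vPos) To UBound(vPos)
        ' stored value is the delimiter index; subtract the length for the start
        Debug.Print vPos(i) - Len(sWord)
    Next i
End Sub
```

Note that a Scripting.Dictionary compares keys case-sensitively by default, which is why "King" and "king" are listed separately in the output. Setting dicWords.CompareMode = TextCompare before the first word is added should merge them, if that is preferred.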
Greg Maxey - 08 Mar 2008 14:21 GMT
Klaus,
Nice job!
I don't understand this part:
> ' Or, if you want to define what characters can appear in words explicitly
> ' (and ignore the rest):
> ' Case "a" To "z", "@", "~", _
> ' "$", "%", "§", "&", "A" To "Z", _
> ' "0" To "9", "'", "-", "_"
Can you post (or e-mail me) exactly how the code would look using this
option, along with an example? Thanks.

Raj - 10 Mar 2008 05:20 GMT
Hi Klaus,
It worked. Thanks Klaus.
Regards,
Raj
Klaus Linke - 10 Mar 2008 16:55 GMT
> Nice job!
Oy, thanks!!
> I don't understand this part:
>> ' Or, if you want to define what characters can appear in words explicitly
>> ' (and ignore the rest):
>> ' Case "a" To "z", "@", "~", _
>> ' "$", "%", "§", "&", "A" To "Z", _
>> ' "0" To "9", "'", "-", "_"
You would replace the "Case Else" condition with this condition.
Only the characters listed would make it into "words".
All other characters would simply be removed (... nothing is done if they
are encountered).
Say I take the original example sentence:
Word_this_and that KK857 9269875 King king King
and use
[...]
Case "a" To "z", "A" To "Z"
sWord = sWord & sChar
End Select
(notice that the numbers 0 to 9, and the underscore "_", neither appear
under the word delimiters now, nor under the word characters), I'd get
Wordthisand 1 3 /
that 1 15 /
KK 1 23 /
King 2 35 / 46 /
king 1 41 /
Regards,
Klaus
Klaus Linke - 10 Mar 2008 17:00 GMT
> Wordthisand 1 **3** /
... shows I haven't debugged this: If I ignore characters, I can't tell the
start position reliably any more.
Klaus
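One possible way around the start-position problem is to remember the index of the first character that is kept for each word, instead of deriving the start from the delimiter index and the word length. The following is an untested sketch along those lines, not the original macro: the Sub name and the variable lStart are introduced here, only the letters a-z and A-Z count as word characters, and a reference to "Microsoft Scripting Runtime" is assumed.

```vba
Sub WordFrequencyWhitelist()
    Dim sText As String
    Dim dicWords As New Scripting.Dictionary
    Dim vItem As Variant, vWord As Variant
    Dim i As Long, lStart As Long
    Dim sWord As String, sChar As String

    ' The appended space makes sure the last word is flushed.
    sText = ActiveDocument.Content.Text & " "
    For i = 1 To Len(sText)
        sChar = Mid(sText, i, 1)
        Select Case sChar
            ' word characters: everything not listed here or below is skipped
            Case "a" To "z", "A" To "Z"
                If Len(sWord) = 0 Then lStart = i   ' first kept character
                sWord = sWord & sChar
            ' delimiters (same list as in the original macro)
            Case ChrW(9), ChrW(10), ChrW(13), ChrW(32), ChrW(34), _
                 ".", ",", "<", ">", "/", ";", "(", ")", "[", "]", _
                 ":", "#", "+", "*", "?", "!"
                If Len(sWord) <> 0 Then
                    If dicWords.Exists(sWord) Then
                        vItem = dicWords.Item(sWord)
                        ReDim Preserve vItem(UBound(vItem) + 1)
                        vItem(UBound(vItem)) = lStart
                        dicWords.Item(sWord) = vItem
                    Else
                        dicWords.Add sWord, Array(lStart)
                    End If
                    sWord = ""
                End If
        End Select
    Next i

    ' output: word, count, then the recorded start positions
    For Each vWord In dicWords.Keys
        vItem = dicWords.Item(vWord)
        Debug.Print vWord, UBound(vItem) - LBound(vItem) + 1,
        For i = LBound(vItem) To UBound(vItem)
            Debug.Print vItem(i); "/";
        Next i
        Debug.Print
    Next vWord
End Sub
```

With this approach "Wordthisand" should be reported as starting at the position of its first kept character, even though the underscores in between are dropped. The reported length of such a word no longer matches the span it covers in the document, so a caller that needs the end position would have to store that separately.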