MS Office Forum / Word / General MS Word Questions / April 2005

using Find to find all the special characters in a document

Abbey of Farfa - 14 Apr 2005 22:05 GMT
I need to make a list of all the special (European) foreign-language
characters in a 70-page document. The characters in the document are many and
varied. Does anyone know an easy way to find them all and generate such a
list?
The Monk

Steven Marzuola - 15 Apr 2005 02:27 GMT
> I need to make a list of all the special (European) foreign-language
> characters in a 70-page document. The characters in the document are many and
> varied. Does anyone know an easy way to find them all and generate such a
> list?

I'm mainly curious: why do you need it?  I write in Spanish every
day, and it has these characters that are not in English: áéíóúñü.
I have never needed to count them.

Here's my first thought on how to approach it.  There's probably
something better.

1. Use a copy of the document.  Do a wildcard search for all the
English characters, and delete them all.

2. Also delete all punctuation marks and other characters.

What's left should be what you want.  There are several ways to
eliminate the duplicates, but none as simple as the above.

Steven
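
The "strip out the English characters and keep what's left" idea can also be done on a string in VBA, which takes care of the duplicates at the same time. The following is only a rough sketch, not something posted in this thread: the names CountNonAscii and ListSpecialCharacters are made up for illustration, "special" is assumed to mean any character code above 127, and it assumes Word on Windows, where Scripting.Dictionary is available.

```vba
' Sketch only: CountNonAscii and ListSpecialCharacters are hypothetical
' names, not macros from this thread.
' Assumes Word on Windows (Scripting.Dictionary is not available on the Mac).

' Returns a dictionary mapping each character above code 127 to its count.
Function CountNonAscii(ByVal strText As String) As Object
    Dim dictCounts As Object
    Dim strChar As String
    Dim lngCode As Long
    Dim i As Long

    Set dictCounts = CreateObject("Scripting.Dictionary")
    For i = 1 To Len(strText)
        strChar = Mid$(strText, i, 1)
        lngCode = AscW(strChar)
        ' AscW returns a signed Integer; make codes above &H7FFF positive.
        If lngCode < 0 Then lngCode = lngCode + 65536
        ' Skip plain ASCII (codes 0 to 127); count everything else.
        If lngCode > 127 Then
            dictCounts(strChar) = dictCounts(strChar) + 1
        End If
    Next i
    Set CountNonAscii = dictCounts
End Function

' Writes one line per distinct character (hex code, character, count)
' into a new document and sorts the lines.
Sub ListSpecialCharacters()
    Dim dictCounts As Object
    Dim varKey As Variant
    Dim lngCode As Long
    Dim strOut As String
    Dim docList As Document

    Set dictCounts = CountNonAscii(ActiveDocument.Content.Text)
    If dictCounts.Count = 0 Then
        MsgBox "No characters above code 127 found."
        Exit Sub
    End If

    For Each varKey In dictCounts.Keys
        lngCode = AscW(varKey)
        If lngCode < 0 Then lngCode = lngCode + 65536
        strOut = strOut & Right$("0000" & Hex$(lngCode), 4) & vbTab _
            & varKey & vbTab & dictCounts(varKey) & vbCr
    Next varKey
    ' Drop the trailing paragraph mark so no empty line is added.
    strOut = Left$(strOut, Len(strOut) - 1)

    Set docList = Documents.Add
    docList.Content.Text = strOut
    docList.Content.Sort
End Sub
```

ActiveDocument.Content covers only the main text of the document, so characters that occur only in footnotes, headers, or text boxes would not show up in this list.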
Beth - 16 Apr 2005 18:59 GMT
How about doing a search and replace (with nothing) for
the 26 non-European characters (plus punctuation marks),
and then sorting.  You could make a macro to do this if
you have to do it often.  Or do you need to know the
exact locations of the letters in the document?

Can you tell me how you make the international "ch" (with
what looks like a little V above it in Czech)?  
WordPerfect has the symbol, but it won't embed in Acrobat.

Beth

>-----Original Message-----
>I need to make a list of all the special (European) foreign-language
>characters in a 70-page document. The characters in the document are many and
>varied. Does anyone know an easy way to find them all and generate such a
>list?
Klaus Linke - 19 Apr 2005 04:55 GMT
> I need to make a list of all the special (European) foreign-language
> characters in a 70-page document. The characters in the document are many and
> varied. Does anyone know an easy way to find them all and generate such a
> list?

Hi Abbey,

The macro below creates such a list, with the code, symbol, and how often it occurs.

The main aim of the macro is to create a text file with the document's text, and &#xNNNN; numeric character references instead of "upper Unicode" characters.
If you don't need or want that, you can delete (or comment out) the line
 ActiveDocument.Content.Text = strOld

Regards,
Klaus

Sub Unicode2TagsUnformatted()
 '
 ' tags special characters
 ' as &#xHHHH; in text files
 '
 ' run SymbolsUnprotect first!
 ' (SymbolsUnprotect is a separate macro, not included in this post)
 '
 ' might run into problems with large files
 ' (memory for string allocation)
 Dim Code As Long
 Dim flagReplace As Boolean
 Dim flagDecorative As Boolean
 Dim numCharsRemaining As Long
 Dim numCharsAll As Long
 Dim percentCharsProcessed As Single
 Dim strTag As String
 Dim strOutList As String
 Dim strOutChar As String
 Dim strTest As String
 Dim strOld As String
 ' decide on the kinds of text you want to include:
 With ActiveDocument.Content.TextRetrievalMode
   .IncludeFieldCodes = True
   .IncludeHiddenText = True
 End With
 StatusBar = "reading document into string..."
 strTest = ActiveDocument.Content.Text
 strOld = strTest
 numCharsAll = Len(strTest)
 Do
   ' update character count:
   numCharsRemaining = Len(strTest)
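   ' Int2Long is a helper that is not included in this post; it needs to
   ' turn AscW's signed Integer result into a value from 0 to 65535.
   Code = Int2Long(AscW(Left(strTest, 1)))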
   '   Which characters should be tagged?
   flagReplace = False
   flagDecorative = False
   Select Case Code
     Case 30 To 31
       ' (protected hyphen, optional hyphen)
       flagReplace = True
     Case 128 To 159
       ' shouldn't really appear
       flagReplace = True
       MsgBox "Unexpected!", , Code
     Case 160
       ' (if you want to tag protected spaces)
       flagReplace = True
     Case 161 To 254
       ' Don't really need to be tagged in most cases
       ' (if you transport from PC to Mac in RTF format)
       flagReplace = True
       ' Remove these characters from the test string:
       strTest = Replace(strTest, ChrW(Code), "", , , _
       vbBinaryCompare)
     Case 255 To &HFFFF&
       ' regular "upper Unicode" characters
       flagReplace = True
     Case Else
       strTest = Replace(strTest, ChrW(Code), "", _
       , , vbBinaryCompare)
'        percentCharsProcessed = _
'        100 * (numCharsAll - Len(strTest)) / numCharsAll
'        StatusBar = "tagging special chars " _
'        & Str(percentCharsProcessed) _
'        & "% completed"
   End Select
   If (flagReplace = True) Then
     ' build the tag:
     strTag = Trim(Hex(Code))
     While Len(strTag) < 4
       strTag = "0" + strTag
     Wend
     strTag = "&#x" + strTag + ";"
     strOutChar = strTag & vbTab & ChrW(Code)
     ' Replace all characters with this code:
     strOld = Replace(strOld, ChrW(Code), strTag, , , _
     vbBinaryCompare)
     ' Remove these characters from the test string:
     strTest = Replace(strTest, ChrW(Code), "", , , _
     vbBinaryCompare)
     ' For the list that's going to be added:
     strOutChar = strOutChar & vbTab _
     & str(numCharsRemaining - Len(strTest))
     strOutList = strOutList & strOutChar & vbCrLf
     percentCharsProcessed = _
     100 * (numCharsAll - numCharsRemaining) / numCharsAll
     StatusBar = "tagging special chars " _
     & str(percentCharsProcessed) & "% completed"
   End If
 Loop Until Len(strTest) = 0
 Documents.Add
 ActiveDocument.Content.Text = strOld
 ActiveDocument.Content.Style = wdStyleNormal
 Selection.EndKey unit:=wdStory
 Selection.Collapse (wdCollapseEnd)
 Selection.TypeParagraph
 Selection.TypeText ("Special characters:")
 Selection.TypeParagraph
 Selection.InsertAfter strOutList
 If Selection.Type = wdSelectionNormal Then
   Selection.Sort
 End If
End Sub
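
The macro calls Int2Long, which is not part of the post, so it will not compile as it stands. Below is a minimal stand-in, assuming the helper's only job is to undo the sign wrap-around of AscW (AscW returns a signed 16-bit Integer, so codes above &H7FFF come back negative). This is a guess at what the original helper does, not Klaus's actual code.

```vba
' Hypothetical stand-in for the Int2Long helper the macro expects.
' The original was not posted, so its behavior here is an assumption.
Function Int2Long(ByVal intValue As Integer) As Long
    ' AscW returns a signed Integer (-32768 to 32767).
    ' Shift negative results into the unsigned range 32768 to 65535.
    If intValue < 0 Then
        Int2Long = CLng(intValue) + 65536
    Else
        Int2Long = intValue
    End If
End Function
```

Paste it into the same module as Unicode2TagsUnformatted. With it in place, a character such as U+FB01 is reported with code FB01 rather than as a negative number that falls through to Case Else.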
 