MS Office Forum / Word / Programming / March 2007
Deleting common words?
StargateFanFromWork - 26 Mar 2007 18:40 GMT I don't know how else to do this, so thought I'd ask. To delete common words from word lists that I'm creating, is this the best approach?:
********************************************************************************
Sub DeleteCommonWords()
    With ActiveDocument.Range.Find
        .Execute FindText:="^phe^p", ReplaceWith:=" ", Replace:=wdReplaceAll
    End With
End Sub
********************************************************************************
This has only one example, of course, but before I start adding common words to the list of things to delete, thought I'd run it by the group. Perhaps there's an infinitely better way to do this?
TIA. :oD
Greg Maxey - 26 Mar 2007 19:11 GMT I suppose I would use an array and a For ... Next statement:
Sub ScratchMacro()
    Dim myArray() As String
    Dim i As Long
    Dim oRng As Word.Range
    ' No spaces after the commas, so no array element carries a leading space
    myArray = Split("a,an,for,of,or,can,but,the", ",")
    Set oRng = ActiveDocument.Range
    For i = 0 To UBound(myArray)
        With oRng.Find
            .Text = myArray(i)
            .Replacement.Text = ""
            .MatchWholeWord = True
            .Execute Replace:=wdReplaceAll
        End With
    Next i
End Sub
> I don't know how else to do this, so thought I'd ask. To delete common
> words from word lists that I'm creating, is this the best approach?:
[quoted text clipped - 14 lines]
>
> TIA. :oD
Greg Maxey - 26 Mar 2007 19:26 GMT On second thought I would use my VBA find and replace addin: http://gregmaxey.mvps.org/VBA_Find_And_Replace.htm
> I don't know how else to do this, so thought I'd ask. To delete common
> words from word lists that I'm creating, is this the best approach?:
[quoted text clipped - 14 lines]
>
> TIA. :oD
StargateFanFromWork - 26 Mar 2007 21:32 GMT
> On second thought I would use my VBA find and replace addin: http://gregmaxey.mvps.org/VBA_Find_And_Replace.htm
This one seems manual to me, correct? It seems more like a GUI to enhance the search and replace box that comes natively with Word, if I'm not mistaken.
Actually, I've continued working on the script and have quite a list of words entered. As long as it's okay to use the syntax below, I think that this is best. It's for text that I bring into Word to make wordlists out of. Granted, it isn't something I'll do more than once or twice a year once I have my main lists completed, but if it's always ready to go, it'll be an enormous time-saver. I won't have to type in the words to replace; the list is always ready to be used.
I have general dictionary word lists that already have common words in them, so don't need to keep those in future, themed or specific word lists. So they can just be automatically removed.
Since this is a macro that is loaded after cleaning up the text into a single column of single words, that's why there are so many ^p symbols below. A whole bunch of carriage returns were getting left behind <g>:
.Execute FindText:="^pa^p", ReplaceWith:="^p", Replace:=wdReplaceAll
.Execute FindText:="^pan^p", ReplaceWith:="^p", Replace:=wdReplaceAll
.Execute FindText:="^pand^p", ReplaceWith:="^p", Replace:=wdReplaceAll
.Execute FindText:="^pask^p", ReplaceWith:="^p", Replace:=wdReplaceAll
.Execute FindText:="^pat^p", ReplaceWith:="^p", Replace:=wdReplaceAll
.Execute FindText:="^pbut^p", ReplaceWith:="^p", Replace:=wdReplaceAll
.Execute FindText:="^pby^p", ReplaceWith:="^p", Replace:=wdReplaceAll
Amazing how well this works. It takes time, of course, but it's been deleting basic unnecessary words on a list that started out with 1500 "pages" and has brought it down to 777 "pages" with very little effort on my part. Each time I'll scroll through to see what other words could be added to the macro and then run the macro through again. It takes it only a few moments to go through the list. Much quicker than I've been doing it, at any rate <lol>. Then the words that are left can be gone through manually.
Thanks. :oD
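A note on the "^pword^p" pattern used above: as far as I can tell, it cannot match a word on the very first line of the document (nothing precedes it), and when the same common word sits on two consecutive lines, one pass may catch only the first, because the match consumes the paragraph mark the second one would need. Re-running the macro, as described, works around the second case. A rough sketch of a different technique that avoids both: read the whole list, filter it against a lookup table, and write it back. It assumes a plain, unformatted one-word-per-line list (any formatting is lost) and Windows (for Scripting.Dictionary); the macro name and the short word list are only illustrative.

```vba
Sub RemoveCommonWordsFromList()
    ' Sketch only: rewrites the whole document as plain text.
    Dim stopWords As Object
    Dim commonWords() As String
    Dim listLines() As String
    Dim kept() As String
    Dim i As Long, n As Long
    Dim w As String

    ' Build a case-insensitive lookup of the words to drop
    Set stopWords = CreateObject("Scripting.Dictionary")
    stopWords.CompareMode = vbTextCompare
    commonWords = Split("a,an,and,ask,at,but,by", ",")
    For i = LBound(commonWords) To UBound(commonWords)
        stopWords(commonWords(i)) = True
    Next i

    ' One array element per paragraph (Word separates paragraphs with vbCr)
    listLines = Split(ActiveDocument.Range.Text, vbCr)
    ReDim kept(0 To UBound(listLines))
    n = 0
    For i = LBound(listLines) To UBound(listLines)
        w = Trim$(listLines(i))
        ' Keep the line only if it is non-empty and not a common word
        If Len(w) > 0 And Not stopWords.Exists(w) Then
            kept(n) = w
            n = n + 1
        End If
    Next i

    If n > 0 Then
        ReDim Preserve kept(0 To n - 1)
        ActiveDocument.Range.Text = Join(kept, vbCr)
    Else
        ActiveDocument.Range.Text = ""
    End If
End Sub
```

This also drops the leftover empty lines in the same pass. Try it on a copy of the list first, since it replaces the document's entire contents.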
On Mar 26, 1:40 pm, "StargateFanFromWork" <NoS...@NoJunk.com> wrote:
> I don't know how else to do this, so thought I'd ask. To delete common
> words from word lists that I'm creating, is this the best approach?:
[quoted text clipped - 16 lines]
>
> TIA. :oD
macropod - 27 Mar 2007 07:11 GMT Hi StargateFanFromWork,
If you take the macro Graham posted, you can soon modify it to suit your needs. Plus, it's much simpler to maintain than what you've now got. For example, the following uses your Find/Replace structure with Graham's code and the 500 most common words in the English language (per http://www.world-english.org/english500.htm), in order of frequency:
Sub ScratchMacro()
    Dim myArray() As String
    Dim i As Long
    Dim oRng As Word.Range
    myArray = Split("the,of,to,and,a,in,is,it,you,that,he,was,for,on,are,with,as,I,his,they,be,at,one,have,this," & _
        "from,or,had,by,hot,but,some,what,there,we,can,out,other,were,all,your,when,up,use,word,how,said,an,each,she," & _
        "which,do,their,time,if,will,way,about,many,then,them,would,write,like,so,these,her,long,make,thing,see,him,two," & _
        "has,look,more,day,could,go,come,did,my,sound,no,most,number,who,over,know,water,than,call,first,people,may," & _
        "down,side,been,now,find,any,new,work,part,take,get,place,made,live,where,after,back,little,only,round,man,year," & _
        "came,Show,every,good,Me,give,our,under,Name,very,through,just,form,much,great,think,say,Help,low,Line,Before," & _
        "turn,cause,same,mean,differ,move,right,boy,old,too,does,tell,sentence,set,three,want,air,well,also,play,small," & _
        "end,put,home,read,hand,port,large,spell,add,even,land,here,must,big,high,such,follow,act,why,ask,men,change," & _
        "went,light,kind,off,need,house,picture,try,us,again,animal,point,mother,world,near,build,self,earth,father," & _
        "head,stand,own,page,should,country,found,answer,school,grow,study,still,learn,plant,cover,food,sun,four,thought," & _
        "let,keep,eye,never,last,door,between,city,tree,cross,since,hard,Start,might,story,saw,far,sea,draw,Left,late," & _
        "Run,don't,while,press,close,night,real,life,few,stop,open,seem,together,next,white,children,begin,got,walk," & _
        "example,ease,paper,often,always,music,those,both,mark,book,letter,until,mile,river,car,feet,care,second,group," & _
        "carry,took,rain,eat,room,friend,began,idea,fish,mountain,north,once,base,hear,horse,cut,sure,watch,color,face," & _
        "wood,main,enough,plain,girl,usual,young,ready,above,ever,red,list,though,feel,talk,bird,soon,body,dog,family," & _
        "direct,pose,leave,song,measure,state,product,black,short,numeral,class,wind,question,happen,complete,ship,area," & _
        "half,rock,order,fire,south,problem,piece,told,knew,pass,farm,Top,whole,king,Size,heard,best,Hour,better,True," & _
        "during,hundred,am,remember,step,early,hold,west,ground,interest,reach,fast,five,sing,listen,six,Table,travel,less," & _
        "morning,ten,simple,several,vowel,toward,war,lay,against,Pattern,slow,center,love,person,money,serve,appear,road," & _
        "map,science,rule,govern,pull,cold,notice,voice,fall,power,town,fine,certain,fly,unit,lead,cry,dark,machine,note," & _
        "wait,plan,figure,star,box,noun,field,rest,correct,able,pound,done,beauty,drive,stood,contain,front,teach,week,final," & _
        "gave,green,oh,quick,develop,sleep,warm,free,minute,strong,special,mind,behind,clear,tail,produce,fact,street,inch," & _
        "lot,nothing,course,stay,wheel,full,force,blue,object,decide,surface,deep,moon,island,foot,yet,busy,test,record," & _
        "boat,common,gold,possible,plane,age,dry,wonder,laugh,thousand,ago,ran,check,game,shape,yes,hot,miss,brought,heat," & _
        "snow,bed,bring,sit,perhaps,fill,east,weight,language,among", ",")
    Set oRng = ActiveDocument.Range
    ' Each word is matched as a whole line (^p on both sides) and replaced by a single paragraph mark
    For i = 0 To UBound(myArray)
        With oRng.Find
            .Text = "^p" & myArray(i) & "^p"
            .Replacement.Text = "^p"
            .Execute Replace:=wdReplaceAll
        End With
    Next i
End Sub
If you want to add more words, simply insert them with a preceding ',' into the array after 'among'.
See http://www.paulnoll.com/Books/Clear-English/English-3000-common-words.html if you want to extend the array to 3000 'american english' words.
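One caveat when growing the list that far: VBA allows only a limited number of line continuations in a single statement (24, as far as I know), and the Split statement in the macro above is already about at that limit. A possible way around it, sketched below under assumptions, is to keep the words in a plain text file, one word per line, and loop over the file instead of an array. The file path and macro name are hypothetical, and the file is assumed to be ANSI text with Windows line endings.

```vba
Sub DeleteCommonWordsFromFile()
    ' Hypothetical path: point this at your own list, one word per line
    Const WORD_FILE As String = "C:\Temp\commonwords.txt"
    Dim fileNum As Integer
    Dim oneWord As String
    Dim oRng As Word.Range

    Set oRng = ActiveDocument.Range
    fileNum = FreeFile
    Open WORD_FILE For Input As #fileNum
    Do While Not EOF(fileNum)
        Line Input #fileNum, oneWord
        oneWord = Trim$(oneWord)
        If Len(oneWord) > 0 Then
            With oRng.Find
                ' Find settings persist between searches, so reset the ones that matter
                .ClearFormatting
                .Replacement.ClearFormatting
                .MatchWildcards = False
                .MatchCase = False
                .Text = "^p" & oneWord & "^p"
                .Replacement.Text = "^p"
                .Execute Replace:=wdReplaceAll
            End With
        End If
    Loop
    Close #fileNum
End Sub
```

With this layout, adding a word means adding a line to the text file; the macro itself does not change.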
Cheers
macropod
[MVP - Microsoft Word]
-------------------------
> On second thought I would use my VBA find and replace addin:
> http://gregmaxey.mvps.org/VBA_Find_And_Replace.htm
[quoted text clipped - 47 lines]
>>
>> TIA. :oD
Graham Mayor - 27 Mar 2007 07:55 GMT While 'Greg' is often used as an abbreviation for 'Graham' - in this case it isn't ;)
<>>< ><<> ><<> <>>< ><<> <>>< <>><<>
Graham Mayor - Word MVP
My web site www.gmayor.com Word MVP web site http://word.mvps.org <>>< ><<> ><<> <>>< ><<> <>>< <>><<>
> Hi StargateFanFromWork,
>
[quoted text clipped - 133 lines]
>>>
>>> TIA. :oD
macropod - 27 Mar 2007 08:02 GMT Aargh! sorry guys.
macropod
[MVP - Microsoft Word]
-------------------------
> While 'Greg' is often used as an abbreviation for 'Graham' - in this case it isn't ;)
>
[quoted text clipped - 129 lines]
>>>>
>>>> TIA. :oD
StargateFan - 27 Mar 2007 11:01 GMT
>Hi StargateFanFromWork,
>
[quoted text clipped - 47 lines]
>
>Cheers
Thank you! That is marvellous. I will most definitely try it out. That's why I posted the question. The search and replace I just learned didn't seem the most efficient way to do what I'm trying to do with the clean-up macro, but it was all I knew how to do. Will give it a whirl.
Cheers. :oD