MS Office Forum / Word / Programming / March 2007
Changes in Word2007 object model (explanation of performance changes, read "degraded")
Greg Maxey - 12 Mar 2007 01:11 GMT
Fellow MVP Helmut Weber and I were comparing the efficiency of our macros for deleting duplicate paragraphs from a document.
In this discussion (and tests) I have learned that:
1) My macro is considerably faster than his using Word2003.
2) Helmut's is considerably faster than mine using Word2007.
3) Both are considerably faster in Word2003 than in Word2007.
Assuming Word2007 is a better product than its predecessor, it seems that at the very least both would be faster in Word2007! I am interested if anyone knows of changes to the object model that would account for the degraded performance of both methods, and for the reversal in which method is faster between Word2003 and Word2007.
Jezebel or Steve Hudson, I don't remember which, once explained to me how using Do ... Loop Until x Is Nothing was faster than a For Each ... Next x. The results in our tests bear that out in Word2003, but it seems the Do ... Loop Until x Is Nothing method has taken the slow road in Word2007.
Here is a sample of results observed using 201 and 801 paragraphs. The first 200/800 paragraphs are unique. The last paragraph (201/801) is a duplicate of the 200/800th paragraph.
Word 2003           200          800
My Method           4.375 sec    68.68 sec
Helmut's Method     7.48 sec     120.81 sec

Word 2007           200          800
My Method           14.42 sec    9.98 sec
Helmut's Method     229.56 sec   158.44 sec
If you are interested in running your own comparisons, here is the code for generating the test paragraphs, my code, and Helmut's code.
Sub BuildTestParagraphs()
    Dim oRng As Word.Range
    Dim i As Long
    Set oRng = ActiveDocument.Range
    oRng.Delete
    For i = 1 To 200
        If i = 1 Then
            oRng.InsertAfter "The quick brown fox jumped over the lazy dog." & vbCr
        Else
            oRng.InsertAfter "The quick and extremely agile brown fox" _
                & " jumped over " & i & " the lazy dogs." & vbCr
        End If
    Next i
    oRng.InsertAfter "The quick and extremely agile brown fox" _
        & " jumped over " & i - 1 & " the lazy dogs."
End Sub

Sub GregsKillDuplicates()
    Dim eTime As Single
    Dim oParRef As Paragraph
    Dim oParChk As Paragraph
    eTime = Timer
    Set oParRef = ActiveDocument.Range.Paragraphs(1)
    Set oParChk = ActiveDocument.Range.Paragraphs(2)
    Do
        Do
            'An empty last paragraph may throw an error on the last loop.
            On Error GoTo Err_Exit
            If oParRef.Range = oParChk.Range Then
                oParChk.Range.Delete
            Else
                Set oParChk = oParChk.Next
            End If
        Loop Until oParChk Is Nothing
        Set oParRef = oParRef.Next
        On Error Resume Next
        Set oParChk = oParRef.Next
        On Error GoTo 0
    Loop Until oParRef Is Nothing
Err_Exit:
    MsgBox Timer - eTime
End Sub

Sub HelmutsKillParagraphs()
    'AKA Makro6x
    Dim t As Single
    t = Timer
    Dim prg1 As Paragraph
    Dim prg2 As Paragraph
    For Each prg1 In ActiveDocument.Range.Paragraphs
        For Each prg2 In ActiveDocument.Range.Paragraphs
            If prg1.Range.Text = prg2.Range.Text Then
                If prg1.Range.Start <> prg2.Range.Start Then
                    prg2.Range.Delete
                End If
            End If
        Next
    Next
    MsgBox Timer - t
End Sub
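Both macros mix three things: walking the paragraphs, comparing text, and deleting. One way to see whether the traversal style alone accounts for the difference is to time a bare walk with no comparison or deletion. The following is only a sketch along those lines; the name TimeTraversalOnly is made up for illustration and was not part of the original tests.

```vba
' Hypothetical helper (not from the original tests): times a bare walk over
' the paragraphs in each style, with no text comparison and no deletion.
Sub TimeTraversalOnly()
    Dim t As Single
    Dim oPar As Paragraph
    Dim n As Long

    ' Style 1: follow Paragraph.Next until it returns Nothing.
    t = Timer
    n = 0
    Set oPar = ActiveDocument.Paragraphs(1)
    Do
        n = n + 1
        Set oPar = oPar.Next
    Loop Until oPar Is Nothing
    Debug.Print "Next walk:", n, Timer - t

    ' Style 2: enumerate the Paragraphs collection with For Each.
    t = Timer
    n = 0
    For Each oPar In ActiveDocument.Paragraphs
        n = n + 1
    Next oPar
    Debug.Print "For Each:", n, Timer - t
End Sub
```

Run it against the document built by BuildTestParagraphs in each Word version and read the results in the Immediate window. A single pass over a few hundred paragraphs may be too quick for Timer to resolve well, so a larger document or several repeated passes may be needed before the two styles can be told apart.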
 Signature Greg Maxey/Word MVP See: http://gregmaxey.mvps.org/word_tips.htm For some helpful tips using Word.
Karl E. Peterson - 12 Mar 2007 22:32 GMT
> Assuming Word2007 is a better product than its predecessor, it seems that
> at the very least both would be faster in Word2007!
Hmmmm, an odd conclusion at that, and founded on an assumption unsupported by facts in evidence.
Is there anything, other than marketing, that would lead you to either? If anything, history refutes the conclusion outright.
 Signature .NET: It's About Trust! http://vfred.mvps.org
Greg Maxey - 12 Mar 2007 22:56 GMT
Karl,
That "assuming" remark was a thin veil for an increasing dislike of Word2007 ;-)
 Signature Greg Maxey/Word MVP See: http://gregmaxey.mvps.org/word_tips.htm For some helpful tips using Word.
Karl E. Peterson - 12 Mar 2007 23:33 GMT
> That "assuming" remark was a thin veil for an increasing dislike of Word2007
> ;-)
Ahhhhh... Reads much better with the sarcasm duly noted. <g>
Sorry to hear of your findings, but not at all surprised. Take VB.NET (please! <g>) for example. Sucks hind tit in every way, when held up against what was. MSFT's answer to crappy performance? Nothing that a quick trip to "Bigger Hammer Hardware" won't solve!
 Signature .NET: It's About Trust! http://vfred.mvps.org
Tony Jollans - 19 Mar 2007 00:42 GMT
Hi Greg,
I find the results you post odd - especially Helmut's method being faster for 800 than 200 paragraphs in 2007 - and not consistent with the statements about which is faster in each case. Some quick tests gave me different, but still dramatic, results:
Word 2003           200 paras      800 paras
Greg's way          2.45 secs      42.09 secs
Helmut's way        4.42 secs      76.03 secs

Word 2007           200 paras      800 paras
Greg's way          214.17 secs    forever and a day
Helmut's way        6.53 secs      95.05 secs
There are many factors that affect performance. In this case I wonder (a complete guess, I'll grant) if hardware is significant - might there be some optimisation for dual core processors that isn't quite right? Do you have a dual core chip? Does Word 2007 or Windows Vista do any such optimisation?
I am not surprised that Word 2007 is slower than Word 2003. I have no idea what might be causing it in this particular case but with documents becoming ever more complex and the reasonable assumption that users of newer software have, in general, newer and faster machines I don't see that as a big issue, per se.
The dramatic differences do, however, suggest that something is amiss regardless of any particular issues related to this particular process and I will try and dig some more when I've caught up from being at Summit (sorry I didn't get to meet you there, btw).
 Signature Enjoy, Tony
Greg Maxey - 19 Mar 2007 12:54 GMT
Tony,
My data was skewed and you're right, it didn't make much sense.
Actually my results were more like this:
Word 2003           200          800
My Method           4.375 sec    68.68 sec
Helmut's Method     7.48 sec     120.81 sec

Word 2007           200          800
My Method           14.42 sec    229.56 sec
Helmut's Method     9.8 sec      158.44 sec
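A rough check on these figures, offered as a sketch rather than a diagnosis: both macros compare each paragraph against every other paragraph, so the work should grow roughly with the square of the paragraph count. Going from 200 to 800 paragraphs is 4 times the paragraphs, which would predict about 16 times the run time. The routine below (CheckScaling is an illustrative name) just divides the 800-paragraph time by the 200-paragraph time for each row of the table.

```vba
' Illustrative helper: prints the 800-paragraph / 200-paragraph time ratio
' for each row of the corrected results table.
Sub CheckScaling()
    Dim labels As Variant, t200 As Variant, t800 As Variant
    Dim i As Long

    ' Timings in seconds, copied from the table above.
    labels = Array("2003 My Method", "2003 Helmut's Method", _
                   "2007 My Method", "2007 Helmut's Method")
    t200 = Array(4.375, 7.48, 14.42, 9.8)
    t800 = Array(68.68, 120.81, 229.56, 158.44)

    For i = LBound(labels) To UBound(labels)
        Debug.Print labels(i), Format(t800(i) / t200(i), "0.0")
    Next i
End Sub
```

All four ratios come out close to 16, which fits quadratic growth in both versions. If that reading is right, Word2007 is slower by a roughly constant factor per operation for each method, rather than scaling worse as the document grows.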
The gist of the post was that my method is faster in 2003, Helmut's is faster in 2007, and both are faster in 2003 than in 2007. Other than being interested in what changes caused the differences, the main question is why it takes a new and improved product longer to perform the same task.
I don't know if I have a dual processor or not. I have a Dell Dimension 8400 which, when purchased 3 years ago, I think was near cutting edge. Today it is probably overdue for the dustbin.
Thanks.
Tony Jollans - 19 Mar 2007 13:48 GMT
Thank you, Greg, it makes more sense now.
Clearly something has changed and I can't, for the moment at least, explain it. The new and improved product has new features which, inevitably, add processor load so, as I think I said, slightly worse performance across the board does not surprise, or concern, me. The different relative performance interests me and the extreme nature of my results does concern me.
If your computer is three years old it does not have a dual processor - I don't know if it's significant.
 Signature Enjoy, Tony