MS Office Forum / Word / Web Authoring / February 2004
Converting .doc to .html
George - 24 Dec 2003 00:53 GMT
In the past I used Word97 and had no problems using the "save as" option to convert my document to an HTML file.
I now have Word2002 and it is easy to use the "save as web page" feature, but I've noticed a big difference.
I can use the same document and save it as HTML using Word97 and it creates an HTML file which is 26k in size. However, if I create the HTML file with Word2002 I end up with an HTML file which is over 114k!
How can I configure Word2002 to NOT create those huge HTML files? Obviously a file that is over 4 times larger will also take 4 times longer to load and display. I use a dialup connection and I'm sure users of my website are also on dialups, so I really want the smaller HTML file.
I've tried using the Word2002 option to select which browser to support, but it doesn't help. All the choices I've tried still create a file 4 times larger than Word97 did.
I can't see where Word2002 can be considered better when it can't create files as compact as Word97.
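The size-versus-load-time point in this post can be checked with a little arithmetic. The sketch below is only a rough estimate: it assumes a nominal 56 kbit/s dial-up link (a figure not given in the thread), treats "k" as 1024 bytes, and ignores modem compression, protocol overhead and rendering time. The function name is made up for illustration.

```python
def download_seconds(size_kb: float, link_kbps: float = 56.0) -> float:
    """Rough transfer time for a file of size_kb kilobytes over a link_kbps link."""
    bits = size_kb * 1024 * 8          # file size in bits (1 KB taken as 1024 bytes)
    return bits / (link_kbps * 1000)   # link speed in bits per second

# File sizes quoted in the post; the 56 kbit/s link speed is an assumption.
word97 = download_seconds(26)
word2002 = download_seconds(114)

print(f"Word97 page:   {word97:.1f} s")
print(f"Word2002 page: {word2002:.1f} s")
print(f"Ratio:         {word2002 / word97:.1f}x")
```

Under those assumptions the ratio of transfer times equals the ratio of file sizes, which is the "over 4 times longer" figure in the post.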
Bob Buckland ?:-) - 24 Dec 2003 03:19 GMT
Hi George,
Word 97 used an update of the add-in created in Word v6 (94), and it was very limited in being able to reproduce a Word document as a web page. Word 2002 can produce a web page without loss of data, even when the browsers can't always display it.
You can use File=>Save As=>Web Page-Filtered, along with the settings in Tools=>Options=>General=>Web Options, to produce a more compact page than that of 'regular' File=>Save as Web Page. The file will not be as small as the Word 97 one, but it will have more current capabilities, including tables, color choices, font support, CSS (style sheets), etc., that weren't available in Word 97 or earlier.
MS Word's sister program MS Office FrontPage is targeted at producing websites, and to some extent so is MS Office Publisher. MS Word's primary target is to let someone with no knowledge of web sites or HTML, other than File=>Save As, produce a web version of a Word .doc file.
I hope this helps you,
Bob Buckland ?:-) MS Office System Products MVP
*Courtesy is not expensive and can pay big dividends*
The Office 2003 System parts explained http://microsoft.com/uk/office/editions.asp
George - 24 Dec 2003 07:16 GMT
>-----Original Message-----
>Hi George,
[quoted text clipped - 3 lines]
>as a web page. Word 2002 can produce a web page without loss
>of data, even when the browsers can't always display it.
Bob,
It's great that Word2002 can do more complex pages, but that is of no value when it is a simple document to begin with. The problem is that Word2002 creates a file that is 4 to 5 times larger (i.e. slower loading) than the previous Word97. Word97 might have been limited, but the limitations were not a problem.
I guess I'm stuck using Word97 to convert the documents to html as having files which are 4 or 5 times larger than they need to be is of no value.
George
Bob Buckland ?:-) - 24 Dec 2003 09:04 GMT
Hi George,
If you're using the Filtered HTML save and using the low level browser choice then the files should not be orders of magnitude larger. The Word HTML filter also produced some code that doesn't work well in new browsers and it missed the target it was aiming for, which was to produce a Web version of a MS Word document file.
lostinspace - 24 Dec 2003 11:53 GMT
----- Original Message -----
From: "Bob Buckland ?:-)" <75214.226(At Beautiful Downtown)compuserve.com>
Newsgroups: microsoft.public.word.web.authoring
Sent: Wednesday, December 24, 2003 4:04 AM
Subject: Re: Converting .doc to .html
> Hi George,
> [quoted text clipped - 7 lines]
> ======
> <<"George" <> wrote in message ...
> Bob,
> [quoted text clipped - 10 lines]
> > George >>
Hello Bob,
Best of holidays to you and yours. I may be seen as a bit of a troll here; however, that is neither my intent nor my purpose. Rather, my goal is to remove the masquerade: MOST web page designers believe that FrontPage is creating the bloat and bad code (even though I've recently abandoned FP), when in fact it is created by Word.
Any chance that MS might in the future have some very extensive pop-ups or details informing Word users that the sole purpose of allowing Word to create HTML is to retain the Word settings for both retrieval to Word and for display by Word?
Further cautioning Word/html creators that the creation may not be viewable in other non-MS browsers?
Thanks in advance.
Bob Buckland ?:-) - 24 Dec 2003 23:49 GMT
Hi Don,
Happy holidays to you as well :)
MS doesn't say that you can't produce web pages with Word, so 'only purpose' would be reaching, and they do talk about what you get/don't get in the help files <g>. Part of the 'focus', though, comes from the target market probably being company intranets rather than public web pages, where the size vs ease of use (save and forget it) balance is probably tilted more in favor of speed.
There are also 3rd party 'strippers' for some of what Word adds and the other browser issues often come down to 'is it standards compliant' (sigh) but then folks don't like the 'layout changes' that come with the smaller code. :) Word does include a source code editor that 'usually' honors what you put in, unlike the add-ins from earlier Word versions. In addition, in packages like MS Office Excel, the ability to save an interactive spreadsheet as a web page is a big plus for folks in that realm.
Since Word 2003 also has the ability to save in XML format and to apply XSLT transforms, it should be possible to roll your own HTML code.
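As a rough illustration of the "roll your own HTML" idea, the sketch below turns a tiny hand-written XML document into bare paragraphs. It is not XSLT (the Python standard library has no XSLT processor, so `xml.etree.ElementTree` stands in for the same idea), and the namespace URI and the `w:p` / `w:r` / `w:t` element names are assumptions about the Word 2003 XML format, not taken from this thread. The function name and the sample document are made up for illustration; a real Word file contains far more markup.

```python
import xml.etree.ElementTree as ET
from html import escape

# Assumed namespace for Word 2003 XML documents (not stated in the thread).
W = "http://schemas.microsoft.com/office/word/2003/wordml"

def wordml_to_html(xml_text: str) -> str:
    """Emit one plain <p> per paragraph element, discarding all formatting."""
    root = ET.fromstring(xml_text)
    paragraphs = []
    for p in root.iter(f"{{{W}}}p"):
        # A paragraph's text is spread over runs (w:r) holding text nodes (w:t).
        text = "".join(t.text or "" for t in p.iter(f"{{{W}}}t"))
        paragraphs.append(f"<p>{escape(text)}</p>")
    return "\n".join(paragraphs)

# Hand-written sample, heavily simplified.
sample = f"""<w:wordDocument xmlns:w="{W}">
  <w:body>
    <w:p><w:r><w:t>First paragraph.</w:t></w:r></w:p>
    <w:p><w:r><w:t>Second </w:t></w:r><w:r><w:t>paragraph.</w:t></w:r></w:p>
  </w:body>
</w:wordDocument>"""

html_out = wordml_to_html(sample)
print(html_out)
```

The output carries no styling at all, so any formatting would have to come from the site's own style sheet.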
lostinspace - 25 Dec 2003 12:56 GMT
> Hi Don,
> [quoted text clipped - 24 lines]
> =======
> <<"lostinspace" <> wrote in message ...
> Hello Bob,
> Best of holidays to you and yours.
[quoted text clipped - 12 lines]
> > Thanks in advance. >>
My compliments, Bob :-)
I make a serious attempt to refrain from using my name in NGs. Perhaps I inadvertently inserted it in a thread; otherwise, you've been doing some serious poking around in Google Groups :-) I did, however, provide a URL to one of my pages in an attempt to assist somebody with a tables inquiry. You might look at the What's New pages and see how I've abandoned FP.
<snip>XML format and to apply XSLT transforms<snip>
From what I've seen of this, at least as related to VML, the enhancement only makes bloat and incompatibility matters much worse, at least IMO.
My screen name "lostinspace" is a direct result of being lost between the worlds of FrontPage and HTML purists. I had been using FP for some four years and started with the infamous FPE. I've never used FP in its full capacity, omitting use of the publishing and components options. Those restrictions, however, still were not enough to eliminate the bloat caused by FP. Upon attempting to understand these issues in the webmasters and HTML NGs, ANY mere mention of FP use brings the wrath of trouble. Most of the everyday participants in those NGs, who assist many others with inquiries, cringe and turn into trolls at the mere mention of FP. It took me some time to realize that the very excessive bloat these webmasters and HTML NGs were referring to was actually caused by Word.
You think it was by design that Microsoft intended to allow Word to damage the reputation of Front Page? Perhaps then and not today? Seeing that FP has been recently separated from the Office suite?
In any event, it has taken some hammering and abuse in the other NG's to assist lost FP users while tolerating the mild FP abuse. The atmosphere has finally lightened up a little.
As a result, I'm here to raise Cain about the use of Word to create web pages. Or at least, I'm attempting to clearly decipher what MS's intended market is for Word-created web pages.
If, as you suggest, the intended market is "intranet" RATHER than "internet", then I have another question.
Why on earth was the FP server option included? The least that could have been done was to omit the HTTP and FTP publishing options.
BTW, there are still places on the internet (at least last time I looked) where users may download the old FrontPage Express for free. If used as a basic WYSIWYG HTML editor, it is not a bad tool. The tool, very similar in layout to Word97, would at least show Word web page creators that Word is NOT the proper tool. MS could possibly even limit the capabilities of a possible newer version, perhaps cornering a market in which nobody else excels.
http://www.google.com/search?q="frontpage+express"+download
http://www.accessfp.net/fpexpress.htm# How
Thanks in advance.
Any possibility you'll be adding a line to your sig? "Publishing of web pages with MS Word was only intended for intranet!"
:-)))))))))
abowling1 - 14 Jan 2004 18:44 GMT
I've been playing with doc to HTML conversion and have come across MS's solution to their own problem as described above: Office 2000 HTML Filter 2.0 http://tinyurl.com/2qbr6. This does a good portion of what HTML Tidy (open-source software endorsed by the W3C to clean up web pages) would do to a regular Word exported page. I recommend using either to fix your bloat problems.
Cheers, Alex
abowling1
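To give a feel for the kind of cleanup such tools perform, here is a minimal sketch that strips a few common Word-specific constructs with regular expressions. It is an illustration only, not how the Office HTML Filter or HTML Tidy actually work. The function name and the sample markup are made up, and regex-based cleanup is fragile on real-world pages.

```python
import re

def strip_word_markup(html: str) -> str:
    """Remove a few Word-specific constructs from an HTML string."""
    # Conditional comments such as <!--[if gte mso 9]> ... <![endif]-->
    html = re.sub(r"<!--\[if[^\]]*\]>.*?<!\[endif\]-->", "", html,
                  flags=re.S | re.I)
    # Office-namespaced tags such as <o:p>, <v:shape>, <w:...> (inner text is kept)
    html = re.sub(r"</?[ovw]:[^>]*>", "", html, flags=re.I)
    # class="MsoNormal" and similar Word class names
    html = re.sub(r'\s+class="?Mso\w*"?', "", html, flags=re.I)
    # Inline style attributes; the site's own style sheet is assumed to take over
    html = re.sub(r'\s+style="[^"]*"', "", html, flags=re.I)
    return html

# Made-up sample resembling Word output.
sample = ('<p class="MsoNormal" style="mso-margin-top-alt:auto">Hello<o:p></o:p></p>'
          '<!--[if gte mso 9]><xml><w:WordDocument/></xml><![endif]-->')
cleaned = strip_word_markup(sample)
print(len(sample), "->", len(cleaned), "characters")
print(cleaned)
```

Even on this tiny sample most of the characters are Word-specific markup rather than content.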
lostinspace - 14 Jan 2004 19:44 GMT
> I've been playing with doc to html conversion and have come across MS's
> solution to their own problem as described above: Office 2000 HTML
[quoted text clipped - 8 lines]
> > abowling1
Alex,
Your loose application of "fix" is a bit short in statement. Sorry.
The ONLY way to ELIMINATE the bloat caused by Word in creating web pages (html) is to NOT use Word for the procedure, IN ANY CAPACITY.
If you have faith in any other notion, then you don't understand the presentation of HTML.
With all respect.
Dean - 25 Feb 2004 19:29 GMT
This is a long-standing problem that is a thorn in the side of those of us working with Web sites. Most sites control formatting through their own style sheets. So little of Word's HTML code is necessary. Unfortunately, there isn't an easy way to pick and choose the HTML formatting in Word. We've tried some third-party solutions, but they don't work as well as Word 97 did with native Word 97 files. Looks like Microsoft didn't bother to ask users how they use the HTML coding function.
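When a site's own style sheet controls the formatting, one way to "pick and choose" is to keep only structural tags and drop everything else. The sketch below uses Python's standard `html.parser` with a whitelist. The tag list, class name and function name are arbitrary choices made for illustration, not a recommendation from anyone in this thread.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist of structural tags worth keeping.
KEEP = {"p", "h1", "h2", "h3", "ul", "ol", "li",
        "table", "tr", "td", "th", "b", "i", "a"}

class StructureOnly(HTMLParser):
    """Re-emit whitelisted tags without attributes (except href on links)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in KEEP:
            return  # unknown tags (span, font, o:p, ...) are dropped, text is kept
        href = dict(attrs).get("href") if tag == "a" else None
        if href:
            self.out.append(f'<a href="{escape(href)}">')
        else:
            self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in KEEP:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Note: text inside <style> or <script> would also pass through here.
        self.out.append(escape(data, quote=False))

def keep_structure(html_text: str) -> str:
    parser = StructureOnly()
    parser.feed(html_text)
    parser.close()  # flush any buffered text
    return "".join(parser.out)

# Made-up sample resembling Word output.
sample = ('<p class="MsoNormal"><span style="font-family:Arial">'
          '<b>Title</b></span> text</p>')
result = keep_structure(sample)
print(result)
```

A whitelist like this fails safe: anything Word adds that is not explicitly listed disappears, at the cost of also losing any formatting the author actually wanted.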