Home | Contact Us | FAQ | Search & Site Map | Link to Us
Sign In | Join | Other 45 Sites in Network
Home
DiscussionsAccessExcelInfoPathOutlookPowerPointPublisherWord
DirectoryUser Groups
Related Topics
Outlook ExpressInternet ExplorerWindowsMS Server ProductsMore Topics ...

MS Office Forum / Excel / New Users / June 2005

Tip: Looking for answers? Try searching our database.

Strip or Convert HTML

Thread view: 
Enable EMail Alerts  Start New Thread
Thread rating: 
Djmask - 17 Jun 2005 16:53 GMT
I need to convert or remove html tags in excel fields. I found a
suggestion here to copy the data into notepad and save as .html. This
works great however is not practical as I numerous files. Could this
possible done with VBA? Thanks for any help. I have been searching the
internet for days now trying to figure this out.

scott

Signature

Djmask

Bob Phillips - 17 Jun 2005 17:30 GMT
This might help.

Run it a few times to pick up any strays

Dim cell As Range
Dim ipos As Long
Dim iEnd As Long

   For Each cell In Selection
       ipos = InStr(1, cell.Value, "<")
       If ipos > 0 Then
           For iEnd = ipos + 1 To Len(cell.Value)
               If Mid(cell.Value, iEnd, 1) = ">" Then
                   Exit For
               End If
           Next iEnd
           If ipos > 1 Then
               cell.Value = Left(cell.Value, ipos - 1) & Right(cell.Value,
Len(cell.Value) - iEnd)
           Else
               cell.Value = Right(cell.Value, Len(cell.Value) - iEnd)
           End If
       End If
   Next cell

Signature

HTH

Bob Phillips

> I need to convert or remove html tags in excel fields. I found a
> suggestion here to copy the data into notepad and save as .html. This
[quoted text clipped - 3 lines]
>
> scott
David McRitchie - 19 Jun 2005 12:05 GMT
Hi Scott,
after running Bob's macro you might also want to run
(or incorporate macro into Bob's) the  TRIMALL macro
  http://www.mvps.org/dmcritchie/excel/join.htm#trimall

and also include replacements for  &amp  as   &
and     &nbsp;   as a space

Curious how you ended up with HTML code within Excel.
---
HTH,
David McRitchie, Microsoft MVP - Excel    [site changed  Nov. 2001]
My Excel Pages:  http://www.mvps.org/dmcritchie/excel/excel.htm
Search Page:        http://www.mvps.org/dmcritchie/excel/search.htm

> This might help.
>
[quoted text clipped - 31 lines]
> http://www.excelforum.com/member.php?action=getinfo&userid=24414
> > View this thread: http://www.excelforum.com/showthread.php?threadid=380124
Bob Phillips - 19 Jun 2005 14:12 GMT
It is odd isn't it? I assumed he had imported it as a text file.

Bob

> Hi Scott,
> after running Bob's macro you might also want to run
[quoted text clipped - 48 lines]
> > http://www.excelforum.com/member.php?action=getinfo&userid=24414
> > > View this thread: http://www.excelforum.com/showthread.php?threadid=380124
Scottedge - 30 Jun 2005 14:50 GMT
Thanks for the help guys. That worked unfortunelty it didn't get me th
result I needed. We are uploading an exported product list from anothe
website. The text they gave us has HTML Tags in it however the tex
field we are uploading to on our website doesn't support HTML tags. It
a big pain and the people giving us the file aren't to helpful nor ar
the people who manage our webiste. Thanks for all the help though.
learned some stuff about VBA! Looks like i just have to find replace i
notepad. The biggest problem is that there are bullets in the HTML cod
and no punctuation for those sentences. So when I remove the tags th
text becomes unreadable.

Scot
David McRitchie - 30 Jun 2005 16:27 GMT
Hi Scott,
Perhaps you could post a small sample of  say 5 rows x 25 characters
of what you started with in the way of source HTML.   What you see in
Excel and what you wanted in Excel.

If you were dealing strictly with source HTML and not Excel the following
would fail  if you tried to strip everything between  < >  on a line by line
basis.
    see <a href="#abc" title="def ghi
    jkl mno">  fails since  < and > are not matched on same line, you get bad results.

In the original question wouldn't you save as  text.

If you want to extract some wording from HTML document from a
browser,  You could use PureText  to remove all formatting and just
have text -- same result of pasting to notepad, and then copying from
notepad.

PureText Home Page, strips all formatting out of the clipboard so you can copy from and HTML display and paste plain text.  Invoke
with Window+V shortcut or with [PT] on taskbar next to clock.
  http://stevemiller.net/puretext/

If you want do edit the HTML source  HTML-Kit with Tidy provides a means of
stripping out unwanted HTML  if you get into a bit more than I have. .
---
David McRitchie, Microsoft MVP - Excel    [site changed  Nov. 2001]

> Thanks for the help guys. That worked unfortunelty it didn't get me the
> result I needed. We are uploading an exported product list from another
[quoted text clipped - 8 lines]
>
> Scott
David McRitchie - 19 Jun 2005 12:15 GMT
Hi Scott,
after running Bob's macro you might also want to run
(or incorporate macro into Bob's) the  TRIMALL macro
  http://www.mvps.org/dmcritchie/excel/join.htm#trimall

to see if I missed something but

actually you might want to include this within Bob's
remember VBA is case sensitive
  Selection.Replace What:=Chr(160), Replacement:=Chr(32), _
    LookAt:=xlPart, SearchOrder:=xlByRows, MatchCase:=False   Selection.Replace What:="&nbsp;"), Replacement:=Chr(32), _
    LookAt:=xlPart, SearchOrder:=xlByRows, MatchCase:=False
  Selection.Replace What:="&amp;"), Replacement:="&", _
    LookAt:=xlPart, SearchOrder:=xlByRows, MatchCase:=False
  Selection.Replace What:="<br>", Replacement:=Chr(10), _
    LookAt:=xlPart, SearchOrder:=xlByRows, MatchCase:=False

Curious how you ended up with HTML code within Excel.
---
HTH,
David McRitchie, Microsoft MVP - Excel    [site changed  Nov. 2001]
My Excel Pages:  http://www.mvps.org/dmcritchie/excel/excel.htm
Search Page:        http://www.mvps.org/dmcritchie/excel/search.htm

> This might help.
>
[quoted text clipped - 31 lines]
> http://www.excelforum.com/member.php?action=getinfo&userid=24414
> > View this thread: http://www.excelforum.com/showthread.php?threadid=380124
Scottedge - 30 Jun 2005 17:19 GMT
Thanks for your help david. The Html is all pretty redundant. So
actually just modified the VBA code you and Bob gave to search an
replace certain html tags. I would have originally just done that wit
find and replace but there are so many records and characters that F/
would error out. I've tried lots of different HTML strippers an
converters. The problem is none of them seem to be friendly with exce
so once I convert the text the way I want it it screws up my fields
But I'm all set now. I learned a bunch about VBA today. And I'm trul
grateful for your help.

Scott
 
Sign In
Join
My Latest Posts
My Monitored Threads
My Blog
My Photo Gallery
My Profile
My Homepage

Start New Thread
Enable EMail Alerts
Rate this Thread



©2008 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.