Monday, December 03, 2007

Cleaning up HTML with grep

I came across an email I sent to some co-workers a few months ago, and I thought it would be useful on the blog. I really like grep, but I haven't had much opportunity to use it in the last year. These are grep searches in Dreamweaver; the syntax can vary a little from software to software (e.g., Vi and Visual Studio each have somewhat different functionality).


I just had to go through a file that had tags with a varying width attribute I needed to whack, e.g.:
width="1"
width="23"
width="456"

With grep it was a quick search using the following pattern:
width="[^"]*"
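
For anyone without Dreamweaver handy, here is a rough sketch of the same search-and-delete using Python's re module. The sample markup and the names WIDTH_ATTR and WIDTH_ATTR_WITH_SPACE are made up for illustration, and the second pattern (which also eats the whitespace in front of the attribute) is my own variation, not part of the original email.

```python
import re

# The pattern from the post: width=" then any run of non-quote characters, then a closing quote.
WIDTH_ATTR = re.compile(r'width="[^"]*"')

# Assumed variation: also consume the whitespace before the attribute,
# so no stray space is left behind inside the tag.
WIDTH_ATTR_WITH_SPACE = re.compile(r'\s+width="[^"]*"')

# Hypothetical sample markup.
html = '<td width="1">a</td><td width="23">b</td><td width="456">c</td>'

print(WIDTH_ATTR.sub('', html))             # attribute removed, leading space remains
print(WIDTH_ATTR_WITH_SPACE.sub('', html))  # attribute and leading space removed
```

The [^"]* part is what makes this safe: it stops at the first closing quote, so the match never runs on into the next attribute.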


Then I had to whack a bunch of opening and closing paragraph tags:
<[/]*p>
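
A quick way to check what this pattern does and does not catch is to try it in Python's re module. This is only a sketch with made-up sample strings; P_TAG is a name I picked for the example.

```python
import re

# The pattern from the post: "<", zero or more slashes, "p", then ">".
P_TAG = re.compile(r'<[/]*p>')

# Hypothetical sample markup.
html = '<p>First paragraph.</p><p>Second paragraph.</p>'

# Both the opening and closing tags are removed; the text stays.
print(P_TAG.sub('', html))

# Because ">" must follow "p" immediately, <pre> is left alone,
# and so is an opening <p> that carries attributes.
print(P_TAG.sub('', '<pre>code</pre>'))
print(P_TAG.sub('', '<p class="intro">Hi</p>'))
```

Note that the match is case-sensitive here, so an uppercase <P> would need re.IGNORECASE (or the equivalent option in your editor).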


Lastly, there were a slew of span tags, e.g.:
<span style='font-size:10.0pt;font-family:Arial'>John <span class=SpellE>Nyquist</span></span>

To whack those:
<[/]*span[^>]*>
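
Here is the same span cleanup sketched with Python's re module, run against the sample line above. SPAN_TAG is just a name chosen for the example; the pattern itself is the one from the post.

```python
import re

# The pattern from the post: "<", optional slash(es), "span",
# then anything that is not ">" (the attributes), then ">".
SPAN_TAG = re.compile(r'<[/]*span[^>]*>')

html = ("<span style='font-size:10.0pt;font-family:Arial'>John "
        "<span class=SpellE>Nyquist</span></span>")

# Opening tags (with their attributes) and closing tags are removed,
# nested or not; only the text content is left.
cleaned = SPAN_TAG.sub('', html)
print(cleaned)  # prints "John Nyquist"
```

The [^>]* piece does the same job as [^"]* did in the width search: it swallows whatever attributes are present but stops at the end of the tag, so the text between the tags is never touched.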