I have an HTML file that is processed by a bash script, and I want to delete the empty tables from it. The file is generated from SQL statements, but each table's caption and header row are written even when no record is found. I want to delete the tables for which no record was found.
<table border="1">
  <caption>Table with data</caption>
  <tr>
    <th align="center">type</th>
    <th align="center">column1</th>
    <th align="center">column2</th>
    <th align="center">column3</th>
    <th align="center">column4</th>
  </tr>
  Data rows exists here
</table>
<table border="1">
  <caption>Empty Table To Remove</caption>
  <tr>
    <th align="center">type</th>
    <th align="center">column1</th>
    <th align="center">column2</th>
    <th align="center">column3</th>
    <th align="center">column4</th>
    <th align="center">column5</th>
    <th align="center">column6</th>
    <th align="center">column7</th>
  </tr>
</table>
<table border="1">
  <caption>Table with data</caption>
  <tr>
    <th align="center">type</th>
    <th align="center">column1</th>
    <th align="center">column2</th>
    <th align="center">column3</th>
    <th align="center">column4</th>
  </tr>
  Data rows exists here
</table>
I tried using a combination of grep and sed to delete the empty table. This works when all the tables contain the same number of columns, but I'm having problems now because my tables have different numbers of columns.
When the tables have the same number of columns, I can loop over the headers, count them, and then delete a fixed number of lines. Since the number of columns now varies from table to table, this no longer works.
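Like this, using xmlstarlet and an XPath expression that selects the tables to delete. To edit the file in place, as sed -i does, use xmlstarlet's in-place editing option. No explanation here, but do not use sed or regex to parse HTML/XML.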
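For readers without xmlstarlet, the same parser-based idea can be sketched in Python's standard library. This swaps in a different tool from the one named above and is only an illustration under stated assumptions: the report is well-formed enough to parse as XML (quoted attributes, closed tags, no HTML-only entities such as `&nbsp;`), data rows use `<td>` cells, and a table counts as empty when it contains no `<td>` at all. The function name `drop_empty_tables` is hypothetical.

```python
import xml.etree.ElementTree as ET


def drop_empty_tables(fragment: str) -> str:
    """Return the fragment with every <table> that has no <td> cell removed.

    Assumes the input parses as XML once wrapped in a single root element.
    """
    # The report holds several top-level tables, so wrap it in a temporary
    # root element to give the XML parser a single document root.
    root = ET.fromstring(f"<root>{fragment}</root>")

    # Snapshot the elements first so removals do not disturb the iteration.
    for parent in list(root.iter()):
        for table in list(parent.findall("table")):
            # A header-only table has <th> cells but no <td> cells.
            if table.find(".//td") is None:
                # Note: remove() also drops any text that directly
                # follows the table (its "tail").
                parent.remove(table)

    # Serialize the children only, leaving out the temporary wrapper.
    return "".join(
        ET.tostring(child, encoding="unicode", method="html") for child in root
    )
```

A bash script could call this on the file's contents and write the result to a new file. Because the decision is made on the parsed tree rather than on line counts, the number of header columns in each table does not matter.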