How to Remove HTML Tags from Strings in Python?

Linda Hamilton
Release: 2024-12-04 01:00:11

Removing HTML Formatting from Strings in Python

Consider the task of extracting the text content of an HTML document while discarding its markup. For instance, an element whose content is some text should yield only "some text," and an element whose content is hello should yield "hello."

Solution

Python's standard library provides a mechanism for this: subclass HTMLParser, collect the text passed to handle_data, and leave every other handler at its do-nothing default.

For Python 3:

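from io import StringIO
from html.parser import HTMLParser

class MLStripper(HTMLParser):
    def __init__(self):
        # convert_charrefs=True decodes references such as &amp; in text
        # before handle_data sees them; __init__ already calls reset()
        super().__init__(convert_charrefs=True)
        self.text = StringIO()

    def handle_data(self, d):
        # Called for the text between tags; the tags themselves are ignored
        self.text.write(d)

    def get_data(self):
        return self.text.getvalue()

def strip_tags(html):
    s = MLStripper()
    s.feed(html)
    s.close()  # flush any text the parser is still buffering
    return s.get_data()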
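
As a quick check of the Python 3 approach, the sketch below repeats the definitions so that it runs on its own and then strips a few sample strings. The sample inputs, including the example.com URL, are illustrative assumptions and not taken from the original article.

```python
from io import StringIO
from html.parser import HTMLParser


class MLStripper(HTMLParser):
    """Collects only the text between tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.text = StringIO()

    def handle_data(self, d):
        self.text.write(d)

    def get_data(self):
        return self.text.getvalue()


def strip_tags(html):
    s = MLStripper()
    s.feed(html)
    s.close()  # flush any text the parser is still buffering
    return s.get_data()


# Illustrative inputs (assumed for this demo)
print(strip_tags('<a href="https://example.com">some text</a>'))  # some text
print(strip_tags('<b>hello</b>'))                                 # hello
print(strip_tags('<p>Fish &amp; chips</p>'))                      # Fish & chips
```

A new MLStripper is created on every call, so text from one document never leaks into the next.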

For Python 2:

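from HTMLParser import HTMLParser
from StringIO import StringIO

class MLStripper(HTMLParser):
    def __init__(self):
        # HTMLParser is an old-style class in Python 2, so super() is not
        # available; reset() initializes the parser state instead
        self.reset()
        self.text = StringIO()

    def handle_data(self, d):
        # Called for the text between tags; the tags themselves are ignored
        self.text.write(d)

    def get_data(self):
        return self.text.getvalue()

def strip_tags(html):
    s = MLStripper()
    s.feed(html)
    return s.get_data()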
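
One caveat with this approach is that handle_data also receives the contents of script and style elements, so inline JavaScript and CSS end up in the output. The Python 3 sketch below is one possible variant that skips them. The class name VisibleTextStripper, the function strip_visible_text, and the sample input are hypothetical names chosen for illustration and are not part of the original solution.

```python
from io import StringIO
from html.parser import HTMLParser


class VisibleTextStripper(HTMLParser):
    """Hypothetical variant: collects text but skips <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.text = StringIO()
        self._skip_depth = 0  # > 0 while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when not inside a skipped element
        if self._skip_depth == 0:
            self.text.write(data)


def strip_visible_text(html):
    s = VisibleTextStripper()
    s.feed(html)
    s.close()  # flush any text the parser is still buffering
    return s.text.getvalue()


sample = '<p>Hi</p><script>var x = 1;</script><b> there</b>'
print(strip_visible_text(sample))  # Hi there
```

The parser lowercases tag names before calling the handlers, so comparing against the lowercase names in SKIPPED also covers markup written as SCRIPT or Style.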


source:php.cn