
Parsing HTML with Python's HTMLParser Module: A URL Extraction Example

WBOY

Released: 2016-06-10 15:15:47


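HTMLParser is Python's module for parsing HTML (in Python 3 it is named html.parser). It can pick out the tags, data, and other pieces of an HTML document, which makes it a convenient way to process HTML. HTMLParser is event-driven: when it encounters a particular piece of markup, it calls a user-defined method to notify the program. The main user callbacks all have names beginning with handle_ and are methods of the HTMLParser class. To use the module, derive a new class from HTMLParser and override whichever handle_ methods you need. These methods include: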

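handle_startendtag handles empty tags that open and close at once, such as <br/>
handle_starttag handles start tags, such as <a>
handle_endtag handles end tags, such as </a>
handle_charref handles numeric character references, which start with &# and usually represent a character by its code point
handle_entityref handles named entity references, which start with &, such as &nbsp;
handle_data handles text data, such as the text between <p> and </p>
handle_comment handles comments
handle_decl handles declarations that start with <!, such as <!DOCTYPE html>
handle_pi handles processing instructions of the form <?instruction>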
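
To see when each callback fires, the sketch below overrides all of them and records the events in order. It is written for Python 3 (html.parser), and the names EventLogger and log_events are hypothetical, chosen only for this illustration. It assumes convert_charrefs=False, because with the Python 3 default character and entity references are converted and passed to handle_data instead of handle_charref and handle_entityref.

```python
from html.parser import HTMLParser


class EventLogger(HTMLParser):
    """Record every parser callback as an (event, payload) tuple."""

    def __init__(self):
        # Assumption: turn off reference conversion so that
        # handle_charref and handle_entityref are actually called.
        super().__init__(convert_charrefs=False)
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("starttag", tag))

    def handle_endtag(self, tag):
        self.events.append(("endtag", tag))

    def handle_startendtag(self, tag, attrs):
        self.events.append(("startendtag", tag))

    def handle_data(self, data):
        self.events.append(("data", data))

    def handle_charref(self, name):
        self.events.append(("charref", name))

    def handle_entityref(self, name):
        self.events.append(("entityref", name))

    def handle_comment(self, data):
        self.events.append(("comment", data))

    def handle_decl(self, decl):
        self.events.append(("decl", decl))

    def handle_pi(self, data):
        self.events.append(("pi", data))


def log_events(html):
    """Parse an HTML string and return the recorded events in order."""
    parser = EventLogger()
    parser.feed(html)
    parser.close()
    return parser.events


if __name__ == "__main__":
    for event in log_events("<!DOCTYPE html><p>a&amp;b&#62;<br/><!-- c --></p>"):
        print(event)
```

Running the script prints one tuple per callback, which makes it easy to check which handler a given fragment of markup reaches.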

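As an example, here is how to extract URLs from a web page. To get the URLs, we need to look at each <a> tag and read the value of its href attribute. The code follows: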

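#-*- encoding: gb2312 -*-
# Written for Python 2; in Python 3 the module is html.parser and print is a function.
import HTMLParser


class MyParser(HTMLParser.HTMLParser):
    def __init__(self):
        HTMLParser.HTMLParser.__init__(self)

    def handle_starttag(self, tag, attrs):
        # Override the start-tag handler
        if tag == 'a':
            # Look through the tag's attributes
            for name, value in attrs:
                if name == 'href':
                    print value


if __name__ == '__main__':
    # Sample HTML; the markup and the href value are assumed for illustration
    a = '<html><head><title>test</title></head><body><a href="http://www.163.com">Link to 163</a></body></html>'
    my = MyParser()
    # Feed in the HTML data to be parsed
    my.feed(a)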
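
The same idea can be sketched for Python 3, where the module is html.parser. This variant is not the author's code: it collects the URLs in a list instead of printing them, and it optionally resolves relative links with urllib.parse.urljoin. The names LinkCollector, extract_links, and base_url are hypothetical and exist only for this illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self, base_url=""):
        super().__init__()
        self.base_url = base_url  # hypothetical: used to resolve relative links
        self.links = []

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag and attribute names before calling us.
        if tag != "a":
            return
        for name, value in attrs:
            # value is None when the attribute has no value, e.g. <a href>
            if name == "href" and value:
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url=""):
    """Return the list of link URLs found in an HTML string."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


if __name__ == "__main__":
    page = '<p><a href="http://example.com/a">A</a> <a href="b.html">B</a></p>'
    print(extract_links(page, "http://example.com/dir/"))
```

Returning a list rather than printing keeps the parser reusable, for example when the links need to be filtered or fetched afterwards.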

