
A Detailed Guide to Parsing HTML with Python's BeautifulSoup

高洛峰

Published: 2017-03-31 11:36:53


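1) Searching for tags:

find(tagname)        # search for the tag named tagname, e.g. find('head')
find(list)           # search for tags whose name is in the list, e.g. find(['head', 'body'])
find(dict)           # search for tags whose name is a key of the dict, e.g. find({'head': True, 'body': True}); this form dates from BeautifulSoup 3, so with bs4 prefer the list form
find(re.compile('')) # search for tags whose name matches the regular expression, e.g. find(re.compile('^p')) matches tags whose name starts with p
find(lambda)         # search for tags for which the function returns True, e.g. find(lambda tag: len(tag.name) == 1) matches tags with a one-character name
find(True)           # matches any tag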
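
The filter types listed above can be tried side by side. The following is a minimal sketch, assuming bs4 is installed and using the standard-library "html.parser"; the sample markup and the variable names are made up for illustration. It uses find_all rather than find so that every match is visible (find returns only the first one).

```python
import re
from bs4 import BeautifulSoup

# Hypothetical sample markup, chosen only to exercise each filter type.
html = ("<html><head><title>Demo</title></head>"
        "<body><p>one <b>bold</b></p><pre>code</pre></body></html>")
soup = BeautifulSoup(html, "html.parser")

# String filter: exact tag name.
by_string = soup.find("head").name

# List filter: any tag whose name appears in the list.
by_list = [t.name for t in soup.find_all(["title", "b"])]

# Regular-expression filter: tag names that start with "p".
by_regex = [t.name for t in soup.find_all(re.compile("^p"))]

# Function filter: the function receives each Tag and returns True or False.
by_function = [t.name for t in soup.find_all(lambda tag: len(tag.name) == 1)]

# True matches every tag, in document order.
every_tag = [t.name for t in soup.find_all(True)]

print(by_string, by_list, by_regex, by_function, every_tag)
```

Note that the function filter is called with the Tag object, not with the tag name, which is why the example tests len(tag.name).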

2) Searching text (text):
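
Text search accepts the same kinds of filters as tag search, but matches the strings inside the document instead of tag names. The following is a minimal sketch, assuming a bs4 release that accepts the string= argument (older code, including Python 2 era examples, spells it text=); the sample markup and variable names are made up for illustration.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical sample markup for illustration.
html = '<p id="a">This is paragraph <b>one</b>.</p><p id="b">Short</p>'
soup = BeautifulSoup(html, "html.parser")

# Exact match on a whole string.
exact = soup.find_all(string="one")

# Regular expression: strings that contain "paragraph".
matching = soup.find_all(string=re.compile("paragraph"))

# True matches every string in the document.
all_strings = soup.find_all(string=True)

# Function filter: strings shorter than 4 characters.
short = soup.find_all(string=lambda s: len(s) < 4)

# Each result is a NavigableString, so the enclosing tag is reachable via .parent.
parent_name = matching[0].parent.name

print(exact, matching, all_strings, short, parent_name)
```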

3) recursive, limit:

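from bs4 import BeautifulSoup
import re

# Sample document; note that the <p> and <body> tags are never closed.
doc = ['<html><head><title>Page title</title></head>',
       '<body><p id="firstpara" align="center">This is paragraph <b>one</b>.',
       '<p id="secondpara" align="blah">This is paragraph <b>two</b>.',
       '</html>']
# Name the parser explicitly so the result does not depend on what is installed.
soup = BeautifulSoup(''.join(doc), 'html.parser')

print(soup.prettify() + "\n")
print(soup.find_all('b'))                              # every <b> tag

# 2) Text search (string= is the newer bs4 name for the text= argument)
print(soup.find_all(string=re.compile("paragraph")))   # strings containing "paragraph"
print(soup.find_all(string=True))                      # every string
print(soup.find_all(string=lambda x: len(x) < 12))     # strings shorter than 12 characters

# 1) Tag search with a regular expression: tag names starting with "b"
a = soup.find_all(re.compile('^b'))
print([tag.name for tag in a])

# 3) recursive and limit
print([tag.name for tag in soup.html.find_all()])                 # all descendants of <html>
print([tag.name for tag in soup.html.find_all(recursive=False)])  # direct children only

print(soup.find_all('p', limit=1))                     # stop after the first match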
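
The two remaining arguments, recursive and limit, are easier to see on a fully closed document. The following is a minimal sketch, assuming bs4 with the standard-library "html.parser"; the markup and variable names are made up for illustration. recursive=False restricts the search to direct children, and limit=n stops after n matches.

```python
from bs4 import BeautifulSoup

# Hypothetical, well-formed sample markup for illustration.
html = ("<html><head><title>T</title></head>"
        "<body><p>a</p><p>b</p><p>c</p></body></html>")
soup = BeautifulSoup(html, "html.parser")

# Default (recursive=True): every descendant of <html>.
descendants = [t.name for t in soup.html.find_all()]

# recursive=False: only the direct children of <html>.
children = [t.name for t in soup.html.find_all(recursive=False)]

# limit=2: stop searching after two matches.
first_two = [p.get_text() for p in soup.find_all("p", limit=2)]

# find(...) gives the same element as the single item of find_all(..., limit=1).
same_first = soup.find("p") == soup.find_all("p", limit=1)[0]

print(descendants, children, first_two, same_first)
```

On large documents, limit can save work because the search stops as soon as enough matches have been collected.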

Source: php.cn
