Python crawler returns a list; how do I remove the span tags from it?
我想大声告诉你 2017-05-18 10:55:53

I used Python 3.6 to crawl some data, but what was finally displayed was a list containing span tags. When I called get_text(), contents, etc. on it, an error was raised. Why is this?
The initial results returned are as follows:

[2017.5.2] [2017.4.26] [2017.4.24] [2017.4.19] [2017.3.23] [2017.3.17] [2017.2.14] [2017.2.9] [2017.2.6] [2017.2.6]

My code is as follows:

    import requests
    from bs4 import BeautifulSoup
    import re

    # def url_list():
    #     for number in range(1,21):
    #         url_links=[]
    #         url="X".format(i=number)
    #         url_links.append(url)

    h={"User-Agent":"Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.81 Safari/537.36"}
    r=requests.get("url",headers=h)
    soup=BeautifulSoup(r.text,'lxml')
    for data in soup.find("p",{"class":"list-main-eventset-finan"}).find_all("li"):
        # find_all returns a list of span tags, not a single tag
        content=data.find("i",{"class":"cell date"}).find_all("span")
        print(content)

All replies (3)
仅有的幸福

I don't remember the bs API very clearly, but there should be a function that returns the text directly; I think it is get_text(). Because you used find_all(), which returns a list of tags rather than a single tag, you need to iterate over the returned result and call get_text() on each element. That's it.

    rs = list()
    for data in soup.find("p",{"class":"list-main-eventset-finan"}).find_all("li"):
        contents=data.find("i",{"class":"cell date"}).find_all("span")
        for content in contents:
            rs.append(content.get_text())
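The error in the question most likely comes from calling get_text() on the result of find_all(), which is a list-like ResultSet and not a single tag. Here is a minimal sketch of the difference. The HTML fragment is hand-written to stand in for the real page (it uses a ul container so that it parses the same way in any parser), and extract_dates is a hypothetical helper name, not something from the original code.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the crawled page.
SAMPLE_HTML = """
<ul class="list-main-eventset-finan">
  <li><i class="cell date"><span>2017.5.2</span></i></li>
  <li><i class="cell date"><span>2017.4.26</span></i></li>
</ul>
"""


def extract_dates(html):
    """Return the text of every span inside the date cells."""
    soup = BeautifulSoup(html, "html.parser")
    container = soup.find(class_="list-main-eventset-finan")
    dates = []
    for item in container.find_all("li"):
        cell = item.find("i", {"class": "cell date"})
        # find_all gives a list of tags, so call get_text() per tag.
        for span in cell.find_all("span"):
            dates.append(span.get_text(strip=True))
    return dates


# Calling get_text() on the list itself fails, as in the question.
spans = BeautifulSoup(SAMPLE_HTML, "html.parser").find_all("span")
try:
    spans.get_text()
    list_call_failed = False
except AttributeError:
    list_call_failed = True

print(list_call_failed)
print(extract_dates(SAMPLE_HTML))
```

If each cell is known to hold exactly one span, find("span") returns a single tag and get_text() can be called on it directly, without the inner loop.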

In addition, you can also use a regular expression to match the text directly, with a pattern that captures (.*?) between the opening and closing span tags. You still have to iterate over the contents list as above.
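A rough sketch of the regular-expression approach, assuming each list element is converted to its HTML string with str() first. The exact pattern from the reply did not survive on the page, so SPAN_PATTERN below is an assumption, and strip_span_tags is a hypothetical helper name.

```python
import re

# Assumed pattern: capture whatever sits between <span ...> and </span>.
SPAN_PATTERN = re.compile(r"<span\b[^>]*>(.*?)</span>", re.DOTALL)


def strip_span_tags(fragments):
    """Return the inner text of every span found in the given fragments."""
    texts = []
    for fragment in fragments:
        # str() lets this accept plain strings or BeautifulSoup tags.
        for match in SPAN_PATTERN.findall(str(fragment)):
            texts.append(match.strip())
    return texts


sample = ["<span>2017.5.2</span>", '<span class="d">2017.4.26</span>']
print(strip_span_tags(sample))
```

This only works for simple, non-nested span tags; for anything more complex, get_text() is the safer choice.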

    phpcn_u1582

    The questioner can try the text_content() method
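Note that text_content() is a method of lxml.html elements, not of BeautifulSoup tags, so using it means parsing the page with lxml directly. Here is a minimal sketch under that assumption; the sample markup and the XPath expression are illustrative and not taken from the original site.

```python
from lxml import html

# Hypothetical markup standing in for the crawled page.
SAMPLE_HTML = (
    '<ul class="list-main-eventset-finan">'
    '<li><i class="cell date"><span>2017.5.2</span></i></li>'
    '<li><i class="cell date"><span>2017.4.26</span></i></li>'
    "</ul>"
)

tree = html.fromstring(SAMPLE_HTML)

# text_content() returns the text of an element and all its children,
# so the span tags disappear without any extra handling.
dates = [
    str(cell.text_content()).strip()
    for cell in tree.xpath('.//i[@class="cell date"]')
]
print(dates)
```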

      左手右手慢动作

      Regular expressions, or split plus substring slicing, can also be used; pick whichever fits.
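The split idea can be sketched as follows, assuming each fragment is a string holding one simple tag with no nesting. text_between_tags is a hypothetical helper name.

```python
def text_between_tags(fragment):
    """Return the text between the first '>' and the following '<'."""
    # Drop everything up to and including the first ">" ...
    after_open = fragment.split(">", 1)[1]
    # ... then keep only what precedes the next "<".
    return after_open.split("<", 1)[0].strip()


fragments = ["<span>2017.5.2</span>", "<span>2017.4.26</span>"]
dates = [text_between_tags(f) for f in fragments]
print(dates)
```

This is brittle: a fragment without any ">" raises IndexError, and nested tags give truncated text, so it is only reasonable when the markup is known to be this simple.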
