Web crawler - python+selenium+firefox crawler: a page element can be located, but it does not show up when page_source is printed?
阿神
阿神 2017-04-18 10:02:58

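I am using python+selenium+firefox to scrape the highlighted comments of a specific song on NetEase Cloud Music. I have switched into the iframe and can locate the element, but when I print driver.page_source the output is incomplete. Why?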


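import time

import selenium.common.exceptions
from bs4 import BeautifulSoup
from selenium import webdriver

driver = webdriver.Firefox()
driver.maximize_window()
driver.set_page_load_timeout(10)
try:
    driver.get("http://music.163.com/#/song?id=31877470")
except selenium.common.exceptions.TimeoutException:
    # Stop loading once the 10 s page-load timeout is hit
    print("time out of 10 s")
    driver.execute_script('window.stop()')
print("sleep finished")
# The song page content lives inside the "contentFrame" iframe
driver.switch_to.frame("contentFrame")
time.sleep(5)
# find_element_by_id is the Selenium 3 API; Selenium 4 uses find_element(By.ID, ...)
print(driver.find_element_by_id('comment-box').text)
bsObj = BeautifulSoup(driver.page_source, "html.parser")
print(driver.page_source)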

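At this point the highlighted comments can be printed through the driver.

In the printed page_source (partial screenshot), there is no highlighted-comment content after p id="comment-box"; that part of the source is missing.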


All replies (3)
刘奇
# encoding=utf-8
import time

import selenium.common.exceptions
from bs4 import BeautifulSoup
from selenium import webdriver

# Uses Chrome; download chromedriver from http://chromedriver.storage.googleapis.com/index.htm
driver = webdriver.Chrome()
driver.maximize_window()
driver.set_page_load_timeout(10)
try:
    driver.get("http://music.163.com/#/song?id=31877470")
except selenium.common.exceptions.TimeoutException:
    print("time out of 10 s")
    driver.execute_script('window.stop()')
print(u"sleep finished")
driver.switch_to.frame("contentFrame")
time.sleep(5)
print(driver.find_element_by_id('comment-box').text.encode('GBK', 'ignore'))
bsObj = BeautifulSoup(driver.page_source, "html.parser")
source = driver.page_source.encode('GBK', 'ignore')
# The highlighted comments are visible in 163.txt
with open('163.txt', 'wb') as f:  # binary mode, because source is GBK-encoded bytes
    f.write(source)
# print(driver.page_source.encode('GBK', 'ignore'))
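Both scripts in this thread pause with a fixed time.sleep(5) before reading the frame. As a rough alternative, here is a minimal polling helper that returns as soon as a condition holds. The name wait_until and its parameters are hypothetical and not part of Selenium; Selenium's own WebDriverWait (in selenium.webdriver.support.ui) serves the same purpose.

```python
import time


def wait_until(predicate, timeout=10.0, interval=0.5):
    """Call predicate() repeatedly until it returns a truthy value.

    Raises TimeoutError if the value is still falsy once `timeout`
    seconds have passed. (Hypothetical helper, for illustration only.)
    """
    deadline = time.monotonic() + timeout
    while True:
        result = predicate()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f s" % timeout)
        time.sleep(interval)


# Stand-in condition that becomes true on the third check
attempts = []


def ready():
    attempts.append(1)
    return len(attempts) >= 3


print(wait_until(ready, timeout=2.0, interval=0.01))  # True
```

With a live driver, the condition could be something like lambda: "comment-box" in driver.page_source, assuming the frame has already been switched to.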
    黄舟

    I ran your code and it works. As the screenshot shows, the comments are in the source, nested many levels deep inside the p id="comment-box" element.
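One way to confirm this without scrolling through the printed source is to collect all text nested under the element with id="comment-box", at any depth. The sketch below uses only the standard-library html.parser so that it is self-contained. The class TextUnderId, the function text_under_id, and the sample HTML are all made up for illustration; in practice the input would be driver.page_source after switching into the frame.

```python
from html.parser import HTMLParser


class TextUnderId(HTMLParser):
    """Collect text found at any depth under the element with a given id."""

    # Void elements have no closing tag, so they must not change the depth.
    VOID = {"br", "img", "input", "hr", "meta", "link"}

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0      # 0 means we are outside the target element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.VOID:
            return
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag in self.VOID:
            return
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth and data.strip():
            self.chunks.append(data.strip())


def text_under_id(html, target_id):
    parser = TextUnderId(target_id)
    parser.feed(html)
    parser.close()
    return parser.chunks


# Invented sample: the text sits several levels below the target element
sample = (
    '<div id="comment-box"><div><ul><li><span>great song</span></li></ul>'
    '</div></div><p>footer</p>'
)
print(text_under_id(sample, "comment-box"))  # ['great song']
```

If the returned list is non-empty for the real page source, the comments are present and were simply hard to spot in the console output.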

      刘奇

      A related question: when I use driver.page_source to get the source of a WebView embedded in an app, many tags also come back empty. How should I deal with that?
