python - Why doesn't an XPath copied from Chrome work in lxml.etree.HTML?
怪我咯
怪我咯 2017-04-17 17:48:46

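Problem description

I want to use an XPath selector on HTML content.
The steps are:

  1. Right-click the element in Chrome to get its XPath selector

  2. Use that selector in lxml

However:

  1. By rights it should match the element, but an empty list is returned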

Environment

    python 2.7.11+ (default, Apr 17 2016, 14:00:29)
    [GCC 5.3.1 20160413] on linux2

    pip show lxml
    ---
    Metadata-Version: 1.1
    Name: lxml
    Version: 3.5.0
    Summary: Powerful and Pythonic XML processing library combining libxml2/libxslt with the ElementTree API.
    Home-page: http://lxml.de/
    Author: lxml dev team
    Author-email: lxml-dev@lxml.de
    License: UNKNOWN
    Location: /usr/lib/python2.7/dist-packages
    Requires:
    Classifiers:
      Development Status :: 5 - Production/Stable
      Intended Audience :: Developers
      Intended Audience :: Information Technology
      License :: OSI Approved :: BSD License
      Programming Language :: Cython
      Programming Language :: Python :: 2
      Programming Language :: Python :: 2.6
      Programming Language :: Python :: 2.7
      Programming Language :: Python :: 3
      Programming Language :: Python :: 3.2
      Programming Language :: Python :: 3.3
      Programming Language :: Python :: 3.4
      Programming Language :: Python :: 3.5
      Programming Language :: C
      Operating System :: OS Independent
      Topic :: Text Processing :: Markup :: HTML
      Topic :: Text Processing :: Markup :: XML
      Topic :: Software Development :: Libraries :: Python Modules

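Steps to reproduce

  1. Copy the code below and run it

  2. Note the URL in the code. You can try it in Chrome and confirm that this really is the selector it produces; the XPath that Firefox produces is somewhat different

Relevant code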

    from __future__ import absolute_import, unicode_literals

    from lxml.etree import HTML
    import requests


    def get_text(url):
        # Download the page and return its body as text
        return requests.get(url).text


    page = HTML(get_text('http://v2ex.com/?tab=hot'))
    # Nothing is selected here, although by rights something should be
    print(page.xpath('//*[@id="Main"]/p[2]/p[10]/table/tbody/tr/td[3]/span[1]/a'))

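Error messages

Screenshots

What I have already tried without success (with relevant links)

  1. My guess is that lxml follows different rules? (With a CSS selector, though, there is no problem)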
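One way to test a guess like this is to check the copied XPath step by step and see how far it gets before it stops matching. The following is only a minimal sketch: the helper `longest_matching_prefix` and the sample markup are hypothetical, written for illustration, and the markup is not the real v2ex page.

```python
from lxml.etree import HTML

# Hypothetical markup standing in for a downloaded page (not the real v2ex source).
SAMPLE_HTML = """
<div id="Main">
  <table>
    <tr><td>a</td><td>b</td><td><span><a href="/t/1">topic</a></span></td></tr>
  </table>
</div>
"""


def longest_matching_prefix(tree, steps):
    """Return how many leading steps of the path still match at least one node."""
    matched = 0
    for i in range(1, len(steps) + 1):
        prefix = '//' + '/'.join(steps[:i])
        if not tree.xpath(prefix):
            break
        matched = i
    return matched


page = HTML(SAMPLE_HTML)
# The copied XPath, split into its individual steps by hand.
steps = ['*[@id="Main"]', 'table', 'tbody', 'tr', 'td[3]', 'span[1]', 'a']

count = longest_matching_prefix(page, steps)
if count < len(steps):
    print('first step that matches nothing: ' + steps[count])
else:
    print('every step matches')
```

The step it reports is the place to compare against the raw page source (for example via "View page source" rather than the element inspector), since that source is what lxml actually parses.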

Simplified problem


All replies (1)
洪涛

The answer may be here: https://www.zhihu.com/question/41221020

What a trap. Without some extra searching, who knows how much time this would have wasted...

The layout of lxml's official website is just a mess.
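The linked page is not quoted here, so the following is only a sketch of one well-known cause of this symptom, not necessarily what that page says. Browsers insert a `<tbody>` element into every table in their DOM even when the page source has none, so an XPath copied from Chrome can contain a `tbody` step that does not exist in the raw HTML lxml parses. The markup below is hypothetical and stands in for the raw page source.

```python
from lxml.etree import HTML

# Hypothetical raw source: the table has no <tbody>, as is common in hand-written HTML.
RAW_HTML = """
<div id="Main">
  <table>
    <tr><td>1</td><td>2</td><td><span><a href="/t/1">topic</a></span></td></tr>
  </table>
</div>
"""

page = HTML(RAW_HTML)

# Path as a browser would report it, including the tbody it added to its own DOM.
browser_path = '//*[@id="Main"]/table/tbody/tr/td[3]/span[1]/a'
# The same path with the tbody step removed, matching the raw source.
source_path = '//*[@id="Main"]/table/tr/td[3]/span[1]/a'

with_tbody = page.xpath(browser_path)
without_tbody = page.xpath(source_path)

print(len(with_tbody))
print(len(without_tbody))
```

If this is the cause, removing `/tbody` from the copied path (or writing `table//tr`, which matches whether or not a `tbody` is present) should make the original query return the link.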
