python - Is my Scrapy.FormRequest written correctly? Please take a look - PHP中文网 Q&A

大家讲道理 2017-04-17 17:42:35

I am collecting the content of all topics from m.zhihu.com/topics. Clicking "More" on that page sends a request to 'https://m.zhihu.com/node/TopicsPlazzaListV2', and the FormData sent with it is:

    import json

    import scrapy

    def get_topic_url(self, response):
        # `headers` and `proxy` are assumed to be defined elsewhere in the spider module.
        # Links to the topics listed on the current page.
        topics = response.css('.item .blk > a[target=_blank]::attr(href)').extract()
        # CSRF token that has to be sent back with the POST.
        _xsrf = response.css('input[name="_xsrf"]::attr(value)').extract()[0]
        for topic in topics:
            print(topic)
        # The paging parameters are embedded as JSON in the data-init attribute.
        data = response.css('.zh-general-list::attr(data-init)').extract()
        param = json.loads(data[0])
        topic_id = param['params']['topic_id']
        hash_id = param['params']['hash_id']
        offset = param['params']['offset']
        # Request the next batch of topics; the same callback handles the response.
        yield scrapy.FormRequest(
            url="https://m.zhihu.com/node/TopicsPlazzaListV2",
            headers=headers,
            formdata={
                "method": "next",
                "params": {
                    "topic_id": topic_id,
                    "offset": offset,
                    "hash_id": hash_id,
                },
                "_xsrf": _xsrf,
            },
            meta={
                "proxy": proxy,
                "cookiejar": response.meta["cookiejar"],
            },
            callback=self.get_topic_url,
        )

But the response is a 400 status code. Is something wrong in my code? Please advise.

2016-05-08 10:43:52 [scrapy] DEBUG: Retrying <POST https://m.zhihu.com/node/TopicsPlazzaListV2> (failed 1 times): 400 Bad Request
2016-05-08 10:43:53 [scrapy] DEBUG: Retrying <POST https://m.zhihu.com/node/TopicsPlazzaListV2> (failed 2 times): 400 Bad Request
2016-05-08 10:43:53 [scrapy] DEBUG: Gave up retrying <POST https://m.zhihu.com/node/TopicsPlazzaListV2> (failed 3 times): 400 Bad Request
2016-05-08 10:43:53 [scrapy] DEBUG: Crawled (400) <POST https://m.zhihu.com/node/TopicsPlazzaListV2> (referer: https://m.zhihu.com/topics)
2016-05-08 10:43:53 [scrapy] DEBUG: Ignoring response <400 https://m.zhihu.com/node/TopicsPlazzaListV2>: HTTP status code is not handled or not allowed
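One thing worth checking alongside the 400: `FormRequest` URL-encodes `formdata` as flat key/value pairs, so a nested dict under `"params"` is not sent as a structured object. A minimal sketch of one way to flatten it, assuming the endpoint expects `params` as a JSON string (that expectation is an assumption and is not confirmed in this thread). The helper name `build_form_body` and the sample values are hypothetical, and the standard library's `urlencode` is used only to show what the encoded body looks like.

```python
import json
from urllib.parse import urlencode


def build_form_body(topic_id, offset, hash_id, xsrf):
    """Return a urlencoded POST body with `params` flattened to a JSON string."""
    # Hypothetical helper: serialize the nested parameters so that every
    # form value is a plain string before it is URL-encoded.
    params = {"topic_id": topic_id, "offset": offset, "hash_id": hash_id}
    fields = {
        "method": "next",
        "params": json.dumps(params),
        "_xsrf": xsrf,
    }
    return urlencode(fields)


# Sample values are placeholders, not real Zhihu identifiers.
body = build_form_body(topic_id=1, offset=20, hash_id="abc", xsrf="token")
print(body)
```

In the spider, the same idea would mean passing `"params": json.dumps({...})` inside `formdata`, so that each value handed to `FormRequest` is a string.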

All replies (1)

阿神 2017-04-17 17:44:35 #1

Try setting the headers to those of a mobile browser.
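A rough sketch of what that suggestion could look like. The helper name `mobile_headers` is hypothetical, the User-Agent string is just one example of an iPhone Safari identifier, and the `X-Requested-With` header is an assumption about what an AJAX endpoint might check. The referer is the one shown in the log above.

```python
def mobile_headers(referer):
    """Build request headers that mimic a mobile browser (illustrative values)."""
    return {
        # Example mobile Safari User-Agent; any current mobile UA could be used.
        "User-Agent": (
            "Mozilla/5.0 (iPhone; CPU iPhone OS 9_1 like Mac OS X) "
            "AppleWebKit/601.1.46 (KHTML, like Gecko) "
            "Version/9.0 Mobile/13B143 Safari/601.1"
        ),
        "Referer": referer,
        # Assumption: the endpoint is normally called via XMLHttpRequest.
        "X-Requested-With": "XMLHttpRequest",
    }


headers = mobile_headers("https://m.zhihu.com/topics")
print(headers["User-Agent"])
```

The resulting dict can be passed as the `headers=` argument of `scrapy.FormRequest`. Whether it resolves the 400 depends on what the server actually validates.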
