python - How do I crawl Baidu Index data?

Question

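The Baidu Index query address is http://index.baidu.com
For example, if I enter 世界杯 (World Cup), the result looks like this:

The numbers in the result are not rendered as text, so I don't know how to scrape them.

Any guidance would be much appreciated!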

迷茫 · Answer

The data is loaded via Ajax. Take a look at what the Ajax requests return.

迷茫 · Answer

Press F12 to open the browser's developer tools and watch the network requests; that is all it takes.
Find http://index.baidu.com/Interface/Search/getAllIndex/?res=azsWJCcMfgQgYQUpI2wmSz0GawFcHjoMKyIkMG0eYFYDXUspVARdQi03DiU6elRIMR0sRT8IElZhDBgYI11ZBT4xSlxdehQZNkZ1P0skBQcrDERiInxSBh EwGgMIc10aWUdVIwxREhNfZxs4PjE7Ag9eMG0PZDEQczUlA153HSY5CmNDaDRDaXMIeRhIMi5rN1YQVwoyBCVGBUQXZGJxAhdKJBhVH0pwFTRncXYfD0AUWypJLz4nJUczFw8jRXxdHRMwCx dhAHF7Fx8CKQ%3D%3D&res2=iMdY1W1TGQHmpyG9tZta9KatZf2VFnf1sQab3vylcHnlz95IvL491.2RTSXE73&startdate=2014-05-28&enddate=2014-06-26
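The request above carries four query parameters: res, res2, startdate and enddate. As a rough sketch, such a URL could be assembled and taken apart again with the Python standard library. The token values below are placeholders, not real tokens; how the page produces res and res2 is not explained in this answer, and the endpoint may have changed since.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Endpoint copied from the request seen in the developer tools.
ENDPOINT = "http://index.baidu.com/Interface/Search/getAllIndex/"


def build_index_url(res, res2, startdate, enddate):
    """Assemble the getAllIndex URL from its four query parameters."""
    query = urlencode({
        "res": res,
        "res2": res2,
        "startdate": startdate,
        "enddate": enddate,
    })
    return ENDPOINT + "?" + query


def split_index_url(url):
    """Return the query parameters of a getAllIndex URL as a flat dict."""
    params = parse_qs(urlsplit(url).query)
    return {name: values[0] for name, values in params.items()}


# Placeholder tokens (hypothetical); real ones come from the page itself.
url = build_index_url("TOKEN_A==", "TOKEN_B", "2014-05-28", "2014-06-26")
print(url)
print(split_index_url(url))
```

urlencode percent-encodes the trailing "==" of the first token as %3D%3D, which matches the shape of the URL captured above.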

ringa_lee · Answer

The general steps are as follows:

First, log in on this page to obtain the session cookie;
Then use that cookie to open the index query URL (using "google io" as the keyword here): http://index.baidu.com/?tpl=trend&word=google+io;
Finally, use the JavaScript found in that page to assemble the relevant AJAX request URL and request it with the same cookie; the response contains the data you want.

Third-party libraries that may be needed:

BeautifulSoup
scrapy
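The three steps above could be sketched roughly as follows. This version uses only the Python standard library rather than BeautifulSoup or scrapy, and it leaves the two site-specific parts as caller-supplied hooks: login and build_ajax_url are hypothetical names, since the answer does not spell out how the login works or how the AJAX URL is derived from the page's JavaScript.

```python
import http.cookiejar
import urllib.request
from urllib.parse import urlencode

BASE = "http://index.baidu.com/"


def make_opener():
    """Create an opener that stores cookies and resends them on later requests."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def trend_url(word):
    """Step 2: URL of the index query page for a keyword."""
    return BASE + "?" + urlencode({"tpl": "trend", "word": word})


def fetch(opener, url):
    """Fetch a URL through the cookie-aware opener and return the raw bytes."""
    with opener.open(url, timeout=10) as response:
        return response.read()


def crawl(word, login, build_ajax_url):
    """Run the three steps; `login` and `build_ajax_url` are caller-supplied.

    login(opener) must perform the login so the session cookie lands in the jar.
    build_ajax_url(page_bytes) must derive the AJAX URL from the page's JavaScript.
    Both are hypothetical hooks, not part of any real API.
    """
    opener, _jar = make_opener()
    login(opener)                                # step 1: get the session cookie
    page = fetch(opener, trend_url(word))        # step 2: load the query page
    return fetch(opener, build_ajax_url(page))   # step 3: request the data
```

Keeping a single opener for all three requests is what carries the session cookie from the login through to the final AJAX call.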

PHP中文网 · Answer

Baidu encrypts this data, which makes it hard to crawl.
I came across a store on Taobao for this... http://t.cn/RhC1O6J

黄舟 · Answer

We provide a real-time crawling service for Baidu Index; see our online demo site: http://www.datadriver.info/scrapdata/
We can share the detailed cracking process and a description of the algorithm for free, but we do not provide source code. You can also reach us on QQ: 2011193471

大家讲道理 · Answer

https://item.taobao.com/item.htm?id=42837426371

怪我咯 · Answer

http://www.jianshu.com/p/361c97b4428a (free)