python - How do I scrape data from Baidu Index?
怪我咯 2017-04-17 13:48:16

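The Baidu Index query address is http://index.baidu.com
For example, I enter the keyword 世界杯 (World Cup). The query results look like this:

The numbers in the results are not rendered as plain text, so I don't know how to scrape them.

Any pointers from the experts here would be appreciated!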


All replies (7)
迷茫

The numbers are loaded via Ajax. Take a look at the data the Ajax request returns.

迷茫

Press F12 to open the browser's developer tools and watch the network requests; that is all it takes.
Find http://index.baidu.com/Interface/Search/getAllIndex/?res=azsWJCcMfgQgYQUpI2wmSz0GawFcHjoMKyIkMG0eYFYDXUspVARdQi03DiU6elRIMR0sRT8IElZhDBgYI11ZBT4xSlxdehQZNkZ1P0skBQcrDERiInxSBh EwGgMIc10aWUdVIwxREhNfZxs4PjE7Ag9eMG0PZDEQczUlA153HSY5CmNDaDRDaXMIeRhIMi5rN1YQVwoyBCVGBUQXZGJxAhdKJBhVH0pwFTRncXYfD0AUWypJLz4nJUczFw8jRXxdHRMwCx dhAHF7Fx8CKQ%3D%3D&res2=iMdY1W1TGQHmpyG9tZta9KatZf2VFnf1sQab3vylcHnlz95IvL491.2RTSXE73&startdate=2014-05-28&enddate=2014-06-26
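
To make the structure of that request easier to see, here is a minimal sketch that builds and takes apart a URL of the same shape. The endpoint path and the parameter names (`res`, `res2`, `startdate`, `enddate`) come from the URL above; the token values are placeholders, since the real ones appear to be generated per session, and the helper names `build_url` and `split_params` are made up for illustration.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Endpoint path as seen in the request captured with F12.
ENDPOINT = "http://index.baidu.com/Interface/Search/getAllIndex/"


def build_url(res, res2, startdate, enddate):
    """Assemble a getAllIndex URL; res and res2 are opaque tokens (placeholders here)."""
    query = urlencode({
        "res": res,
        "res2": res2,
        "startdate": startdate,
        "enddate": enddate,
    })
    return ENDPOINT + "?" + query


def split_params(url):
    """Return the query parameters of a URL as a plain dict (first value of each)."""
    return {key: values[0] for key, values in parse_qs(urlsplit(url).query).items()}


# Placeholder tokens, NOT real values.
url = build_url("TOKEN_A==", "TOKEN_B", "2014-05-28", "2014-06-26")
params = split_params(url)
```

Splitting a captured URL this way shows which parts are plain inputs (the date range) and which are tokens you would have to obtain from the page itself.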

左手右手慢动作

The general steps are as follows:

  1. First log in on this page to obtain the session cookie;
  2. Then use that cookie to request the index query URL (using "google io" as the keyword here): http://index.baidu.com/?tpl=trend&word=google+io;
  3. From the JavaScript in the returned page, assemble the corresponding AJAX request URL, request it with the same cookie, and the response contains the data you want.
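
Step 2 above can be sketched with the standard library alone. This is only a rough outline, assuming you already have a cookie string from a logged-in session; the function names and the User-Agent value are my own, and step 3 is left out because the way the page's JavaScript builds the AJAX URL is not described here.

```python
import urllib.parse
import urllib.request

BASE = "http://index.baidu.com/"


def trend_url(word):
    """Build the index query URL from step 2 for a keyword."""
    query = urllib.parse.urlencode({"tpl": "trend", "word": word})
    return BASE + "?" + query


def make_request(url, cookie):
    """Prepare (but do not send) a request carrying the session cookie from step 1."""
    headers = {
        "Cookie": cookie,              # cookie string copied from a logged-in session
        "User-Agent": "Mozilla/5.0",   # assumption: a browser-like agent is expected
    }
    return urllib.request.Request(url, headers=headers)


def fetch(url, cookie, timeout=10):
    """Send the request and return the raw response body as bytes."""
    with urllib.request.urlopen(make_request(url, cookie), timeout=timeout) as resp:
        return resp.read()
```

Calling `fetch(trend_url("google io"), "name=value")` with a real cookie string in place of the placeholder would return the page whose JavaScript you then need to inspect for step 3.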

Third-party libraries that may be needed:

  • BeautifulSoup
  • scrapy
洪涛

Baidu Index encrypts its data, which makes it hard to crawl.
I did see a shop on Taobao, though... http://t.cn/RhC1O6J

黄舟

We provide a real-time crawling service for Baidu Index; see our online demo site: http://www.datadriver.info/scrapdata/
We can share a detailed description of the cracking process and algorithm for free, but we do not provide source code. You can also reach us on QQ: 2011193471

大家讲道理

https://item.taobao.com/item.htm?id=42837426371

刘奇

A free write-up: http://www.jianshu.com/p/361c97b4428a
