Web scraping - crawler hits a redirect (303 after POST) and I can't get it working with Python requests
ringa_lee 2017-04-17 17:10:53

I've recently been looking into how to scrape flight data from this airline: https://m.tigerair.com/booking/search

In Chrome DevTools I can see the client sending two similar requests in a row:

curl 'https://m.tigerair.com/booking/search' -H 'Cookie: PLAY_SESSION="1d7f16c847d5a596f468c9c0f764a8eabf83f48c-id=1585d391-c8a4-431d-8e24-edaa7bbaef57"' --data 'departureStation=SHE&arrivalStation=MAA&roundtrip=false&departureDate=2016-02-11&returnDate=&adults=1&children=0&infants=0&currency=CNY' --compressed

curl 'https://m.tigerair.com/booking/select' -H 'Cookie: PLAY_SESSION="30fdab1a897ba9ee088cba84ca28835efca28372-id=1585d391-c8a4-431d-8e24-edaa7bbaef57&searchForm=%7B%22currency%22%3A%22CNY%22%2C%22departureStation%22%3A%22SHE%22%2C%22arrivalStation%22%3A%22MAA%22%2C%22departureDate%22%3A1455148800000%2C%22children%22%3A0%2C%22adults%22%3A1%2C%22roundtrip%22%3Afalse%2C%22returnDate%22%3Anull%2C%22infants%22%3A0%2C%22switchMyFligthEnabled%22%3Afalse%7D"' -H 'Accept-Encoding: gzip, deflate, sdch' -H 'Accept-Language: en-US,en;q=0.8' 

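The first request looks like a POST that says hello to the server, and the server replies with a cookie for the client to remember.
The second request is then a GET that actually fetches the data.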

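It has been a real pain, though: I've spent the whole week on it and still can't reproduce this behavior in Python.

I saw people on Google suggesting requests.Session, but that didn't help either.

Could any of you more experienced folks point me in the right direction? I'm developing with Python requests.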


All replies (1)
PHPzhong

OP, could you check whether this snippet runs? I copied it over while trying things on the command line, so it should be fine.
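A minimal sketch of how the two requests could be reproduced with `requests.Session`, offered as an illustration rather than as the snippet referred to above. The form fields and paths are copied from the curl commands in the question. The helper names `build_search_form` and `fetch_results`, the `User-Agent` value, and the fallback to `/booking/select` when no `Location` header is present are assumptions made for this example. The site may also require headers or cookies that are not shown here.

```python
from urllib.parse import urljoin

import requests

BASE_URL = "https://m.tigerair.com"  # host taken from the curl commands in the question


def build_search_form(departure, arrival, date, currency="CNY"):
    """Hypothetical helper: form fields copied from the --data string of the first curl command."""
    return {
        "departureStation": departure,
        "arrivalStation": arrival,
        "roundtrip": "false",
        "departureDate": date,
        "returnDate": "",
        "adults": "1",
        "children": "0",
        "infants": "0",
        "currency": currency,
    }


def fetch_results(form, base_url=BASE_URL, timeout=30):
    """Hypothetical helper: POST the search form, then GET the results page with the same cookies."""
    with requests.Session() as session:
        # Assumption: a browser-like User-Agent; the server may or may not care.
        session.headers["User-Agent"] = "Mozilla/5.0"

        # Stop at the redirect so the status code and Location header can be inspected.
        first = session.post(
            urljoin(base_url, "/booking/search"),
            data=form,
            allow_redirects=False,
            timeout=timeout,
        )

        # Any Set-Cookie from the first response (e.g. PLAY_SESSION) is now in the
        # session's cookie jar and is sent automatically with the next request.
        target = urljoin(first.url, first.headers.get("Location", "/booking/select"))
        second = session.get(target, timeout=timeout)
        second.raise_for_status()
        return first.status_code, second.text
```

Calling `fetch_results(build_search_form("SHE", "MAA", "2016-02-11"))` should return the status of the POST (expected to be 303 if the server behaves as described) together with the HTML of the results page. By default `session.post(...)` follows a 303 with a GET on its own; `allow_redirects=False` is used here only to make the two steps visible while debugging.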
