python - How to crawl data from redirected websites
怪我咯 2017-05-19 10:07:30

I am learning about web crawlers and am using selenium to crawl some complex websites.
I have run into a problem. The work order website I need to crawl (I don't know its password) is reached through an authentication system: I log in to the authentication system first, then click the work order system link on its page, and the browser jumps to the work order system website automatically without a second login. How can I crawl the data of this system with a crawler?
Below is the HTML for the work order system link, as obtained by selenium from the authentication system page (工单系统 means "work order system"):

<a href="/link-test001" target="_blank" title="工单系统" rel="link-test001" data="1" datasrc="工单系统|||/files/link/test001.gif|||new|||/link-test001">
    <img src="/files/link/test001.gif" width="25" height="25" alt="工单系统" align="absmiddle"><span>工单系统</span>
</a>

All replies (2)
漂亮男人

Use Selenium IDE, a Firefox extension, to record the operation.
Then export the recording to a Python file.
Adjust the exported script as needed and run it.

I also suggest reading the book by 虫师 ("Insect Master").

曾经蜡笔没有小新

For example, if you write the crawler with the requests library, create a session first. In the code below, A logs in and B requests the page to jump to.

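import requests

T = requests.Session()                 # the session stores cookies across requests
A = T.post(url=login_url, data=data)   # A: log in (login_url and data are placeholders)
B = T.get(url=target_url)              # B: fetch the page to jump to, reusing the login cookies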

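The session T stores the cookies set during login and sends them with every later request made through T. They last as long as T exists and the server still accepts them, not forever.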
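
The same idea can be wrapped in a small function that also shows where the redirect ends up. This is only a sketch: `fetch_after_login` is a hypothetical helper, and the login URL, form field names, and target URL all have to be read from the real authentication page. It relies on two documented behaviours of requests: a `Session` keeps cookies between calls, and `get` follows redirects by default, recording the intermediate responses in `response.history`.

```python
import requests


def fetch_after_login(session, login_url, credentials, target_url):
    """Log in with `session`, then fetch `target_url` using the same cookies.

    Returns (final_url, redirect_hops, html). Hypothetical helper for
    illustration; all URLs and form fields are supplied by the caller.
    """
    # A: the login POST; cookies set by the reply are kept in session.cookies.
    login = session.post(login_url, data=credentials)
    login.raise_for_status()

    # B: the link from the authentication page. Redirects are followed by
    # default, so `page` is the final response after the jump.
    page = session.get(target_url)
    page.raise_for_status()

    # page.history holds the intermediate redirect responses, oldest first.
    hops = [r.url for r in page.history]
    return page.url, hops, page.text
```

It would be called with something like `fetch_after_login(requests.Session(), login_url, {"username": ..., "password": ...}, target_url)`, where `target_url` is the authentication site's address joined with `/link-test001`. If the jump is performed by JavaScript rather than an HTTP redirect, requests will not follow it and the selenium approach is the safer choice.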
