This article explains, with an example, how to log in to a website from Python. Readers who need it can use it as a reference.
Python login website details and examples
On most forums, posts can only be fetched and analyzed after logging in; otherwise they cannot be viewed.
This is because HTTP is a stateless protocol, so the server needs some way to tell whether the user behind the current request is logged in. There are two common ways:
Explicitly use the Session ID in the URI;
Use cookies. Roughly, when you log in to a website, a cookie is stored locally. As you continue to browse that website, the browser sends the cookie along with each request.
Python provides a rich set of modules, so this kind of network operation can be completed in just a few lines. I take logging in to the QZZN forum as an example. In fact, the following program should work for almost all PHPWind-type forums.
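The cookie round trip described above can be sketched with the `http.cookies` module from the Python 3 standard library. The cookie name `sid` and its value are made up for illustration; a real forum chooses its own names and values.

```python
from http.cookies import SimpleCookie

# Server side: after a successful login the server sends a Set-Cookie header.
# The name "sid" and the value "abc123" are hypothetical.
server_cookie = SimpleCookie()
server_cookie["sid"] = "abc123"
server_cookie["sid"]["path"] = "/"
set_cookie_header = server_cookie["sid"].OutputString()

# Client side: the browser parses that header and keeps the cookie locally.
client_jar = SimpleCookie()
client_jar.load(set_cookie_header)

# On each later request the browser sends the stored name=value pairs back
# in a Cookie header, which is how the server recognizes the session.
cookie_header = "; ".join(
    f"{name}={morsel.value}" for name, morsel in client_jar.items()
)

print("Set-Cookie:", set_cookie_header)
print("Cookie:", cookie_header)
```

Only the name and value travel back to the server; attributes such as `Path` stay on the client and decide when the cookie is sent.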
# -*- coding: GB2312 -*-
from urllib import urlencode
import cookielib, urllib2

# cookie: build an opener that keeps cookies in a jar
cj = cookielib.LWPCookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
urllib2.install_opener(opener)

# Login: passing data as the second argument makes this a POST request
user_data = {'pwuser': 'your_username',
             'pwpwd': 'your_password',
             'step': '2'
            }
url_data = urlencode(user_data)
login_r = opener.open("http://bbs.qzzn.com/login.php", url_data)
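The listing above targets Python 2. On Python 3 the same functionality lives in `urllib.request`, `urllib.parse`, and `http.cookiejar`, so a rough port might look like the sketch below. The helper names `build_opener_with_cookies`, `encode_login_form`, and `login` are invented here, and encoding the form as GB2312 is an assumption based on the coding declaration in the original file.

```python
from http.cookiejar import LWPCookieJar
from urllib.parse import urlencode
from urllib.request import HTTPCookieProcessor, build_opener

LOGIN_URL = "http://bbs.qzzn.com/login.php"  # URL taken from the original listing


def build_opener_with_cookies():
    """Return an opener that stores cookies in a jar, plus the jar itself."""
    cj = LWPCookieJar()
    opener = build_opener(HTTPCookieProcessor(cj))
    return opener, cj


def encode_login_form(username, password):
    """Encode the login fields as bytes (Python 3 requires bytes for POST data)."""
    user_data = {"pwuser": username, "pwpwd": password, "step": "2"}
    # Assumption: the forum expects GB2312-encoded form values.
    return urlencode(user_data, encoding="gb2312").encode("ascii")


def login(opener, username, password):
    """POST the login form; supplying data turns the request into a POST."""
    return opener.open(LOGIN_URL, encode_login_form(username, password))


opener, cj = build_opener_with_cookies()
print(encode_login_form("your_username", "your_password"))
```

Nothing here touches the network until `login(opener, "your_username", "your_password")` is called; after a successful call, `cj` should hold whatever cookies the forum set, and later `opener.open(...)` calls send them back automatically.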
Some comments:
urllib2 is a more advanced module than urllib, and it includes support for cookies.
In urllib2, each client can be abstracted with an opener, and each opener can add multiple handlers to enhance its functionality.
HTTPCookieProcessor is specified as a handler when constructing the opener, so this opener supports cookies.
After install_opener is called, this opener is also used by calls to urlopen.
If you do not need to save cookies, the cj parameter can be omitted.
user_data stores the information required for login. Just pass this information when logging into the forum.
The urlencode function encodes the dictionary user_data into the form "pwuser=username&pwpwd=password" (it does not add a leading "?"). Keeping the fields in a dictionary makes the program easier to read.
The last question is: where do names like pwuser and pwpwd come from? Answering it requires analyzing the page that asks for the login. A login interface is generally a form; an excerpt follows:
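To make a few of the comments above concrete, here is a small sketch using the Python 3 names of the same functions (`urllib.parse.urlencode`, `urllib.request.HTTPCookieProcessor`). The sample values containing a space and an ampersand are made up to show how escaping works.

```python
from http.cookiejar import CookieJar
from urllib.parse import parse_qs, urlencode
import urllib.request

# Hypothetical values chosen to show escaping of a space and an ampersand.
user_data = {"pwuser": "user name", "pwpwd": "p&ss", "step": "2"}
url_data = urlencode(user_data)
print(url_data)  # key=value pairs joined by "&", with no leading "?"

# Decoding the string recovers the original values.
decoded = parse_qs(url_data)

# With no argument, HTTPCookieProcessor creates its own in-memory CookieJar,
# which is enough when the cookies never need to be inspected or saved.
processor = urllib.request.HTTPCookieProcessor()
opener = urllib.request.build_opener(processor)

# After this call, plain urllib.request.urlopen() goes through the opener.
urllib.request.install_opener(opener)
```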
<form action="login.php?" method="post" name="login" onSubmit="this.submit.disabled = true;">
<input type="hidden" value="" name="forward" />
<input type="hidden" value="http://bbs.qzzn.com/index.php" name="jumpurl" />
<input type="hidden" value="2" name="step" />
...
<td width="20%" onclick="document.login.pwuser.focus();"><input type="radio" name="lgt" value="0" checked />Username <input type="radio" name="lgt" value="1" />UID</td>
<td><input class="input" type="text" maxLength="20" name="pwuser" size="40" tabindex="1" /> <a href="reg1ster.php" rel="external nofollow" >Register now</a></td>
<td>Password</td>
<td><input class="input" type="password" maxLength="20" name="pwpwd" size="40" tabindex="2" /> <a href="sendpwd.php" rel="external nofollow" target="_blank">Recover password</a></td>
...
</form>
From this we can see that the username and password we need to enter correspond to pwuser and pwpwd, and that step (with the value 2) marks the request as a login (I found this by trial and error).
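Reading the field names off the form by hand works, but the same inspection can be automated. The sketch below uses Python 3's `html.parser` on a trimmed copy of the excerpt above; the class name `FormFieldCollector` is invented for this example, and a real page may contain several forms that would need to be told apart.

```python
from html.parser import HTMLParser


class FormFieldCollector(HTMLParser):
    """Record the form's action and method and every named <input> field."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.method = None
        self.fields = []  # list of (name, type, value) tuples

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self.action = attrs.get("action")
            self.method = (attrs.get("method") or "get").lower()
        elif tag == "input" and attrs.get("name"):
            self.fields.append(
                (attrs["name"], attrs.get("type", "text"), attrs.get("value", ""))
            )


# Trimmed copy of the login form excerpt shown above.
SAMPLE_FORM = """
<form action="login.php?" method="post" name="login">
<input type="hidden" value="" name="forward" />
<input type="hidden" value="http://bbs.qzzn.com/index.php" name="jumpurl" />
<input type="hidden" value="2" name="step" />
<input class="input" type="text" maxLength="20" name="pwuser" size="40" tabindex="1" />
<input class="input" type="password" maxLength="20" name="pwpwd" size="40" tabindex="2" />
</form>
"""

collector = FormFieldCollector()
collector.feed(SAMPLE_FORM)
print(collector.method, collector.action)
for name, field_type, value in collector.fields:
    print(name, field_type, repr(value))
```

Hidden fields that already carry a value, such as step here, are good candidates for copying into user_data unchanged.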
Note that this forum's form uses the POST method. If it used GET, the approach in this article would need to change: passing the data to open always produces a POST, so for GET the request has to be built first (with the encoded fields in the URL) and then opened. Please see the manual for more details...
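For a form that uses GET, one possible approach in Python 3 is to append the encoded fields to the URL and leave the data argument out. The helper `build_get_request`, the `example.com` URLs, and the parameters are hypothetical and only illustrate the difference between the two methods.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_get_request(base_url, params):
    """Return a Request whose parameters travel in the query string."""
    if base_url.endswith(("?", "&")):
        separator = ""
    elif "?" in base_url:
        separator = "&"
    else:
        separator = "?"
    return Request(base_url + separator + urlencode(params))


# Hypothetical URLs and fields, not taken from the forum in this article.
get_req = build_get_request("http://example.com/search.php",
                            {"keyword": "python", "page": "1"})
post_req = Request("http://example.com/login.php",
                   data=urlencode({"step": "2"}).encode("ascii"))

# A Request without data is a GET; one with data is a POST.
print(get_req.get_method(), get_req.full_url)
print(post_req.get_method(), post_req.full_url)
```

Either request object can then be passed to `opener.open(...)` in place of a plain URL string, so the cookie handling stays the same.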