Common Python commands for accessing and crawling web pages, explained with examples

Y2J
Release: 2017-04-25 09:22:13

This article introduces common Python commands for accessing and crawling web pages. Readers who need them can use it as a reference.

Common Python commands for accessing and crawling web pages

Simple crawling of web pages:

import urllib.request
url = "http://google.cn/"
response = urllib.request.urlopen(url)  # returns a file-like response object
page = response.read()  # the page body, as bytes
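
In practice the request can fail or hang, so a slightly more defensive version is often useful. The sketch below is one possible shape rather than the article's own code: the helper name fetch and the 10-second default timeout are assumptions chosen for illustration.

```python
import urllib.error
import urllib.request


def fetch(url, timeout=10):
    """Return the body of `url` as bytes, or None if the request fails.

    `fetch` and its default timeout are illustrative choices.
    """
    try:
        # The context manager closes the response when the block exits.
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read()
    except urllib.error.URLError as exc:
        # URLError also covers HTTPError (4xx/5xx), which subclasses it.
        print("request failed:", exc)
        return None
```

Calling fetch("http://google.cn/") would return the page bytes on success and None when the URL cannot be opened.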

Save the URL directly as a local file:

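import urllib.request
url = "http://google.cn/"
# urlretrieve downloads the URL and writes it straight to a local file.
# The filename "google.html" is an illustrative choice, not from the original.
urllib.request.urlretrieve(url, "google.html")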
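
When saving a page to disk, the response can also be streamed into a file in chunks instead of being read into memory first. The following is a minimal sketch under that assumption; save_url is a hypothetical helper name, and the timeout value is arbitrary.

```python
import shutil
import urllib.request


def save_url(url, filename, timeout=10):
    """Stream the body of `url` into `filename` and return the filename.

    `save_url` is a hypothetical helper, not part of urllib.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        with open(filename, "wb") as out_file:
            # copyfileobj copies in chunks, so a large page is not
            # held in memory all at once.
            shutil.copyfileobj(response, out_file)
    return filename
```

The file is opened in binary mode because the response body is bytes; no decoding happens on the way to disk.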

POST method:

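import urllib.parse
import urllib.request
url = "http://liuxin-blog.appspot.com/messageboard/add"
# The value means "test of a web page request sent from the command line"
values = {"content": "命令行发出网页请求测试"}
# In Python 3 the POST body must be bytes, so encode the urlencoded string
data = urllib.parse.urlencode(values).encode("utf-8")

# Create the request object
req = urllib.request.Request(url, data)
# Get the data returned by the server
response = urllib.request.urlopen(req)
# Process the data
page = response.read()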
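
One detail that is easy to miss in Python 3 is that the POST body passed to Request has to be bytes, while urlencode() returns a string. The sketch below builds a request without sending it, so the method and body can be inspected; build_post_request is an illustrative helper name, and the form value is made up.

```python
import urllib.parse
import urllib.request


def build_post_request(url, values):
    """Build (but do not send) a form-encoded POST request.

    `build_post_request` is an illustrative helper name.
    """
    # urlencode returns a str; Request needs the body as bytes.
    body = urllib.parse.urlencode(values).encode("utf-8")
    # Passing `data` makes urllib use POST instead of GET.
    return urllib.request.Request(url, data=body)


req = build_post_request(
    "http://liuxin-blog.appspot.com/messageboard/add",
    {"content": "hello world"},
)
print(req.get_method())  # POST
print(req.data)          # b'content=hello+world'
```

Non-ASCII values such as Chinese text are percent-encoded as UTF-8 by urlencode() by default, so the resulting body is plain ASCII bytes.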

GET method:

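import urllib.parse
import urllib.request
url = "http://www.google.cn/webhp"
values = {"rls": "ig"}
# Encode the parameters and append them to the URL as a query string
data = urllib.parse.urlencode(values)
theurl = url + "?" + data
# Create the request object
req = urllib.request.Request(theurl)
# Get the data returned by the server
response = urllib.request.urlopen(req)
# Process the data
page = response.read()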
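
Building the query string by hand works, but it helps to see exactly what urlencode() produces and how to read it back. The sketch below is an illustration only: build_get_url is a hypothetical helper, and the extra q parameter is invented for the example.

```python
import urllib.parse


def build_get_url(base_url, params):
    """Append `params` to `base_url` as a percent-encoded query string.

    `build_get_url` is a hypothetical helper for illustration.
    """
    query = urllib.parse.urlencode(params)
    # Use "&" if the base URL already carries a query string.
    separator = "&" if "?" in base_url else "?"
    return base_url + separator + query


theurl = build_get_url("http://www.google.cn/webhp", {"rls": "ig", "q": "web crawler"})
print(theurl)  # http://www.google.cn/webhp?rls=ig&q=web+crawler

# parse_qs reverses the encoding; each value comes back as a list.
print(urllib.parse.parse_qs(urllib.parse.urlsplit(theurl).query))
```

Note that urlencode() turns the space into "+", which is the usual form encoding for query strings.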

The response object has two commonly used methods, geturl() and info().

geturl() returns the URL that was actually retrieved, which shows whether the server redirected the request. info() returns the response headers and other metadata about the page.
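
A short sketch of how these two methods might be used together follows. The helper name inspect_response is an assumption, and comparing the two URLs as plain strings is only a rough redirect check, since a server can normalize a URL without a real redirect.

```python
import urllib.request


def inspect_response(url, timeout=10):
    """Return (final_url, was_redirected, headers) for `url`.

    `inspect_response` is an illustrative helper name.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        final_url = response.geturl()  # URL actually retrieved
        headers = response.info()      # header object with dict-like access
    # If the server redirected the request, the final URL differs.
    return final_url, final_url != url, headers
```

The returned headers object supports lookups such as headers["Content-Type"] as well as helper methods like get_content_type() and get_content_charset().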

To handle Chinese text, use encode() to turn a string into bytes and decode() to turn bytes back into a string.
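
As a minimal sketch of that round trip, the example below encodes a Chinese string two ways and decodes the bytes by trying candidate encodings in order. The helper name decode_page and the candidate list ("utf-8", then "gbk") are assumptions; a real page usually declares its charset in the Content-Type header or a meta tag, and that declaration should be preferred when present.

```python
def decode_page(raw, encodings=("utf-8", "gbk")):
    """Decode bytes by trying each candidate encoding in order.

    The helper name and the candidate list are assumptions made
    for this example.
    """
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Last resort: replace undecodable bytes with U+FFFD.
    return raw.decode(encodings[0], errors="replace")


text = "中文"
utf8_bytes = text.encode("utf-8")  # str -> bytes
gbk_bytes = text.encode("gbk")     # same text, different byte sequence
print(utf8_bytes)                  # b'\xe4\xb8\xad\xe6\x96\x87'
print(decode_page(utf8_bytes) == text)
print(decode_page(gbk_bytes) == text)
```

Trying encodings in sequence is a heuristic: some byte sequences are valid in more than one encoding, so the first successful decode is not guaranteed to be the intended one.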

source:php.cn