Home > Backend Development > Python Tutorial > Detailed introduction to the simple crawler function based on Python3.4

Detailed introduction to the simple crawler function based on Python3.4

巴扎黑
Release: 2017-09-16 10:16:36
Original
1583 people have browsed it

This article mainly introduces Python3.4 programming to implement simple crawling and crawler functions, involving Python3.4 web page crawling and regular parsing related operating techniques. Friends in need can refer to the following

The examples of this article are described Python3.4 programming implements simple crawler function. Share it with everyone for your reference, the details are as follows:


import urllib.request
import urllib.parse
import re
import urllib.request,urllib.parse,http.cookiejar
import time
def getHtml(url):
  cj=http.cookiejar.CookieJar()
  opener=urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
  opener.addheaders=[('User-Agent','Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.101 Safari/537.36'),('Cookie','4564564564564564565646540')]
  urllib.request.install_opener(opener)
  page = urllib.request.urlopen(url)
  html = page.read()
  return html
#print ( html)
#html = getHtml("http://weibo.com/")
def getimg(html):
  html = html.decode('utf-8')
  reg='"screen_name":"(.*?)"'
  imgre = re.compile(reg)
  src=re.findall(imgre,html)
  return src
#print ("",getimg(html))
uid=['2808675432','3888405676','2628551531','2808587400']
for a in list(uid):
  print (getimg(getHtml("http://weibo.com/"+a)))
  time.sleep(1)
Copy after login

The above is the detailed content of Detailed introduction to the simple crawler function based on Python3.4. For more information, please follow other related articles on the PHP Chinese website!

Related labels:
source:php.cn
Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn
Popular Tutorials
More>
Latest Downloads
More>
Web Effects
Website Source Code
Website Materials
Front End Template