How to randomly assign a user-agent to each request when scraping data with Scrapy in Python

WBOY

Release: 2016-06-06 11:23:57


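This article shows, by example, how to assign a random user-agent to each request when scraping data with Scrapy in Python. It is shared here for reference; the details follow.

With this approach, each request can carry a different user-agent, which makes it harder for a website to block the Scrapy spider based on its user-agent.

First, add the following code to settings.py to replace the default user-agent middleware.

The code is as follows: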

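DOWNLOADER_MIDDLEWARES = {
    # Enable the custom middleware that picks a random user-agent
    'scraper.random_user_agent.RandomUserAgentMiddleware': 400,
    # Disable Scrapy's built-in user-agent middleware. Before Scrapy 1.0 the
    # path was 'scrapy.contrib.downloadermiddleware.useragent.UserAgentMiddleware'.
    'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware': None,
}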
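
The custom middleware in this article imports USER_AGENT_LIST from scraper.settings, but the article does not show the list itself. A minimal sketch of what settings.py might also contain follows; the browser strings are illustrative placeholders chosen for this example, not values from the original article.

```python
# settings.py (excerpt). The entries below are placeholder examples;
# replace them with the user-agent strings you actually want to rotate.
USER_AGENT_LIST = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/17.0 Safari/605.1.15",
    "Mozilla/5.0 (X11; Linux x86_64; rv:120.0) Gecko/20100101 Firefox/120.0",
]
```

The list must contain at least one entry, because random.choice raises IndexError on an empty sequence.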

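Next, define the custom user-agent middleware. The path used in the setting above implies that it lives in random_user_agent.py inside the scraper package.

The code is as follows: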

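import random

from scraper.settings import USER_AGENT_LIST


class RandomUserAgentMiddleware(object):
    """Downloader middleware that assigns a random user-agent to each request."""

    def process_request(self, request, spider):
        # Pick one user-agent string at random for this request
        ua = random.choice(USER_AGENT_LIST)
        if ua:
            # setdefault leaves a User-Agent header already set on the request untouched
            request.headers.setdefault('User-Agent', ua)
        # spider.logger.debug('>>>> UA %s', request.headers)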
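
As a possible variation, the middleware can read the list through Scrapy's from_crawler hook instead of importing the settings module directly. The sketch below assumes the list is exposed as a project setting named USER_AGENT_LIST (the name used in this article, not a built-in Scrapy setting). It also assigns the header directly rather than calling setdefault, so it overwrites any User-Agent already present on the request; this is a deliberate difference from the code above, not the original author's method.

```python
import random


class RandomUserAgentMiddleware:
    """Downloader middleware that picks a random User-Agent per request."""

    def __init__(self, user_agents):
        self.user_agents = list(user_agents)

    @classmethod
    def from_crawler(cls, crawler):
        # Scrapy calls this hook to build the middleware. getlist() reads
        # USER_AGENT_LIST, an assumed project-defined setting name.
        return cls(crawler.settings.getlist("USER_AGENT_LIST"))

    def process_request(self, request, spider):
        if self.user_agents:
            # Direct assignment replaces any existing User-Agent header.
            request.headers["User-Agent"] = random.choice(self.user_agents)
        # Returning None tells Scrapy to continue processing the request.
        return None
```

Reading from crawler.settings keeps the middleware independent of the project's package name, which can make it easier to reuse across projects.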

I hope this article is helpful for your Python programming.


source：php.cn
