Python: Overriding the file_path function of Scrapy's ImagePipeline component, but it is never called - PHP Chinese Network Q&A

Environment

Python: 2.7.6 (64-bit)
Scrapy: 0.22.2 (64-bit)
Operating system: Windows 7 (64-bit)

Problem and requirements

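By default, when the ImagePipeline component downloads an image, the file is saved under a name derived from the SHA1 hash of the image URL.
For example:
Image URL: http://www.example.com/image.jpg
SHA1 result: 3afec3b4765f8f0a07b78f98c07b83f013567a0a
Image name: 3afec3b4765f8f0a07b78f98c07b83f013567a0a.jpg
However, I want to save images under their original names. In the example above, the image saved locally should be named image.jpg.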
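To make the two naming schemes concrete, here is a minimal sketch that uses only the standard library. The helper names `sha1_name` and `original_name` are hypothetical (they are not part of Scrapy), and the `full/` directory prefix used by the real pipeline is left out.

```python
import hashlib
import posixpath

try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2


def sha1_name(url):
    """Name a file after the SHA1 hex digest of its URL (the default style)."""
    digest = hashlib.sha1(url.encode('utf-8')).hexdigest()
    return '%s.jpg' % digest


def original_name(url):
    """Name a file after the last path segment of its URL."""
    return posixpath.basename(urlparse(url).path)


url = 'http://www.example.com/image.jpg'
print(sha1_name(url))      # 40 hex characters followed by .jpg
print(original_name(url))  # image.jpg
```

One trade-off of keeping original names: two different URLs that end in the same file name map to the same local file, which hashing the full URL avoids.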
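An answer on Stack Overflow says this can be done by overriding the image_key function, but when I tried it, my overridden image_key was never called. I then looked at the ImagePipeline source code: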

class ImagesPipeline(FilesPipeline):
    """Abstract pipeline that implement the image thumbnail generation logic

    """

    MEDIA_NAME = 'image'
    MIN_WIDTH = 0
    MIN_HEIGHT = 0
    THUMBS = {}
    DEFAULT_IMAGES_URLS_FIELD = 'image_urls'
    DEFAULT_IMAGES_RESULT_FIELD = 'images'

    # ... (omitted)

    def file_path(self, request, response=None, info=None):
        ## start of deprecation warning block (can be removed in the future)
        def _warn():
            from scrapy.exceptions import ScrapyDeprecationWarning
            import warnings
            warnings.warn('ImagesPipeline.image_key(url) and file_key(url) methods are deprecated, '
                          'please use file_path(request, response=None, info=None) instead',
                          category=ScrapyDeprecationWarning, stacklevel=1)

        # check if called from image_key or file_key with url as first argument
        if not isinstance(request, Request):
            _warn()
            url = request
        else:
            url = request.url

        # detect if file_key() or image_key() methods have been overridden
        if not hasattr(self.file_key, '_base'):
            _warn()
            return self.file_key(url)
        elif not hasattr(self.image_key, '_base'):
            _warn()
            return self.image_key(url)
        ## end of deprecation warning block

        image_guid = hashlib.sha1(url).hexdigest()  # change to request.url after deprecation
        return 'full/%s.jpg' % (image_guid)

    # deprecated
    def image_key(self, url):
        return self.file_path(url)
    image_key._base = True

    # ... (omitted)

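It contains this message:
ImagesPipeline.image_key(url) and file_key(url) methods are deprecated, please use file_path(request, response=None, info=None) instead
In other words, in the latest version of Scrapy (0.22.2), file_path replaces the image_key function.
So I overrode file_path in my custom ImagePipeline class, but when I ran it, that function was not called either.
The code is as follows: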

from scrapy.contrib.pipeline.images import ImagesPipeline
from scrapy.exceptions import DropItem
from scrapy.http import Request
import os


class DownPhotosPipeline(ImagesPipeline):
    def file_path(self, request):
        print "~~~~~~~~~~~~~~~~~~~~~~"
        print "~~~~~~~"+request.url+"~~~~~~~"
        print "~~~~~~~~~~~~~~~~~~~~~~"
        image_guid = request.url.split('/')[-1]
        return 'full/%s' % (image_guid)

    def get_media_requests(self, item, info):
        for image_url in item['images']:
            yield Request(image_url)

    def item_completed(self, results, item, info):
        image_paths = [x['path'] for ok, x in results if ok]
        if not image_paths:
            raise DropItem("Item contains no images")
        #item['image_paths'] = image_paths
        return item
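One detail worth checking in the code above: the overridden file_path accepts only `request`, while the base signature quoted earlier is `file_path(self, request, response=None, info=None)`. The sketch below illustrates why that difference can matter. It uses stand-in classes only (`FakeRequest`, `BasePipeline`, `NarrowPipeline`, and `MatchingPipeline` are hypothetical, not Scrapy classes) and assumes the caller passes `response` and `info` as keyword arguments, which the quoted signature suggests but this post does not confirm.

```python
class FakeRequest(object):
    """Hypothetical stand-in for scrapy.http.Request; it only carries a URL."""
    def __init__(self, url):
        self.url = url


class BasePipeline(object):
    """Hypothetical stand-in that mimics the assumed calling convention."""
    def file_path(self, request, response=None, info=None):
        return 'full/default.jpg'

    def store(self, request):
        # Assumption: the framework calls file_path with keyword arguments.
        return self.file_path(request, response=None, info=None)


class NarrowPipeline(BasePipeline):
    # Same shape as the override in the question: only `request` is accepted.
    def file_path(self, request):
        return 'full/%s' % request.url.split('/')[-1]


class MatchingPipeline(BasePipeline):
    # Override that keeps the full base signature.
    def file_path(self, request, response=None, info=None):
        return 'full/%s' % request.url.split('/')[-1]


req = FakeRequest('http://www.example.com/image.jpg')
print(MatchingPipeline().store(req))  # full/image.jpg

try:
    NarrowPipeline().store(req)
except TypeError as exc:
    # The narrow override cannot accept the extra keyword arguments.
    print('narrow override rejected: %s' % exc)
```

Under that assumption, keeping the same parameters as the base method is the safer choice even when `response` and `info` go unused.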

settings.py

DOWNLOAD_DELAY = 2
IMAGES_STORE = 'budejie_photos'
DOWNLOAD_TIMEOUT = 1200
ITEM_PIPELINES = ['scrapy.contrib.pipeline.images.ImagesPipeline'
]
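Another point visible in the settings above: ITEM_PIPELINES lists the stock `scrapy.contrib.pipeline.images.ImagesPipeline`, so the DownPhotosPipeline subclass is never registered, which would explain why its file_path override is never called. A minimal sketch of settings that point at the subclass follows; the module path `myproject.pipelines` is a hypothetical placeholder for wherever DownPhotosPipeline actually lives.

```python
# settings.py (sketch)
DOWNLOAD_DELAY = 2
IMAGES_STORE = 'budejie_photos'
DOWNLOAD_TIMEOUT = 1200

# Assumption: DownPhotosPipeline is defined in myproject/pipelines.py.
# Replace the dotted path with the real location of the class.
ITEM_PIPELINES = [
    'myproject.pipelines.DownPhotosPipeline',
]
```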
