Scrapy 1.1.2 installed successfully on Python 3.4.4.
I then ran `scrapy bench` as a test:
C:\Documents and Settings\Administrator>scrapy bench
2016-09-02 18:06:42 [scrapy] INFO: Scrapy 1.1.2 started (bot: scrapybot)
2016-09-02 18:06:42 [scrapy] INFO: Overridden settings: {'LOGSTATS_INTERVAL': 1, 'LOG_LEVEL': 'INFO', 'CLOSESPIDER_TIMEOUT': 10}
2016-09-02 18:06:44 [scrapy] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
'scrapy.extensions.telnet.TelnetConsole',
'scrapy.extensions.logstats.LogStats',
'scrapy.extensions.closespider.CloseSpider']
2016-09-02 18:06:45 [scrapy] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
'scrapy.downloadermiddlewares.retry.RetryMiddleware',
'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
'scrapy.downloadermiddlewares.chunked.ChunkedTransferMiddleware',
'scrapy.downloadermiddlewares.stats.DownloaderStats']
2016-09-02 18:06:45 [scrapy] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
'scrapy.spidermiddlewares.referer.RefererMiddleware',
'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
'scrapy.spidermiddlewares.depth.DepthMiddleware']
2016-09-02 18:06:45 [scrapy] INFO: Enabled item pipelines:
[]
2016-09-02 18:06:45 [scrapy] INFO: Spider opened
2016-09-02 18:06:45 [scrapy] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2016-09-02 18:06:46 [scrapy] INFO: Crawled 1 pages (at 60 pages/min), scraped 0 items (at 0 items/min)
2016-09-02 18:06:47 [scrapy] INFO: Crawled 2 pages (at 60 pages/min), scraped 0 items (at 0 items/min)
2016-09-02 18:06:49 [scrapy] INFO: Crawled 3 pages (at 60 pages/min), scraped 0 items (at 0 items/min)
2016-09-02 18:06:49 [scrapy] INFO: Crawled 10 pages (at 420 pages/min), scraped 0 items (at 0 items/min)
2016-09-02 18:06:52 [scrapy] INFO: Crawled 23 pages (at 780 pages/min), scraped 0 items (at 0 items/min)
2016-09-02 18:06:54 [scrapy] INFO: Crawled 31 pages (at 480 pages/min), scraped 0 items (at 0 items/min)
2016-09-02 18:06:55 [scrapy] INFO: Closing spider (closespider_timeout)
2016-09-02 18:06:55 [scrapy] INFO: Crawled 39 pages (at 480 pages/min), scraped 0 items (at 0 items/min)
2016-09-02 18:06:56 [scrapy] INFO: Crawled 50 pages (at 660 pages/min), scraped 0 items (at 0 items/min)
2016-09-02 18:06:57 [scrapy] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 15412,
'downloader/request_count': 50,
'downloader/request_method_count/GET': 50,
'downloader/response_bytes': 87156,
'downloader/response_count': 50,
'downloader/response_status_count/200': 50,
'finish_reason': 'closespider_timeout',
'finish_time': datetime.datetime(2016, 9, 2, 10, 6, 57, 218750),
'log_count/INFO': 15,
'request_depth_max': 4,
'response_received_count': 50,
'scheduler/dequeued': 50,
'scheduler/dequeued/memory': 50,
'scheduler/enqueued': 1001,
'scheduler/enqueued/memory': 1001,
'start_time': datetime.datetime(2016, 9, 2, 10, 6, 45, 609375)}
2016-09-02 18:06:57 [scrapy] INFO: Spider closed (closespider_timeout)
Judging from the output, the installation was successful.
Then I tried to follow the example in this post: scrapy简单学习 (a basic Scrapy tutorial).
However, it reported an error, as follows:
C:\Documents and Settings\Administrator>scrapy crawl dmoz -o items.json
Scrapy 1.1.2 - no active project
Unknown command: crawl
Use "scrapy" to see available commands
Does anyone know how to use Scrapy 1.1.2 correctly?
The `crawl` command is a project-only command: you need to run it from inside the project directory (the one containing `scrapy.cfg`), otherwise Scrapy reports "no active project" and cannot resolve the command. For example, if your project is named Spiders, first `cd Spiders`, and then run the command.
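As a sketch of the usual workflow (the project name `tutorial` and the spider name `dmoz` here follow the tutorial being referenced and are assumptions):

```shell
# Create a new Scrapy project; this generates scrapy.cfg and the project package
scrapy startproject tutorial

# Project-only commands such as "crawl" must be run inside the project
# directory, i.e. the one containing scrapy.cfg
cd tutorial

# Run the spider named "dmoz" (it must be defined under tutorial/spiders/)
# and export the scraped items to items.json
scrapy crawl dmoz -o items.json
```

Running `scrapy crawl` from any directory without a `scrapy.cfg` produces exactly the "Unknown command: crawl" error shown above.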
How did you install Scrapy?