I'm using Python to simulate logging in to a website and keep getting a 404. Please advise!
Code
import scrapy
from scrapy.http import Request, FormRequest
from scrapy.selector import Selector
class StackSpiderSpider(scrapy.Spider):
    ...
Debug output
2017-04-18 11:19:23 [scrapy.utils.log] INFO: Scrapy 1.3.3 started (bot: text5)
2017-04-18 11:19:23 [scrapy.utils.log] INFO: Overridden settings: {'NEWSPIDER_MO
DULE': 'text5.spiders', 'SPIDER_MODULES': ['text5.spiders'], 'BOT_NAME': 'text5'
}
2017-04-18 11:19:23 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.logstats.LogStats',
'scrapy.extensions.telnet.TelnetConsole',
'scrapy.extensions.corestats.CoreStats']
2017-04-18 11:19:24 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
'scrapy.downloadermiddlewares.retry.RetryMiddleware',
'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
'scrapy.downloadermiddlewares.stats.DownloaderStats']
2017-04-18 11:19:24 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
'scrapy.spidermiddlewares.referer.RefererMiddleware',
'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
'scrapy.spidermiddlewares.depth.DepthMiddleware']
2017-04-18 11:19:24 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2017-04-18 11:19:24 [scrapy.core.engine] INFO: Spider opened
2017-04-18 11:19:24 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pag
es/min), scraped 0 items (at 0 items/min)
2017-04-18 11:19:24 [scrapy.extensions.telnet] DEBUG: Telnet console listening o
n 127.0.0.1:6023
2017-04-18 11:19:24 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://stack
overflow.com/users/login> (referer: None)
1145f3f2e28e56c298bc28a1a735254b
2017-04-18 11:19:25 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://stack
overflow.com/search?q=&ssrc=&openid_username=&oauth_server=&oauth_version=&fkey=
1145f3f2e28e56c298bc28a1a735254b&password=wanglihong1993&email=1067863906%40qq.c
om&openid_identifier=> (referer: https://stackoverflow.com/use...
2017-04-18 11:19:25 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response
<404 https://stackoverflow.com/sea ...
auth_version=&fkey=1145f3f2e28e56c298bc28a1a735254b&password=wanglihong1993&emai
l=1067863906%40qq.com&openid_identifier=>: HTTP status code is not handled or no
t allowed
2017-04-18 11:19:25 [scrapy.core.engine] INFO: Closing spider (finished)
2017-04-18 11:19:25 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 881,
'downloader/request_count': 2,
'downloader/request_method_count/GET': 2,
'downloader/response_bytes': 12631,
'downloader/response_count': 2,
'downloader/response_status_count/200': 1,
'downloader/response_status_count/404': 1,
'finish_reason': 'finished',
'finish_time': datetime.datetime(2017, 4, 18, 3, 19, 25, 143000),
'log_count/DEBUG': 3,
'log_count/INFO': 8,
'request_depth_max': 1,
'response_received_count': 2,
'scheduler/dequeued': 2,
'scheduler/dequeued/memory': 2,
'scheduler/enqueued': 2,
'scheduler/enqueued/memory': 2,
'start_time': datetime.datetime(2017, 4, 18, 3, 19, 24, 146000)}
2017-04-18 11:19:25 [scrapy.core.engine] INFO: Spider closed (finished)
Bro, your password has been leaked.
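As for the 404 itself, the log gives a hint. The failing request is a GET to /search that carries both the search box field (q) and the login fields (email, password, fkey). The spider code is not shown, so this is only a guess: if it calls FormRequest.from_response without choosing a form, Scrapy may have submitted the page's first form (the search box) instead of the login form. A minimal sketch that checks the logged URL with the standard library only, with the fkey, password and email values redacted:

```python
from urllib.parse import urlsplit, parse_qs

# URL of the 404 request from the log; credential values replaced by REDACTED.
logged_url = (
    "https://stackoverflow.com/search?q=&ssrc=&openid_username=&oauth_server="
    "&oauth_version=&fkey=REDACTED&password=REDACTED&email=REDACTED"
    "&openid_identifier="
)

parts = urlsplit(logged_url)
# keep_blank_values=True keeps empty fields such as q= and ssrc=
fields = parse_qs(parts.query, keep_blank_values=True)

# Field names assumed to belong to the login form and to the search form.
login_fields = {"email", "password", "fkey"}
search_fields = {"q"}

# True when login fields and search fields travel in the same request.
mixed = login_fields.issubset(fields) and search_fields.issubset(fields)

print(parts.path)      # path the form data was sent to
print(sorted(fields))  # field names found in the query string
print(mixed)
```

If that is what happened, passing one of from_response's form selection arguments (formid, formname, formnumber or formxpath) so that it targets the login form should send the data as a POST to the login URL instead. The right value depends on the page's HTML, which is not in the post.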