python - Why does urlopen raise 404: Not Found for a website I can open in my browser?

Question

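Some people say the cause is a proxy. I do often have a proxy enabled in my browser, but I have turned it off. I also checked the HTTP messages, and none of them went through a proxy. The error still occurs. Code: {code...} Python version: 3.5.1 Error message: urllib.error.HTTPError: HTTP Error...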

大家讲道理 · Answer

This works without any problem for me on Python 3.5.2 under Windows.
I suggest capturing the packets and comparing the request your script sends with the one your browser sends.

Python 3.5.2 (v3.5.2:4def2a2901a5, Jun 25 2016, 22:18:55) [MSC v.1900 64 bit (AMD64)] on win32
>>> import urllib.request
>>> url = "http://news.dbanotes.net/"
>>> req = urllib.request.Request(url)
>>> page = urllib.request.urlopen(req).read()
>>> page
b'

伊谢尔伦 · Answer

This may be related to your User-Agent value: some websites check this header to block non-browser clients from crawling them.
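To see what the site sees, it can help to check which User-Agent urllib sends when you do not set one. A minimal sketch, assuming nothing beyond the standard library; the exact version suffix depends on your interpreter:

```python
import urllib.request

# build_opener() returns an OpenerDirector. Its addheaders list holds the
# headers urllib attaches to every request unless the Request overrides them.
opener = urllib.request.build_opener()
default_headers = dict(opener.addheaders)

# The value has the form "Python-urllib/<version>", which is easy for a
# server to recognise as a script rather than a browser.
print(default_headers["User-agent"])
```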

巴扎黑 · Answer

Copy the headers and cookies from your browser and add them to the urllib Request object,
so that the request simulates a browser.
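As a rough sketch of that suggestion: the header values below are placeholders, not values taken from the original question, and should be replaced with what your browser's developer tools show. `build_request` and `fetch` are hypothetical helper names.

```python
import urllib.request

url = "http://news.dbanotes.net/"

# Placeholder values (assumptions): copy the real ones from your browser.
browser_headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Accept": "text/html,application/xhtml+xml",
    "Cookie": "name=value",
}


def build_request(url, headers):
    """Return a Request that carries the given browser-like headers."""
    return urllib.request.Request(url, headers=headers)


def fetch(url, headers):
    """Send the request and return the raw response body as bytes."""
    with urllib.request.urlopen(build_request(url, headers)) as resp:
        return resp.read()


req = build_request(url, browser_headers)
# urllib stores header names capitalized, e.g. "User-agent".
print(req.get_header("User-agent"))
```

Calling `fetch(url, browser_headers)` performs the actual request. Whether it succeeds depends on what the site checks.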

天蓬老师 · Answer

A very likely reason is that the User-Agent header your program sends has been blocked by the site. Try changing the User-Agent header.
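The effect can be reproduced locally. The sketch below is only an illustration: `PickyHandler` is a hypothetical stand-in server that answers 404 to urllib's default User-Agent, and it is not a claim about how the site in the question is implemented. `fetch_status` and `demo` are likewise made-up helper names.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class PickyHandler(BaseHTTPRequestHandler):
    """Hypothetical server: 404 for urllib's default User-Agent, 200 otherwise."""

    def do_GET(self):
        agent = self.headers.get("User-Agent", "")
        if agent.startswith("Python-urllib"):
            self.send_error(404)
        else:
            body = b"ok"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def fetch_status(url, user_agent=None):
    """Return the HTTP status code, optionally overriding the User-Agent."""
    headers = {"User-Agent": user_agent} if user_agent else {}
    req = urllib.request.Request(url, headers=headers)
    # An empty ProxyHandler mapping makes urllib ignore any proxy settings.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(req, timeout=5) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code


def demo():
    """Start the local server and request it with both User-Agent values."""
    server = HTTPServer(("127.0.0.1", 0), PickyHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = "http://127.0.0.1:%d/" % server.server_address[1]
    try:
        return fetch_status(url), fetch_status(url, "Mozilla/5.0")
    finally:
        server.shutdown()
        server.server_close()


print(demo())  # (404, 200)
```

The same URL returns 404 or 200 depending only on the User-Agent header, which matches the symptom of a page that opens in the browser but fails from urlopen.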

阿神 · Answer

You don't need a Request object; just pass the URL to urlopen directly.
