Learn how to use Python to download the videos in a specific Zhihu answer

coldplay.xixi, reposted 2020-07-09 17:31:18, 2981 views

Preface

Zhihu now lets users upload videos, but it offers no way to download them. That annoyed me enough to dig into how the player works, and after a while I wrote some code that makes it easy to download and save a video.

Below I use an answer to the question "Why are cats not afraid of snakes at all?" as the example and walk through the whole download process.

Debug it

Press F12 to open the browser's developer tools, click the element picker (the cursor icon), and then move the cursor over the video.

What is this? A mysterious link shows up: https://www.zhihu.com/video/xxxxx. Copy this link into the browser's address bar and open it.

This looks like the video we are after. Don't stop here, though: look at the requests the page makes, and you will find a very interesting one (this is the key point), which returns the video's metadata as JSON.
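The video link and that request are tied together by the video ID. The following is only a minimal sketch, assuming the ID is the run of digits at the end of the https://www.zhihu.com/video/... link and that the metadata is served under https://lens.zhihu.com/api/videos (the endpoint used by the code later in this article, which may have changed since). The helper name video_api_url is made up for illustration, and the ID in the example is the id field from the sample data in this article.

```python
import re

# Endpoint taken from the article's own code; treat it as an assumption.
VIDEO_API = "https://lens.zhihu.com/api/videos"


def video_api_url(video_link):
    """Turn a https://www.zhihu.com/video/<id> link into the metadata API URL."""
    match = re.fullmatch(r"https://www\.zhihu\.com/video/(\d+)", video_link)
    if match is None:
        raise ValueError("not a Zhihu video link: {}".format(video_link))
    return "{}/{}".format(VIDEO_API, match.group(1))


print(video_api_url("https://www.zhihu.com/video/1039146361396174848"))
# prints https://lens.zhihu.com/api/videos/1039146361396174848
```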

Here is the JSON that request returns for the example video:

{
	"playlist": {
		"ld": {
			"width": 360,
			"format": "mp4",
			"play_url": "https://vdn.vzuu.com/LD/05fc411e-d8e0-11e8-bb8b-0242ac112a0b.mp4?auth_key=1541477643-0-0-987c2c504d14ab1165ce2ed47759d927&expiration=1541477643&disable_local_cache=1",
			"duration": 17,
			"size": 1123111,
			"bitrate": 509,
			"height": 640
		},
		"hd": {
			"width": 720,
			"format": "mp4",
			"play_url": "https://vdn.vzuu.com/HD/05fc411e-d8e0-11e8-bb8b-0242ac112a0b.mp4?auth_key=1541477643-0-0-8b8024a22a62f097ca31b8b06b7233a1&expiration=1541477643&disable_local_cache=1",
			"duration": 17,
			"size": 4354364,
			"bitrate": 1974,
			"height": 1280
		},
		"sd": {
			"width": 480,
			"format": "mp4",
			"play_url": "https://vdn.vzuu.com/SD/05fc411e-d8e0-11e8-bb8b-0242ac112a0b.mp4?auth_key=1541477643-0-0-5948c2562d817218c9a9fc41abad1df8&expiration=1541477643&disable_local_cache=1",
			"duration": 17,
			"size": 1920976,
			"bitrate": 871,
			"height": 848
		}
	},
	"title": "",
	"duration": 17,
	"cover_info": {
		"width": 720,
		"thumbnail": "https://pic2.zhimg.com/80/v2-97b9435a0c32d01c7c931bd00120327d_b.jpg",
		"height": 1280
	},
	"type": "video",
	"id": "1039146361396174848",
	"misc_info": {}
}

Yes, the video we want is right here: ld stands for low (ordinary) definition, sd for standard definition, and hd for high definition. Open the corresponding play_url in the browser, then right-click and choose save to download the video.
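Choosing a quality from this data can also be done in code. The sketch below assumes the response has the shape shown above (a playlist object keyed by ld, sd, and hd, each with a play_url). The names pick_play_url and QUALITY_ORDER are hypothetical, and the sample dictionary is a trimmed stand-in with placeholder URLs, not real data.

```python
QUALITY_ORDER = ["hd", "sd", "ld"]  # hypothetical fallback order, best first


def pick_play_url(video_info, preferred="hd"):
    """Return (quality, play_url), trying `preferred` first, then QUALITY_ORDER."""
    playlist = video_info.get("playlist", {})
    for quality in [preferred] + QUALITY_ORDER:
        entry = playlist.get(quality)
        if entry and entry.get("play_url"):
            return quality, entry["play_url"]
    raise KeyError("no playable quality found")


# Trimmed-down stand-in for the response above (URLs are placeholders).
sample = {
    "playlist": {
        "ld": {"width": 360, "play_url": "https://example.invalid/LD.mp4"},
        "sd": {"width": 480, "play_url": "https://example.invalid/SD.mp4"},
    }
}

print(pick_play_url(sample, preferred="hd"))
# prints ('sd', 'https://example.invalid/SD.mp4') because hd is missing here
```

Falling back instead of failing is a design choice: not every video necessarily offers all three qualities.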

Code

Once you understand the whole process, writing the code is straightforward. I won't explain much here; let's go straight to the code:

# -*- encoding: utf-8 -*-

import datetime
import re
import uuid

import requests


class DownloadVideo:

  __slots__ = [
    'url', 'video_name', 'url_format', 'download_url', 'video_number',
    'video_api', 'clarity_list', 'clarity'
  ]

  def __init__(self, url, clarity='ld', video_name=None):
    self.url = url
    self.video_name = video_name
    # Raw string so that \d reaches the regex engine unchanged; dots are escaped.
    self.url_format = r"https://www\.zhihu\.com/question/\d+/answer/\d+"
    self.clarity = clarity
    self.clarity_list = ['ld', 'sd', 'hd']
    self.video_api = 'https://lens.zhihu.com/api/videos'

  def check_url_format(self):
    """Raise ValueError unless the URL points at a Zhihu answer."""
    if re.match(self.url_format, self.url) is None:
      raise ValueError(
        "The link must look like: https://www.zhihu.com/question/{number}/answer/{number}"
      )
    return True

  def get_video_number(self):
    """Fetch the answer page and store the ID of its first video."""
    headers = {
      'User-Agent':
      'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.181 Safari/537.36'
    }
    response = requests.get(self.url, headers=headers)
    response.raise_for_status()
    response.encoding = 'utf-8'
    html = response.text
    video_ids = re.findall(r'data-lens-id="(\d+)"', html)
    if not video_ids:
      raise ValueError("Could not find a video ID in: {}".format(self.url))
    # re.findall returns matches in page order, so index 0 is the first video.
    self.video_number = video_ids[0]
    return self

  def get_video_url_by_number(self):
    """Ask the video API for the playlist and store the chosen play_url."""
    url = "{}/{}".format(self.video_api, self.video_number)

    headers = {
      'Referer': 'https://v.vzuu.com/video/{}'.format(self.video_number),
      'Origin': 'https://v.vzuu.com',
      'User-Agent':
      'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.67 Safari/537.36',
      'Content-Type': 'application/json',
    }

    response = requests.get(url, headers=headers)
    response.raise_for_status()
    playlist = response.json()['playlist']
    # Try the requested clarity first, then fall back to the others in order.
    for clarity in [self.clarity] + self.clarity_list:
      if clarity in playlist:
        self.download_url = playlist[clarity]['play_url']
        return self
    raise ValueError(
      "No downloadable clarity found for video {}".format(self.video_number))

  def get_video_by_video_url(self):
    """Save the video in the current directory and return its file name."""
    response = requests.get(self.download_url)
    response.raise_for_status()
    datetime_str = datetime.datetime.now().strftime("%Y-%m-%d %H-%M-%S")
    if self.video_name is not None:
      video_name = "{}-{}.mp4".format(self.video_name, datetime_str)
    else:
      video_name = "{}-{}.mp4".format(str(uuid.uuid1()), datetime_str)
    with open(video_name, 'wb') as f:
      f.write(response.content)
    return video_name

  def download_video(self):
    """Validate the arguments, then run the three steps in order."""
    if self.clarity not in self.clarity_list:
      raise ValueError(
        "Invalid clarity; supported values: ld (low), sd (standard), hd (high)")

    if self.check_url_format():
      return self.get_video_number().get_video_url_by_number().get_video_by_video_url()


if __name__ == '__main__':
  a = DownloadVideo('https://www.zhihu.com/question/53031925/answer/524158069')
  # Prints the name of the saved file.
  print(a.download_video())
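The script above reads the whole video into memory through response.content before writing it to disk. For larger files, a streaming variant may be preferable. This is only a sketch of that idea, not part of the original script: it relies on the stream=True option of requests and on Response.iter_content, and the function names save_stream and download, the chunk size, and the timeout are choices made for illustration.

```python
def save_stream(response, path, chunk_size=64 * 1024):
    """Write a streamed response to `path` chunk by chunk; return bytes written."""
    written = 0
    with open(path, "wb") as f:
        for chunk in response.iter_content(chunk_size=chunk_size):
            if chunk:  # skip empty keep-alive chunks
                f.write(chunk)
                written += len(chunk)
    return written


def download(play_url, path):
    """Fetch `play_url` with streaming enabled and save it to `path`."""
    # Third-party dependency; imported here so save_stream stays dependency-free.
    import requests

    with requests.get(play_url, stream=True, timeout=30) as response:
        response.raise_for_status()
        return save_stream(response, path)
```

Calling download(play_url, "video.mp4") with a play_url taken from the playlist would then write the file without holding it all in memory.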

Conclusion

The code still has room for improvement. It only downloads the first video in the answer, whereas in principle an answer may contain more than one video. If you have any questions or suggestions, feel free to get in touch.
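One possible starting point for handling several videos is to collect every video ID on the page instead of only the first. The sketch below reuses the data-lens-id pattern from the script. The function name find_video_ids and the HTML fragment are made up for illustration, and real answer pages may be structured differently.

```python
import re


def find_video_ids(html):
    """Return every data-lens-id value in `html`, in page order, without duplicates."""
    ids = re.findall(r'data-lens-id="(\d+)"', html)
    # dict keeps insertion order (Python 3.7+), so this dedupes and preserves order.
    return list(dict.fromkeys(ids))


# Made-up HTML fragment for illustration; real pages are much larger.
html = (
    '<a class="video-box" data-lens-id="1039146361396174848"></a>'
    '<a class="video-box" data-lens-id="1039146361396174849"></a>'
    '<a class="video-box" data-lens-id="1039146361396174848"></a>'
)

print(find_video_ids(html))
# prints ['1039146361396174848', '1039146361396174849']
```

Each ID could then go through the same API lookup and download steps as the single-video case.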

Statement: This article is reproduced from jb51.net. If there is any infringement, please contact admin@php.cn to have it removed.
