How to implement Python to crawl website code examples that require login

黄舟

Released: 2017-08-20 10:26:40

This article shows how to use Python to crawl a website that requires login. It works through a complete example covering both steps: logging in to the site and then scraping data from a page that is only available to logged-in users. Readers who need this can use it as a reference.

The example below shows one way to crawl a login-protected website in Python, using Bitbucket as the target. The details are as follows:

import requests
from lxml import html

# Create a session object. It keeps the cookies from every request, so the login persists.
session_requests = requests.session()

# Fetch the login page and extract the CSRF token used by the login form.
login_url = "https://bitbucket.org/account/signin/?next=/"
result = session_requests.get(login_url)
tree = html.fromstring(result.text)
authenticity_token = list(set(tree.xpath("//input[@name='csrfmiddlewaretoken']/@value")))[0]

payload = {
  "username": "<your username>",
  "password": "<your password>",
  # The page source contains a hidden input tag named "csrfmiddlewaretoken".
  "csrfmiddlewaretoken": authenticity_token,
}

# Perform the login.
result = session_requests.post(
  login_url,
  data=payload,
  headers=dict(referer=login_url),
)

# Assuming the login succeeded, scrape content from the Bitbucket dashboard page.
url = 'https://bitbucket.org/dashboard/overview'
result = session_requests.get(
  url,
  headers=dict(referer=url),
)

# Check the scraped content.
tree = html.fromstring(result.content)
bucket_elems = tree.findall(".//span[@class='repo-name']")
# text_content() is a method; strip newlines and surrounding whitespace from each name.
bucket_names = [bucket.text_content().replace("\n", "").strip() for bucket in bucket_elems]
print(bucket_names)
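
The script above takes the first CSRF token it finds and assumes the login POST succeeded. The following is a minimal sketch of two defensive helpers that could be added around those steps. The names `HiddenInputParser`, `extract_csrf_token`, and `login_looks_successful` are hypothetical and do not come from the original example. The sketch uses the standard library's `html.parser` instead of lxml so that it runs without extra dependencies, and the success check is only a heuristic that assumes a failed login leaves you on a URL containing `/account/signin`.

```python
from html.parser import HTMLParser


class HiddenInputParser(HTMLParser):
    """Record the value of the first <input> whose name equals field_name."""

    def __init__(self, field_name):
        super().__init__()
        self.field_name = field_name
        self.value = None

    def handle_starttag(self, tag, attrs):
        # Ignore everything except <input> tags, and stop after the first match.
        if tag != "input" or self.value is not None:
            return
        attributes = dict(attrs)
        if attributes.get("name") == self.field_name:
            self.value = attributes.get("value")


def extract_csrf_token(page_html, field_name="csrfmiddlewaretoken"):
    """Return the token value, or raise ValueError if the field is missing."""
    parser = HiddenInputParser(field_name)
    parser.feed(page_html)
    if not parser.value:
        raise ValueError("no <input name=%r> with a value found" % field_name)
    return parser.value


def login_looks_successful(response, login_path="/account/signin"):
    """Heuristic (an assumption): 2xx status and the final URL left the sign-in page."""
    return 200 <= response.status_code < 300 and login_path not in response.url
```

With these in place, `extract_csrf_token(result.text)` could replace the XPath lookup after the first GET, and `login_looks_successful(result)` could be checked right after the POST before requesting the dashboard. A real site may signal a failed login differently, for example with an error message on a 200 page, so the check should be adapted to the target site.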

Source: php.cn
