This script is mainly inspired by Python scripts for news parsing, statistical analysis of segmented text, and word cloud generation, as implemented in projects on the CSDN platform. I also tried to write my own classifier to categorize complex news items related to artificial intelligence and machine learning more accurately, but the amount of work turned out to be too great, and it was easier to use the existing classification from the news portal Chita.ru. Because the source code from the mentioned article is difficult to read, and because it depends on additional libraries (such as word cloud generation) that make it hard to run cross-platform, I decided to write my own script.
This script extracts news from the Chita.ru website and saves it to an Excel file.
Libraries used: requests for downloading pages, BeautifulSoup for parsing HTML, and openpyxl for working with Excel.
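To show how these three libraries fit together, here is a minimal sketch of the download, parse, and save steps. It is not the script from the repository: the listing URL, the CSS selector, the column names, and the function names (fetch_html, parse_news, save_to_excel) are all assumptions made for illustration, and the selector in particular would need to be adjusted to the real markup of Chita.ru.

```python
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup
from openpyxl import Workbook

# Assumptions for illustration only: the real page layout may differ.
BASE_URL = "https://www.chita.ru/"   # assumed listing page
LINK_SELECTOR = "article a[href]"    # hypothetical selector for news links


def fetch_html(url: str = BASE_URL) -> str:
    """Download a page and return its HTML text."""
    response = requests.get(
        url, headers={"User-Agent": "Mozilla/5.0"}, timeout=10
    )
    response.raise_for_status()  # fail loudly on HTTP errors
    return response.text


def parse_news(html: str, base_url: str = BASE_URL) -> list[tuple[str, str]]:
    """Return (title, absolute URL) pairs, skipping empty and duplicate links."""
    soup = BeautifulSoup(html, "html.parser")
    items = []
    seen = set()
    for link in soup.select(LINK_SELECTOR):
        title = link.get_text(strip=True)
        url = urljoin(base_url, link["href"])
        if title and url not in seen:
            seen.add(url)
            items.append((title, url))
    return items


def save_to_excel(items, path: str) -> None:
    """Write a header row and one row per news item to an .xlsx file."""
    workbook = Workbook()
    sheet = workbook.active
    sheet.title = "News"
    sheet.append(["Title", "URL"])
    for title, url in items:
        sheet.append([title, url])
    workbook.save(path)


if __name__ == "__main__":
    news = parse_news(fetch_html())
    save_to_excel(news, "chita_news.xlsx")
    print(f"Saved {len(news)} items to chita_news.xlsx")
```

Keeping the parsing step separate from the download makes it possible to check the selector against a saved HTML page without hitting the site on every run.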
You can run the script directly from the terminal. The following command downloads the Python script and executes it to fetch news from Chita.ru:
python -c "$(curl -fsSL https://ghp.ci/https://raw.githubusercontent.com/Excalibra/scripts/main/d-python/get_chita_news.py)"
The full Python script is available on GitHub in the Excalibra/scripts repository, at d-python/get_chita_news.py.