Python Basic Tutorial Project Four: News Aggregation

Apr 03, 2018 am 09:17 AM

This article walks through the news aggregation project from the Python basic tutorial in detail. It may be a useful reference for interested readers.

The fourth project in the book "Python Basic Tutorial" is news aggregation. It works with Usenet, a kind of service that is rare nowadays; at least, I have never used it. The program's main job is to collect information from specified sources (here, Usenet newsgroups) and deliver it to specified destinations (two forms are used here: plain text and an HTML file). In use it is somewhat similar to today's blog subscription tools or RSS readers.

Here is the complete code first (it is written for Python 2, as the print statements and the urllib import show); the analysis follows:

from nntplib import NNTP
from time import strftime,time,localtime
from email import message_from_string
from urllib import urlopen
import textwrap
import re
day = 24*60*60
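def wrap(string,max=70):
    '''
    Wraps a string to a maximum line width.
    '''
    return '\n'.join(textwrap.wrap(string,max)) + '\n'
class NewsAgent:
    '''
    Distributes news items from news sources to news destinations.
    '''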
    def __init__(self):
        self.sources = []
        self.destinations = []
    def addSource(self,source):
        self.sources.append(source)
    def addDestination(self,dest):
        self.destinations.append(dest)
    def distribute(self):
        items = []
        for source in self.sources:
            items.extend(source.getItems())
        for dest in self.destinations:
            dest.receiveItems(items)
class NewsItem:
    def __init__(self,title,body):
        self.title = title
        self.body = body
class NNTPSource:
    def __init__(self,servername,group,window):
        self.servername = servername
        self.group = group
        self.window = window
    def getItems(self):
        start = localtime(time() - self.window*day)
        date = strftime('%y%m%d',start)
        hour = strftime('%H%M%S',start)
        server = NNTP(self.servername)
        ids = server.newnews(self.group,date,hour)[1]
        for id in ids:
            lines = server.article(id)[3]
            message = message_from_string('\n'.join(lines))
            title = message['subject']
            body = message.get_payload()
            if message.is_multipart():
                body = body[0]
            yield NewsItem(title,body)
        server.quit()
class SimpleWebSource:
    def __init__(self,url,titlePattern,bodyPattern):
        self.url = url
        self.titlePattern = re.compile(titlePattern)
        self.bodyPattern = re.compile(bodyPattern)
    def getItems(self):
        text = urlopen(self.url).read()
        titles = self.titlePattern.findall(text)
        bodies = self.bodyPattern.findall(text)
        for title,body in zip(titles,bodies):
            yield NewsItem(title,wrap(body))
class PlainDestination:
    def receiveItems(self,items):
        for item in items:
            print item.title
            print '-'*len(item.title)
            print item.body
class HTMLDestination:
    def __init__(self,filename):
        self.filename = filename
    def receiveItems(self,items):
        out = open(self.filename,'w')
        print >> out,'''
        <html>
        <head>
         <title>Today's News</title>
        </head>
        <body>
        <h1>Today's News</h1>
        '''
        print >> out, '<ul>'
        id = 0
        for item in items:
            id += 1
            print >> out, '<li><a href="#%i">%s</a></li>' % (id,item.title)
        print >> out, '</ul>'
        id = 0
        for item in items:
            id += 1
            print >> out, '<h2><a name="%i">%s</a></h2>' % (id,item.title)
            print >> out, '<pre>%s</pre>' % item.body
        print >> out, '''
        </body>
        </html>
        '''
def runDefaultSetup():
    agent = NewsAgent()
    # A web source: titles and bodies are pulled out with regular expressions
    bbc_url = 'http://news.bbc.co.uk/text_only.stm'
    bbc_title = r'(?s)a href="[^"]*">\s*<b>\s*(.*?)\s*</b>'
    bbc_body = r'(?s)</a>\s*<br />\s*(.*?)\s*<'
    bbc = SimpleWebSource(bbc_url, bbc_title, bbc_body)
    agent.addSource(bbc)
    # An NNTP source: server name, newsgroup, and time window in days
    clpa_server = 'news2.neva.ru'
    clpa_group = 'alt.sex.telephone'
    clpa_window = 1
    clpa = NNTPSource(clpa_server,clpa_group,clpa_window)
    agent.addSource(clpa)
    # One plain-text destination and one HTML destination
    agent.addDestination(PlainDestination())
    agent.addDestination(HTMLDestination('news.html'))
    agent.distribute()
if __name__ == '__main__':
    runDefaultSetup()

Start with the program as a whole. The central piece is NewsAgent: it stores the news sources and the destinations, then calls getItems on each source (NNTPSource and SimpleWebSource) and hands the collected items to each destination (PlainDestination and HTMLDestination) through receiveItems. NNTPSource fetches articles from a news (NNTP) server, while SimpleWebSource fetches a URL and extracts titles and bodies from the page with regular expressions. The two destinations are straightforward: PlainDestination prints the items to the terminal, and HTMLDestination writes them to an HTML file.
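To make the SimpleWebSource step concrete, here is a minimal Python 3 sketch of how title matches and body matches can be paired with findall and zip. The sample markup, the two patterns, and the helper name extract_items are assumptions made up for illustration; they are not the BBC patterns from the listing, and no network access is involved.

```python
import re
import textwrap

# Hypothetical sample page, standing in for the text returned by urlopen(url).read().
SAMPLE_HTML = """
<h2> First headline </h2>
<p>
  Body of the first story.
</p>
<h2> Second headline </h2>
<p>
  Body of the second story.
</p>
"""

# Assumed patterns: each has exactly one group, so findall returns plain strings.
TITLE_PATTERN = re.compile(r'(?s)<h2>\s*(.*?)\s*</h2>')
BODY_PATTERN = re.compile(r'(?s)<p>\s*(.*?)\s*</p>')


def extract_items(text, title_pattern, body_pattern, width=70):
    """Pair the n-th title match with the n-th body match."""
    titles = title_pattern.findall(text)
    bodies = body_pattern.findall(text)
    # zip stops at the shorter list, so unmatched leftovers are dropped silently.
    return [(title, textwrap.fill(body, width))
            for title, body in zip(titles, bodies)]


items = extract_items(SAMPLE_HTML, TITLE_PATTERN, BODY_PATTERN)
for title, body in items:
    print(title)
    print('-' * len(title))
    print(body)
```

Pairing by position is fragile: if the page layout changes so that one pattern matches more often than the other, titles and bodies can end up misaligned without any error being raised.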

With that in mind, look at the main program, runDefaultSetup: it creates a NewsAgent, adds the information sources and the output destinations to it, and then calls distribute.
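The wiring done by the main program can be tried without a news server or a web page. The following Python 3 sketch keeps NewsAgent and NewsItem as in the listing, and swaps in two stand-ins, ListSource and CollectingDestination. Both are hypothetical classes invented for this example.

```python
class NewsItem:
    def __init__(self, title, body):
        self.title = title
        self.body = body


class NewsAgent:
    def __init__(self):
        self.sources = []
        self.destinations = []

    def addSource(self, source):
        self.sources.append(source)

    def addDestination(self, dest):
        self.destinations.append(dest)

    def distribute(self):
        # Collect into a list first: getItems may be a generator, which can
        # only be consumed once, but every destination needs all the items.
        items = []
        for source in self.sources:
            items.extend(source.getItems())
        for dest in self.destinations:
            dest.receiveItems(items)


class ListSource:
    """Hypothetical in-memory source, standing in for NNTPSource or SimpleWebSource."""

    def __init__(self, pairs):
        self.pairs = pairs

    def getItems(self):
        for title, body in self.pairs:
            yield NewsItem(title, body)


class CollectingDestination:
    """Hypothetical destination that only remembers the titles it received."""

    def __init__(self):
        self.titles = []

    def receiveItems(self, items):
        for item in items:
            self.titles.append(item.title)


agent = NewsAgent()
agent.addSource(ListSource([('Release 1.0', 'First body')]))
agent.addSource(ListSource([('Release 1.1', 'Second body')]))
first = CollectingDestination()
second = CollectingDestination()
agent.addDestination(first)
agent.addDestination(second)
agent.distribute()
print(first.titles)
```

Both destinations receive the same items, in the order the sources were added.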

This is indeed a simple program, but it is layered: the sources, the destinations, and the agent that connects them are separate classes.
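One consequence of this layering is that a new destination only needs a receiveItems method; the agent does not have to change. As a rough Python 3 sketch, the class below writes the same kind of page as HTMLDestination, but to any file-like object. The class name HTMLStreamDestination and the use of html.escape are additions for this example and are not part of the original listing.

```python
import io
from html import escape


class NewsItem:
    def __init__(self, title, body):
        self.title = title
        self.body = body


class HTMLStreamDestination:
    """Hypothetical destination that writes to any file-like object."""

    def __init__(self, out):
        self.out = out

    def receiveItems(self, items):
        items = list(items)  # allow two passes even if given a generator
        out = self.out
        out.write("<html>\n<head>\n<title>Today's News</title>\n</head>\n<body>\n")
        out.write("<h1>Today's News</h1>\n<ul>\n")
        # First pass: a table of contents linking to numbered anchors.
        for number, item in enumerate(items, start=1):
            out.write('<li><a href="#%i">%s</a></li>\n'
                      % (number, escape(item.title)))
        out.write('</ul>\n')
        # Second pass: one anchored heading and preformatted body per item.
        for number, item in enumerate(items, start=1):
            out.write('<h2><a name="%i">%s</a></h2>\n'
                      % (number, escape(item.title)))
            out.write('<pre>%s</pre>\n' % escape(item.body))
        out.write('</body>\n</html>\n')


buffer = io.StringIO()
HTMLStreamDestination(buffer).receiveItems([NewsItem('A & B', 'x < y')])
page = buffer.getvalue()
print(page)
```

An instance of this class could be passed to addDestination just like the two destinations in the listing, since the agent only ever calls receiveItems(items).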

