I need to automatically collect data from a website: its article list and the actual content of each article in that list. The list gives the id of each article, and every article is served through a single unified interface (pass the article id as a parameter and it returns the corresponding JSON). Some of the data in that JSON needs to be collected and then analyzed.
Are there any reasonably mature frameworks or existing libraries ("wheels") that can meet these needs? It has to be multi-threaded and able to run stably 24/7, because the volume to collect is huge.
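To make the requirement concrete, here is a minimal sketch of the fetch step using only the Python standard library. The URL `https://example.com/api/article?id=...`, the function names `fetch_article` and `collect`, and the worker count are all hypothetical placeholders, not the real site or any particular framework's API.

```python
import json
from concurrent.futures import ThreadPoolExecutor, as_completed
from urllib.request import urlopen

# Hypothetical endpoint: the real site and parameter name are assumptions.
DETAIL_URL = "https://example.com/api/article?id={article_id}"


def fetch_article(article_id, timeout=10):
    """Fetch one article's JSON from the (assumed) unified detail interface."""
    url = DETAIL_URL.format(article_id=article_id)
    with urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def collect(article_ids, fetch=fetch_article, workers=8):
    """Fetch many articles concurrently; return (results, failed_ids)."""
    results, failed = {}, []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Map each future back to its article id so failures can be tracked.
        futures = {pool.submit(fetch, aid): aid for aid in article_ids}
        for fut in as_completed(futures):
            aid = futures[fut]
            try:
                results[aid] = fut.result()
            except Exception:
                failed.append(aid)  # keep the id so it can be retried later
    return results, failed
```

Recording failed ids instead of aborting is what I assume a 24/7 job would need, so that one bad response does not stop the whole run; retry and rate limiting are left out of this sketch.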
I would also like to ask how to store the collected content (millions to tens of millions of items). Some of the data is numeric and needs statistical analysis. Can I use MySQL? Or is there a more mature and simpler ready-made option?
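As a rough illustration of the storage side, the sketch below keeps one numeric field in its own column for statistics and stores the full JSON alongside it. It uses the standard-library `sqlite3` module purely as a stand-in for MySQL so that it runs without a server; the table name `articles` and the field `views` are hypothetical, since the real JSON layout is not shown here.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open the database and create the (hypothetical) articles table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS articles ("
        " id INTEGER PRIMARY KEY,"
        " views INTEGER,"      # assumed numeric field used for statistics
        " raw_json TEXT)"      # full payload kept for later re-parsing
    )
    return conn


def save_articles(conn, articles):
    """Insert or overwrite articles in one transaction, keyed by id."""
    rows = [(a["id"], a.get("views"), json.dumps(a)) for a in articles]
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO articles (id, views, raw_json)"
            " VALUES (?, ?, ?)",
            rows,
        )


def view_stats(conn):
    """Return (row count, average views, maximum views)."""
    return conn.execute(
        "SELECT COUNT(*), AVG(views), MAX(views) FROM articles"
    ).fetchone()
```

Using the article id as the primary key means re-collecting the same article overwrites the old row rather than duplicating it. The same table layout and aggregate query should carry over to MySQL with its own upsert syntax, though I have not verified performance at tens of millions of rows.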