I currently segment each article title into words; each title yields 3 words.
I created a separate tags table to store the segmented words, one row per word. To find related articles, I pick one of the article's tags at random and then search the tags table for other rows with the same tag. This worked fine when there was little data, but the tags table now holds more than 100 million rows and the lookup is extremely slow.
The tags table has only 2 fields, the article ID and the segmented word, and both are indexed. The table is also partitioned.
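For concreteness, a lookup of this kind can be sketched as follows. This is only an illustration under stated assumptions: the table name `tags`, the column names `article_id` and `word`, the index names, and the sample rows are all hypothetical, and an in-memory SQLite database stands in for the real one, which is not named here.

```python
import random
import sqlite3


def related_by_random_tag(conn, article_id, limit=10, rng=random):
    """Pick one of the article's words at random and return other articles sharing it."""
    words = [row[0] for row in conn.execute(
        "SELECT word FROM tags WHERE article_id = ?", (article_id,))]
    if not words:
        return []
    word = rng.choice(words)
    rows = conn.execute(
        "SELECT article_id FROM tags WHERE word = ? AND article_id != ? LIMIT ?",
        (word, article_id, limit))
    return [row[0] for row in rows]


# In-memory stand-in for the real table (hypothetical names and data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tags (article_id INTEGER, word TEXT)")
conn.execute("CREATE INDEX idx_tags_word ON tags (word)")
conn.execute("CREATE INDEX idx_tags_article ON tags (article_id)")
conn.executemany("INSERT INTO tags VALUES (?, ?)", [
    (1, "python"), (1, "index"), (1, "mysql"),
    (2, "python"), (2, "search"), (2, "tags"),
    (3, "mysql"), (3, "index"), (3, "partition"),
])

print(related_by_random_tag(conn, 1))
```

Because the word is chosen at random, the result varies between calls, and a very common word can match a large share of the table.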
Is there a better way to implement related articles?
Currently, 50,000 new records are added every day.
Relevance should be measured along several dimensions:
1. The section the article belongs to, such as entertainment
2. The central idea or theme of the article, which needs to be extracted
3. The time and the main subjects involved (people, events)
An article may have multiple main subjects and may be related to articles in other sections.
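The dimensions above could be combined into a single score, roughly as in the sketch below. Everything in it is an assumption made for illustration: the `Article` fields, the weights, the 30-day time horizon, and the use of Jaccard overlap for themes and subjects are hypothetical choices, and the third dimension is read here as "publication time plus main subjects". Extracting the themes and subjects is not covered.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Article:
    section: str          # e.g. "entertainment"
    themes: frozenset     # extracted central ideas or themes
    subjects: frozenset   # main subjects: people, events
    published: date


def jaccard(a, b):
    """Overlap of two sets, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def relevance(a, b, weights=(0.2, 0.4, 0.3, 0.1), horizon_days=30):
    """Weighted sum of section match, theme overlap, subject overlap, time closeness."""
    w_section, w_theme, w_subject, w_time = weights
    same_section = 1.0 if a.section == b.section else 0.0
    gap = abs((a.published - b.published).days)
    time_closeness = max(0.0, 1.0 - gap / horizon_days)
    return (w_section * same_section
            + w_theme * jaccard(a.themes, b.themes)
            + w_subject * jaccard(a.subjects, b.subjects)
            + w_time * time_closeness)


# Two articles in different sections that share a main subject.
a = Article("entertainment", frozenset({"film festival"}),
            frozenset({"Actor A"}), date(2024, 1, 1))
b = Article("business", frozenset({"box office"}),
            frozenset({"Actor A"}), date(2024, 1, 4))
print(round(relevance(a, b), 2))
```

Since the section is only one weighted term, two articles in different sections can still score well when they share subjects, which matches the cross-section case mentioned above.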