Counting word occurrences in Python
For word frequency statistics, a dictionary is the most natural data type. Each word serves as a key, and the number of times the word appears serves as its value, which makes it easy to record the frequency of every word. A dictionary works much like a phone book, where each name is associated with a phone number.
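As a minimal sketch of this idea, the snippet below counts the words in a short hard-coded sentence. The sentence and the variable names are made up for illustration only.

```python
# Hypothetical sample text, used only for illustration.
text = "the quick fox jumps over the lazy dog the end"

frequency = {}
for word in text.split():
    # dict.get returns 0 when the word has not been seen yet.
    frequency[word] = frequency.get(word, 0) + 1

print(frequency["the"])  # 3
print(frequency["fox"])  # 1
```

Using `get` with a default of 0 avoids a separate check for whether the key already exists.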
The following code reads the words from the file importthis.txt and reports the 5 words that occur most often.
# -*- coding:utf-8 -*-
import io
import re


class Counter:
    def __init__(self, path):
        """
        :param path: path to the file
        """
        self.mapping = dict()
        with io.open(path, encoding="utf-8") as f:
            data = f.read()
            # Split the text into lowercase words.
            words = [s.lower() for s in re.findall(r"\w+", data)]
            for word in words:
                # Increment the count, starting from 0 for a new word.
                self.mapping[word] = self.mapping.get(word, 0) + 1

    def most_common(self, n):
        assert n > 0, "n should be larger than 0"
        # Sort by count in descending order and keep the first n pairs.
        return sorted(self.mapping.items(), key=lambda item: item[1], reverse=True)[:n]


if __name__ == '__main__':
    most_common_5 = Counter("importthis.txt").most_common(5)
    for item in most_common_5:
        print(item)
Output:
('is', 10)
('better', 8)
('than', 8)
('the', 6)
('to', 5)
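The standard library also provides `collections.Counter`, which offers a `most_common` method of its own. The sketch below shows that alternative rather than the article's hand-written class. The helper name `top_words` and the sample string are assumptions introduced for illustration.

```python
import re
from collections import Counter


def top_words(text, n):
    """Return the n most frequent lowercase words in text (hypothetical helper)."""
    words = [w.lower() for w in re.findall(r"\w+", text)]
    # Counter builds the word-to-count mapping; most_common sorts it by count.
    return Counter(words).most_common(n)


# Sample text made up for illustration.
sample = "Flat is better than nested. Sparse is better than dense."
for item in top_words(sample, 3):
    print(item)
```

Because the class in this article shares the name `Counter` with the standard library's, using both in one module would require an alias for one of them.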