
How Python uses itertools.groupby() to group records according to fields

不言, reposted 2018-10-22 17:17:59, 2507 views

This article shows how to use itertools.groupby() in Python to group records by a field.

1. Requirements

We have a sequence of dictionaries or object instances and want to iterate over the data in groups, according to the value of a specific field.

2. Solution

The itertools.groupby() function is particularly useful for grouping data in this way.

Example:

from operator import itemgetter
from itertools import groupby

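rows = [
    {'name': 'mark', 'age': 18, 'uid': '110'},
    {'name': 'miaomiao', 'age': 28, 'uid': '160'},
    {'name': 'miaomiao2', 'age': 28, 'uid': '150'},
    {'name': 'xiaohei', 'age': 38, 'uid': '130'},
]

# First sort by age, because groupby() only groups adjacent records
rows.sort(key=itemgetter('age'))

# Iterate over the groups: each step yields an age and an iterator of its rows
for age, items in groupby(rows, key=itemgetter('age')):
    print(age)
    for i in items:
        print(i)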

Result:

18
{'name': 'mark', 'age': 18, 'uid': '110'}
28
{'name': 'miaomiao', 'age': 28, 'uid': '160'}
{'name': 'miaomiao2', 'age': 28, 'uid': '150'}
38
{'name': 'xiaohei', 'age': 38, 'uid': '130'}
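
The requirement also mentions object instances. For those, operator.attrgetter() can take the place of itemgetter(). The following is a minimal sketch, assuming a hypothetical User class that is not part of the original example:

```python
from dataclasses import dataclass
from itertools import groupby
from operator import attrgetter


@dataclass
class User:  # hypothetical class, for illustration only
    name: str
    age: int
    uid: str


users = [
    User('xiaohei', 38, '130'),
    User('miaomiao', 28, '160'),
    User('mark', 18, '110'),
    User('miaomiao2', 28, '150'),
]

# groupby() needs equal keys to be adjacent, so sort by the same key first
users.sort(key=attrgetter('age'))

# Collect the names in each age group into a plain dictionary
names_by_age = {
    age: [user.name for user in group]
    for age, group in groupby(users, key=attrgetter('age'))
}
print(names_by_age)
```

Because list.sort() is stable, users with the same age keep their original relative order inside each group.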

3. Analysis


The groupby() function scans a sequence and finds consecutive runs of items with the same value (or the same value returned by the function passed as the key argument). groupby() returns an iterator; each iteration yields a value together with a sub-iterator that produces all the items in the group sharing that value.
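
One consequence of the sub-iterator design is worth illustrating: each sub-iterator reads from the same underlying iterator as groupby() itself, so it stops being usable once groupby() moves on to the next group. The sketch below uses a shortened copy of the rows (already sorted by age) and shows that copying each group into a list is the way to keep it:

```python
from itertools import groupby
from operator import itemgetter

rows = [
    {'name': 'mark', 'age': 18},
    {'name': 'miaomiao', 'age': 28},
    {'name': 'miaomiao2', 'age': 28},
]  # already sorted by age

grouped = groupby(rows, key=itemgetter('age'))
first_age, first_items = next(grouped)
second_age, second_items = next(grouped)  # advancing abandons the first group

leftover = list(first_items)                       # nothing left to read
second_names = [row['name'] for row in second_items]
print(leftover)
print(second_names)

# To keep every group, copy each sub-iterator into a list as it is produced
groups = {age: list(items) for age, items in groupby(rows, key=itemgetter('age'))}
print(groups[28])
```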

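The important step here is sorting the data by age first. groupby() does not sort; it only examines consecutive items, so records with the same age that are not adjacent would end up in separate groups.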
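
A short sketch of what happens when the sort is skipped, using a deliberately shuffled and shortened copy of the rows (the ordering is an assumption made for illustration):

```python
from itertools import groupby
from operator import itemgetter

by_age = itemgetter('age')

rows = [
    {'name': 'miaomiao', 'age': 28},
    {'name': 'mark', 'age': 18},
    {'name': 'miaomiao2', 'age': 28},
]

# Without sorting, the two age-28 records are not adjacent,
# so groupby() reports the key 28 twice
unsorted_keys = [age for age, _ in groupby(rows, key=by_age)]

# After sorting by the same key, each age appears exactly once
sorted_keys = [age for age, _ in groupby(sorted(rows, key=by_age), key=by_age)]

print(unsorted_keys)  # [28, 18, 28]
print(sorted_keys)    # [18, 28]
```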

If you simply want to group the data by age and keep it in a larger data structure that allows random access, it may be better to use defaultdict() to build a one-key multi-value dictionary:

from collections import defaultdict

rows = [
    {'name': 'mark', 'age': 18, 'uid': '110'},
    {'name': 'miaomiao', 'age': 28, 'uid': '160'},
    {'name': 'miaomiao2', 'age': 28, 'uid': '150'},
    {'name': 'xiaohei', 'age': 38, 'uid': '130'},
]

# Map each age to the list of rows with that age (no sorting required)
rows_by_age = defaultdict(list)
for row in rows:
    rows_by_age[row['age']].append(row)

# Look up one group directly by its key
for a in rows_by_age[28]:
    print(a)

Result:

{'name': 'miaomiao', 'age': 28, 'uid': '160'}
{'name': 'miaomiao2', 'age': 28, 'uid': '150'}

Because the defaultdict approach does not need the records to be sorted first, it is generally faster than sorting and then using groupby() when the order of the groups does not matter.
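
One way to check this on your own data is a rough timing comparison. The sketch below is only illustrative: the helper names group_with_defaultdict and group_with_groupby, the synthetic rows, and the repetition count are all assumptions, and the measured times will vary by machine and data size.

```python
import random
import timeit
from collections import defaultdict
from itertools import groupby
from operator import itemgetter


def group_with_defaultdict(rows):
    """Group rows by age in a single pass, without sorting."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row['age']].append(row)
    return dict(grouped)


def group_with_groupby(rows):
    """Group rows by age by sorting first and then using groupby()."""
    by_age = itemgetter('age')
    return {
        age: list(items)
        for age, items in groupby(sorted(rows, key=by_age), key=by_age)
    }


# Synthetic test data (hypothetical, for illustration only)
random.seed(0)
rows = [{'uid': str(i), 'age': random.randint(18, 60)} for i in range(10_000)]

# Both approaches produce the same groups
same_groups = group_with_defaultdict(rows) == group_with_groupby(rows)

t_dict = timeit.timeit(lambda: group_with_defaultdict(rows), number=20)
t_groupby = timeit.timeit(lambda: group_with_groupby(rows), number=20)
print(f'same groups: {same_groups}')
print(f'defaultdict: {t_dict:.4f}s, sort + groupby: {t_groupby:.4f}s')
```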


Statement：

This article is reproduced from segmentfault.com. If there is any infringement, please contact admin@php.cn to have it removed.
