
How to Apply Multiple Functions to Multiple Columns in Pandas GroupBy?

Barbara Streisand
Release: 2024-12-08 05:53:10


How to Apply Multiple Functions to Multiple Grouped Columns

A pandas groupby splits a DataFrame into groups by one or more key columns and aggregates each group. A single aggregation is often not enough: you may need to apply several functions to several columns of the same grouped data.

Using a Dictionary for Series Group-bys

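For a Series groupby object, older versions of pandas accepted a dictionary that mapped output column names to functions. That form was deprecated in pandas 0.20 and removed in pandas 1.0, where it raises a SpecificationError. The current equivalent is named aggregation, which passes the output names as keyword arguments:

# Formerly: grouped['D'].agg({'result1': np.sum, 'result2': np.mean})
grouped['D'].agg(result1='sum', result2='mean')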

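A dictionary passed to agg on a DataFrame groupby object is interpreted differently: its keys must be column names, and each value is the function or list of functions to apply to that column. In addition, agg works on one column at a time, so it cannot compute a result that combines several columns.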
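As a rough illustration of column-keyed aggregation on a DataFrame groupby, the sketch below uses made-up data. The frame, the column names C and D, and the output names c_sum, c_max and d_mean are assumptions for this example only.

```python
import pandas as pd

# Hypothetical sample data; names and values are assumptions for illustration.
df = pd.DataFrame({
    "group": ["x", "x", "y", "y"],
    "C": [1.0, 2.0, 3.0, 4.0],
    "D": [10.0, 20.0, 30.0, 40.0],
})
grouped = df.groupby("group")

# Dictionary keys name the columns; values give the function(s) to apply.
# The result has MultiIndex columns such as ("C", "sum") and ("D", "mean").
by_column = grouped.agg({"C": ["sum", "max"], "D": "mean"})

# Named aggregation: each keyword is an output column name,
# and each value is a (source column, function) pair.
named = grouped.agg(
    c_sum=("C", "sum"),
    c_max=("C", "max"),
    d_mean=("D", "mean"),
)

print(by_column)
print(named)
```

Both forms still apply each function to a single column, which is why a cross-column result such as a sum of products needs a different tool.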

Custom Functions with Apply

To work around this limitation, use the apply method, which passes each group to your function as a DataFrame. Because the function sees every column of the group, it can run several operations on several columns and combine columns with each other. If the function returns a Series, the Series index becomes the columns of the result; a Series with a MultiIndex produces MultiIndex columns.

Returning a Series:

def f(x):
    d = {}
    d['a_sum'] = x['a'].sum()
    d['a_max'] = x['a'].max()
    d['b_mean'] = x['b'].mean()
    d['c_d_prodsum'] = (x['c'] * x['d']).sum()
    return pd.Series(d, index=['a_sum', 'a_max', 'b_mean', 'c_d_prodsum'])

df.groupby('group').apply(f)
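To see what the Series-returning function produces, here is a minimal sketch on made-up data. The sample values are assumptions chosen for illustration; the column names match those used in f. Selecting the value columns before calling apply is a choice made here to keep the grouping column out of the frame passed to f.

```python
import pandas as pd

def f(x):
    # x is the sub-DataFrame for one group.
    d = {}
    d['a_sum'] = x['a'].sum()
    d['a_max'] = x['a'].max()
    d['b_mean'] = x['b'].mean()
    d['c_d_prodsum'] = (x['c'] * x['d']).sum()
    return pd.Series(d, index=['a_sum', 'a_max', 'b_mean', 'c_d_prodsum'])

# Hypothetical sample data (values are assumptions for illustration).
df = pd.DataFrame({
    'group': ['x', 'x', 'y'],
    'a': [1, 2, 3],
    'b': [4.0, 6.0, 8.0],
    'c': [1, 2, 3],
    'd': [10, 20, 30],
})

# One row per group, one column per entry of the returned Series.
result = df.groupby('group')[['a', 'b', 'c', 'd']].apply(f)
print(result)
```

For group x, c_d_prodsum is 1*10 + 2*20 = 50, a value that column-wise agg cannot produce because it depends on two columns at once.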

Returning a Series with MultiIndex:

def f_mi(x):
    d = []
    d.append(x['a'].sum())
    d.append(x['a'].max())
    d.append(x['b'].mean())
    d.append((x['c'] * x['d']).sum())
    return pd.Series(d, index=[['a', 'a', 'b', 'c_d'],
                               ['sum', 'max', 'mean', 'prodsum']])

df.groupby('group').apply(f_mi)
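The MultiIndex variant can be tried the same way. The sketch below again uses made-up data (an assumption for illustration) and also shows one possible way to flatten the two-level columns into single names afterwards; the flattening step is an addition here, not part of the original recipe.

```python
import pandas as pd

def f_mi(x):
    # Collect the values in the same order as the index levels below.
    d = [
        x['a'].sum(),
        x['a'].max(),
        x['b'].mean(),
        (x['c'] * x['d']).sum(),
    ]
    return pd.Series(d, index=[['a', 'a', 'b', 'c_d'],
                               ['sum', 'max', 'mean', 'prodsum']])

# Hypothetical sample data (values are assumptions for illustration).
df = pd.DataFrame({
    'group': ['x', 'x', 'y'],
    'a': [1, 2, 3],
    'b': [4.0, 6.0, 8.0],
    'c': [1, 2, 3],
    'd': [10, 20, 30],
})

# Columns of the result are (source, function) pairs such as ('a', 'sum').
result_mi = df.groupby('group')[['a', 'b', 'c', 'd']].apply(f_mi)
print(result_mi[('a', 'sum')])

# Optional: join the two levels into flat names like 'a_sum'.
flat = result_mi.copy()
flat.columns = ['_'.join(pair) for pair in flat.columns]
print(flat)
```

The two-level columns are convenient for selecting everything computed from one source column, for example result_mi['a'], while the flat names are easier to export.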

This approach provides a flexible way to perform complex aggregations on grouped data, allowing for multiple operations on multiple columns within each group.


source:php.cn