Grouping DataFrame Rows into Lists in Pandas GroupBy
Many datasets repeat the same key across several rows. To extract meaningful insights, it is often necessary to group rows by that common attribute so the values within each group can be aggregated or manipulated together. In this article, we will explore how to collect the rows of each group into a list using Pandas groupby.
Consider a dataframe with two columns, 'a' and 'b':
a  b
A  1
A  2
B  5
B  5
B  4
C  6
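To follow along, the example dataframe can be built as in this minimal sketch. The variable name df is chosen to match the snippets later in the article; the column values are copied from the table above.

```python
import pandas as pd

# Build the two-column example dataframe shown in the table above.
df = pd.DataFrame({
    "a": ["A", "A", "B", "B", "B", "C"],
    "b": [1, 2, 5, 5, 4, 6],
})

print(df)
```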
The goal is to group the rows by the first column ('a') and create a list of the values in the second column ('b') for each group. The desired output is:
A  [1, 2]
B  [5, 5, 4]
C  [6]
To achieve this, we can use Pandas' groupby and apply methods. groupby groups the rows by the specified column, selecting ['b'] restricts the operation to that column, and apply runs a function on each group. In this case, we pass Python's built-in list, which turns the values of each group into a list.
df.groupby('a')['b'].apply(list)
This code will return a Series object containing the lists of values for each group:
a
A       [1, 2]
B    [5, 5, 4]
C          [6]
Name: b, dtype: object
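Put together, the grouping step might look like the following self-contained sketch. The names grouped and grouped_agg are illustrative, not from the original. The agg(list) variant is shown as an alternative that should give the same result on recent pandas versions.

```python
import pandas as pd

df = pd.DataFrame({
    "a": ["A", "A", "B", "B", "B", "C"],
    "b": [1, 2, 5, 5, 4, 6],
})

# Group by 'a', select column 'b', and turn each group's values into a list.
grouped = df.groupby("a")["b"].apply(list)

# Alternative spelling using agg instead of apply.
grouped_agg = df.groupby("a")["b"].agg(list)

print(grouped)
print(grouped_agg)
```

Both results are Series indexed by the group keys, so a single group's list can be looked up with, for example, grouped["B"].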
To get a dataframe instead of a Series, we can call the reset_index method, which moves the group keys from the index into a regular column. Its name argument sets the name of the column that holds the lists:
df1 = df.groupby('a')['b'].apply(list).reset_index(name='new')
The resulting dataframe will look like this:
   a        new
0  A     [1, 2]
1  B  [5, 5, 4]
2  C        [6]
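As a quick check, the sketch below rebuilds the example and inspects the result. It assumes the same sample data as above. The point is that each cell in the new column is an ordinary Python list, so the usual list operations apply.

```python
import pandas as pd

df = pd.DataFrame({
    "a": ["A", "A", "B", "B", "B", "C"],
    "b": [1, 2, 5, 5, 4, 6],
})

# Group, collect each group's 'b' values into a list, then flatten the
# index back into a column and name the list column 'new'.
df1 = df.groupby("a")["b"].apply(list).reset_index(name="new")

print(df1)

# Each cell of 'new' is a plain Python list.
first_cell = df1.loc[0, "new"]
print(type(first_cell), len(first_cell))
```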