This tutorial shows how to count the occurrences of each combination of two columns in a Pandas DataFrame and then find the maximum count for each unique value of one of those columns.
Consider the following Pandas DataFrame df:
<code class="python">import pandas as pd

# Each inner list is one column; .T transposes the rows into columns.
df = pd.DataFrame([
    [1.1, 1.1, 1.1, 2.6, 2.5, 3.4, 2.6, 2.6, 3.4, 3.4, 2.6, 1.1, 1.1, 3.3],
    list('AAABBBBABCBDDD'),
    [1.1, 1.7, 2.5, 2.6, 3.3, 3.8, 4.0, 4.2, 4.3, 4.5, 4.6, 4.7, 4.7, 4.8],
    ['x/y/z', 'x/y', 'x/y/z/n', 'x/u', 'x', 'x/u/v', 'x/y/z', 'x', 'x/u/v/b', '-', 'x/y', 'x/y/z', 'x', 'x/u/v/w'],
    ['1', '3', '3', '2', '4', '2', '5', '3', '6', '3', '5', '1', '1', '1']
]).T
df.columns = ['col1', 'col2', 'col3', 'col4', 'col5']</code>
To count the occurrences of each unique combination of col5 and col2 in df, call the groupby method on both columns and then the size method:
<code class="python">df.groupby(['col5', 'col2']).size()</code>
The output will be:
col5  col2
1     A       1
      D       3
2     B       2
etc...
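If a flat table is easier to work with than a Series with a two-level index, the counts can be converted with reset_index. The following is a minimal sketch: it rebuilds only col2 and col5 from the DataFrame above, and the column name count is an arbitrary choice made for this illustration.

```python
import pandas as pd

# Only the two grouping columns of the example DataFrame are rebuilt here.
df = pd.DataFrame({
    'col2': list('AAABBBBABCBDDD'),
    'col5': ['1', '3', '3', '2', '4', '2', '5', '3', '6', '3', '5', '1', '1', '1'],
})

# size() returns a Series indexed by (col5, col2); reset_index moves both
# index levels back into columns and names the value column 'count'
# (the name is an assumption for this sketch, not part of the original).
counts = df.groupby(['col5', 'col2']).size().reset_index(name='count')
print(counts)
```

The result has one row per (col5, col2) combination, which makes it straightforward to sort, filter, or merge the counts afterwards.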
To determine the maximum count for each col2 value, group the resulting Series again on its second index level (level=1, which is col2, because index levels are numbered from zero) and take the maximum of each group:
<code class="python">df.groupby(['col5', 'col2']).size().groupby(level=1).max()</code>
This will produce the output:
col2
A    3
B    2
C    1
D    3
dtype: int64
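The maximum alone does not say which col5 value produced it. One possible way to keep that information, sketched here under the assumption that a flat table of counts is acceptable, is to select the row with the largest count per col2 group using idxmax. The names counts, top, and count are chosen for this illustration only.

```python
import pandas as pd

# Only the two grouping columns of the example DataFrame are rebuilt here.
df = pd.DataFrame({
    'col2': list('AAABBBBABCBDDD'),
    'col5': ['1', '3', '3', '2', '4', '2', '5', '3', '6', '3', '5', '1', '1', '1'],
})

# One row per (col5, col2) combination with its number of occurrences.
counts = df.groupby(['col5', 'col2']).size().reset_index(name='count')

# idxmax returns, for each col2 group, the row label of the largest count.
# When several rows tie for the maximum, only the first one is kept.
top = counts.loc[counts.groupby('col2')['count'].idxmax()]
print(top)
```

Because idxmax keeps a single row per group, ties are resolved silently; if every tied col5 value is needed, comparing count against the per-group maximum would be a more suitable approach.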