Groupby and Average in Pandas
Problem:
Given a DataFrame with 'cluster', 'org', and 'time' columns, how can you calculate the average 'time' per 'org' within each 'cluster', and then average those per-'org' means for each 'cluster'?
Expectation:
cluster | mean(time) |
---|---|
1 | 15 |
2 | 54 |
3 | 6 |
Solution:
To achieve the desired result, you can use the following steps:
Groupby ['cluster', 'org'] and Take Mean:
mean_by_cluster_org = df.groupby(['cluster', 'org'], as_index=False)['time'].mean()
Groupby ['cluster'] and Calculate Average:
cluster_average = mean_by_cluster_org.groupby('cluster')['time'].mean()
Display Results:
print(cluster_average)
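The steps above can be sketched end to end as follows. The sample rows are an assumption for illustration only; they are chosen so that the result reproduces the expectation table, and they are not taken from the original question.

```python
import pandas as pd

# Hypothetical sample data, picked so the result matches the expectation table.
df = pd.DataFrame({
    "cluster": [1, 1, 2, 1, 2, 3],
    "org": ["a", "a", "h", "c", "d", "w"],
    "time": [8, 6, 34, 23, 74, 6],
})

# Step 1: average 'time' for each ('cluster', 'org') pair.
# as_index=False keeps 'cluster' and 'org' as ordinary columns.
mean_by_cluster_org = df.groupby(["cluster", "org"], as_index=False)["time"].mean()

# Step 2: average the per-org means within each cluster.
cluster_average = mean_by_cluster_org.groupby("cluster")["time"].mean()

print(cluster_average)
# cluster 1 -> 15.0, cluster 2 -> 54.0, cluster 3 -> 6.0
```

In cluster 1, org 'a' averages (8 + 6) / 2 = 7 and org 'c' averages 23, so the cluster value is (7 + 23) / 2 = 15.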
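Two shorter calls look similar, but neither is equivalent to the two-step solution:
Option 1: Groupby Only ['cluster'] and Take Mean:
cluster_only_average = df.groupby('cluster')['time'].mean()
This averages every row in a cluster directly, so an 'org' with more rows carries more weight. It matches the two-step result only when every 'org' in a cluster has the same number of rows.
Option 2: Groupby ['cluster', 'org'] and Use Mean:
cluster_org_mean = df.groupby(['cluster', 'org'])['time'].mean()
This returns the average 'time' per 'org' within each 'cluster', which is only the first step; a second groupby on 'cluster' is still needed to get one value per cluster.
Only the two-step approach returns the average of the per-'org' averages for each 'cluster'. Selecting the 'time' column before calling mean() also avoids an error in recent pandas versions, where mean() no longer silently drops non-numeric columns such as 'org'.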
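To see how a single groupby on 'cluster' can differ from the average of per-'org' averages, here is a small comparison sketch. The data is hypothetical, and the chained form with groupby(level='cluster') is just one possible way to write the two-step computation.

```python
import pandas as pd

# Hypothetical sample data: org 'a' has two rows in cluster 1, org 'c' has one.
df = pd.DataFrame({
    "cluster": [1, 1, 2, 1, 2, 3],
    "org": ["a", "a", "h", "c", "d", "w"],
    "time": [8, 6, 34, 23, 74, 6],
})

# Plain per-cluster mean: every row counts equally.
plain_mean = df.groupby("cluster")["time"].mean()

# Mean of per-org means: every org counts equally within its cluster.
mean_of_means = (
    df.groupby(["cluster", "org"])["time"].mean()
      .groupby(level="cluster").mean()
)

comparison = pd.DataFrame({
    "plain_mean": plain_mean,
    "mean_of_means": mean_of_means,
})
print(comparison)
```

For cluster 1 the plain mean is (8 + 6 + 23) / 3, roughly 12.33, while the mean of means is 15. Clusters 2 and 3 agree under both methods because each 'org' there contributes exactly one row.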