How to Calculate the Average Time per Organization Within Each Cluster in a Pandas DataFrame?

Susan Sarandon

Release: 2024-11-14 20:49:02

Performing Grouped Aggregation and Average Calculations

Consider the following DataFrame with data on cluster, organization, and time:

   cluster org  time
0        1   a     8
1        1   a     6
2        2   h    34
3        1   c    23
4        2   d    74
5        3   w     6

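The objective is to calculate the average time per organization within each cluster: first average the times of each organization, then average those per-organization values within the cluster. The expected result should resemble: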

cluster  mean(time)
1        15 #=((8 + 6) / 2 + 23) / 2
2        54 #=(74 + 34) / 2
3        6

Solution Using Double GroupBy and Mean Calculations:

To achieve this, chain two groupby operations:

1. Initial GroupBy: Group the data by both 'cluster' and 'org' using groupby(['cluster', 'org']).
2. Intermediate Aggregate: Calculate the mean of time within each group using mean(), which yields one value per organization in each cluster.
3. Secondary GroupBy: Group that result by 'cluster' using groupby('cluster').
4. Final Aggregate: Take the mean of the per-organization values using mean(), so that each organization counts once regardless of how many rows it has.

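# Step 1: average time for each (cluster, org) pair
cluster_org_time = df.groupby(['cluster', 'org'], as_index=False).mean()
# Step 2: average those per-organization means within each cluster
result = cluster_org_time.groupby('cluster')['time'].mean()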
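The two steps can be checked end to end with a minimal, self-contained sketch. The DataFrame is rebuilt by hand here, and the cluster labels 1, 2 and 3 are an assumption chosen to match the expected averages stated earlier.

```python
import pandas as pd

# Sample data; the cluster labels are assumed so that the
# per-cluster averages match the expected result (15, 54, 6).
df = pd.DataFrame({
    "cluster": [1, 1, 2, 1, 2, 3],
    "org": ["a", "a", "h", "c", "d", "w"],
    "time": [8, 6, 34, 23, 74, 6],
})

# Step 1: one mean per (cluster, org) pair.
cluster_org_time = df.groupby(["cluster", "org"], as_index=False)["time"].mean()

# Step 2: average those per-organization means within each cluster.
result = cluster_org_time.groupby("cluster")["time"].mean()

print(result)
# cluster
# 1    15.0
# 2    54.0
# 3     6.0
# Name: time, dtype: float64
```

Selecting the 'time' column explicitly in step 1 is optional for this data, but it keeps the intent clear if the DataFrame later gains other columns.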

Alternative Solution for Clustered Group Averages:

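If you only need the plain average of all rows in each cluster, group by 'cluster' alone. This weights every row equally, so it does not reproduce the expected result above: cluster 1 gives (8 + 6 + 23) / 3 ≈ 12.33 rather than 15. Select the 'time' column before calling mean(), because pandas 2.0 and later raise a TypeError when asked to average the string column 'org'.

cluster_mean_time = df.groupby('cluster')['time'].mean()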
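To see how the single groupby differs from the two-stage calculation, the sketch below computes both side by side. The sample data and its cluster labels are assumptions chosen to match the expected averages stated earlier, and the column names row_weighted and org_weighted are made up for this illustration.

```python
import pandas as pd

# Sample data; cluster labels are assumed for illustration.
df = pd.DataFrame({
    "cluster": [1, 1, 2, 1, 2, 3],
    "org": ["a", "a", "h", "c", "d", "w"],
    "time": [8, 6, 34, 23, 74, 6],
})

# Single groupby: every row counts equally.
row_weighted = df.groupby("cluster")["time"].mean()

# Two-stage groupby: every organization counts equally.
per_org = df.groupby(["cluster", "org"], as_index=False)["time"].mean()
org_weighted = per_org.groupby("cluster")["time"].mean()

# Put both results next to each other, aligned on the cluster index.
comparison = pd.DataFrame({
    "row_weighted": row_weighted,
    "org_weighted": org_weighted,
})
print(comparison)
```

The two columns differ only for cluster 1, where organization 'a' contributes two rows; clusters whose organizations each have a single row give the same value either way.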

Additional Option for GroupBy with org and Mean Calculation:

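To stop at the intermediate step and keep one average per organization within each cluster, group by ['cluster', 'org'] and take the mean directly. The result is indexed by (cluster, org) pairs: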

cluster_org_mean_time = df.groupby(['cluster', 'org']).mean()
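As a rough illustration of working with that (cluster, org) index, the sketch below looks up a single pair and then finishes the calculation by grouping on the 'cluster' index level. The sample data and its cluster labels are assumptions chosen to match the expected averages stated earlier, and per_cluster is a name introduced only for this example.

```python
import pandas as pd

# Sample data; cluster labels are assumed for illustration.
df = pd.DataFrame({
    "cluster": [1, 1, 2, 1, 2, 3],
    "org": ["a", "a", "h", "c", "d", "w"],
    "time": [8, 6, 34, 23, 74, 6],
})

# Series indexed by a (cluster, org) MultiIndex.
cluster_org_mean_time = df.groupby(["cluster", "org"])["time"].mean()

# Look up one organization inside one cluster.
print(cluster_org_mean_time.loc[(1, "a")])  # 7.0, i.e. (8 + 6) / 2

# Group on the 'cluster' index level to get the per-cluster average
# of the per-organization means, without resetting the index.
per_cluster = cluster_org_mean_time.groupby(level="cluster").mean()
print(per_cluster)
```

Grouping by index level avoids the as_index=False round trip; which form reads better is largely a matter of taste.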
