
How to Calculate the Average Time per Organization Within Each Cluster in a Pandas DataFrame?

Susan Sarandon
Release: 2024-11-14 20:49:02


Performing Grouped Aggregation and Average Calculations

Consider the following DataFrame with data on cluster, organization, and time:

   cluster org  time
0        1   a     8
1        1   a     6
2        2   h    34
3        1   c    23
4        2   d    74
5        3   w     6
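To follow along, the sample data can be built as a DataFrame. This is a minimal sketch: the cluster labels (1, 2, 3) are an assumption inferred from the arithmetic in the expected result, and the variable name `df` is simply the one the later snippets use.

```python
import pandas as pd

# Sample data. The cluster labels are an assumption inferred from the
# expected result: orgs a and c in cluster 1, h and d in 2, w in 3.
df = pd.DataFrame({
    "cluster": [1, 1, 2, 1, 2, 3],
    "org": ["a", "a", "h", "c", "d", "w"],
    "time": [8, 6, 34, 23, 74, 6],
})

print(df)
```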

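The objective is to calculate the average time per organization within each cluster: first average time within each organization, then average those per-organization means within each cluster. The expected result should resemble: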

cluster  mean(time)
1        15  # = ((8 + 6) / 2 + 23) / 2
2        54  # = (74 + 34) / 2
3         6

Solution Using Double GroupBy and Mean Calculations:

To achieve this, apply Pandas' groupby twice:

  1. Initial GroupBy: Group the data by both 'cluster' and 'org' using groupby(['cluster', 'org']).
  2. Intermediate Aggregate: Calculate the mean of time within each group using mean().
  3. Secondary GroupBy: Further group the resulting DataFrame by 'cluster' using groupby('cluster').
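  4. Final Aggregate: Compute the mean of time for each cluster using mean().

# Steps 1-2: mean time for each (cluster, org) pair
cluster_org_time = df.groupby(['cluster', 'org'], as_index=False).mean()
# Steps 3-4: mean of the per-organization means within each cluster
result = cluster_org_time.groupby('cluster')['time'].mean()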
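The four steps can be checked end to end on the sample data. The sketch below rebuilds the DataFrame (with cluster labels assumed from the expected result) and selects the time column explicitly; the name `per_org` is only illustrative.

```python
import pandas as pd

# Sample data; cluster labels are assumed from the expected result.
df = pd.DataFrame({
    "cluster": [1, 1, 2, 1, 2, 3],
    "org": ["a", "a", "h", "c", "d", "w"],
    "time": [8, 6, 34, 23, 74, 6],
})

# Steps 1-2: one row per (cluster, org) pair holding that pair's mean time.
per_org = df.groupby(["cluster", "org"], as_index=False)["time"].mean()

# Steps 3-4: average the per-organization means within each cluster.
result = per_org.groupby("cluster")["time"].mean()

print(per_org)
print(result)
```

On this data the result matches the expected values of 15, 54, and 6 for clusters 1, 2, and 3.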

Alternative Solution for Clustered Group Averages:

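If you only need the plain average of time over all rows in each cluster, group by 'cluster' alone and take the mean of the time column. Note that this weights every row equally rather than every organization, so cluster 1 yields (8 + 6 + 23) / 3, about 12.33, instead of 15. Selecting 'time' explicitly also avoids the TypeError that recent pandas versions raise when mean() meets the non-numeric org column.

cluster_mean_time = df.groupby('cluster')['time'].mean()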
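The difference between the single groupby and the two-step approach is easiest to see side by side. This is a small sketch on the sample data (cluster labels assumed from the expected result); the names `row_weighted` and `org_weighted` are illustrative only.

```python
import pandas as pd

# Sample data; cluster labels are assumed from the expected result.
df = pd.DataFrame({
    "cluster": [1, 1, 2, 1, 2, 3],
    "org": ["a", "a", "h", "c", "d", "w"],
    "time": [8, 6, 34, 23, 74, 6],
})

# Single groupby: every row counts equally within its cluster.
row_weighted = df.groupby("cluster")["time"].mean()

# Two-step groupby: every organization counts equally within its cluster.
org_weighted = (
    df.groupby(["cluster", "org"])["time"].mean()
      .groupby(level="cluster").mean()
)

comparison = pd.DataFrame({"row_weighted": row_weighted,
                           "org_weighted": org_weighted})
print(comparison)
```

The two columns differ only for cluster 1, where organization a contributes two rows; clusters 2 and 3 have one row per organization, so both approaches agree there.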

Additional Option for GroupBy with org and Mean Calculation:

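Alternatively, if the per-organization averages themselves are what you need, group by ['cluster', 'org'] and calculate the mean directly. The result is indexed by (cluster, org) and corresponds to the intermediate step of the double groupby: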

cluster_org_mean_time = df.groupby(['cluster', 'org']).mean()
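Because this grouping keeps cluster and org in the index, the per-cluster average can also be obtained by grouping again on the cluster index level. The following is a sketch of that variant, not necessarily the original author's method; the sample data uses cluster labels assumed from the expected result.

```python
import pandas as pd

# Sample data; cluster labels are assumed from the expected result.
df = pd.DataFrame({
    "cluster": [1, 1, 2, 1, 2, 3],
    "org": ["a", "a", "h", "c", "d", "w"],
    "time": [8, 6, 34, 23, 74, 6],
})

# Series indexed by a (cluster, org) MultiIndex.
cluster_org_mean_time = df.groupby(["cluster", "org"])["time"].mean()

# Group on the 'cluster' index level to average the per-organization means.
cluster_mean_of_org_means = cluster_org_mean_time.groupby(level="cluster").mean()

print(cluster_org_mean_time)
print(cluster_mean_of_org_means)
```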
