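Reshaping tabular data is an essential task in data analysis. Pivoting reshapes a dataframe from long to wide format by turning the distinct values of one column into new columns, which is useful for building pivot tables and viewing the same data from different perspectives. Let's explore how to perform this operation in Pandas, a powerful data manipulation library.
To pivot a dataframe, use the .pivot method. It takes three main arguments: index (the column or columns whose values become the row labels), columns (the column whose values become the new column headers), and values (the column whose entries fill the cells).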
For example, consider the following dataframe:
Indicator  Country  Year  Value
        1   Angola  2005      6
        2   Angola  2005     13
        3   Angola  2005     10
        4   Angola  2005     11
        5   Angola  2005      5
        1   Angola  2006      3
        2   Angola  2006      2
        3   Angola  2006      7
        4   Angola  2006      3
        5   Angola  2006      6
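To follow along, the table above can be rebuilt as a dataframe. This is a minimal sketch that assumes the columns hold plain integers and strings; the name df is the one the later snippets refer to.

```python
import pandas as pd

# Rebuild the example table in long format: one row per
# (Indicator, Country, Year) combination.
df = pd.DataFrame({
    "Indicator": [1, 2, 3, 4, 5, 1, 2, 3, 4, 5],
    "Country": ["Angola"] * 10,
    "Year": [2005] * 5 + [2006] * 5,
    "Value": [6, 13, 10, 11, 5, 3, 2, 7, 3, 6],
})
print(df)
```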
To pivot this dataframe so that the values in the Indicator column become the new columns, use the following code:
out = df.pivot(index=['Country', 'Year'], columns='Indicator', values='Value')
print(out)
This operation will produce the following pivoted dataframe:
Indicator      1   2   3   4  5
Country Year
Angola  2005   6  13  10  11  5
        2006   3   2   7   3  6
To convert the pivoted dataframe back to a flat table, use .rename_axis to remove the Indicator axis and .reset_index to convert Country and Year back to normal columns.
print(out.rename_axis(columns=None).reset_index())
This results in a flat table with one row per Country and Year and one ordinary column per indicator:
  Country  Year  1   2   3   4  5
0  Angola  2005  6  13  10  11  5
1  Angola  2006  3   2   7   3  6
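The flattened table is still in wide format. If the goal is to get back to the long layout of the input, with one row per indicator, one option is .melt. The sketch below is illustrative rather than definitive: the names wide and long_df are made up for this example, and passing a list as index to .pivot assumes pandas 1.1 or newer.

```python
import pandas as pd

df = pd.DataFrame({
    "Indicator": [1, 2, 3, 4, 5, 1, 2, 3, 4, 5],
    "Country": ["Angola"] * 10,
    "Year": [2005] * 5 + [2006] * 5,
    "Value": [6, 13, 10, 11, 5, 3, 2, 7, 3, 6],
})

# Long -> wide, then flatten the index as described above.
wide = (
    df.pivot(index=["Country", "Year"], columns="Indicator", values="Value")
      .rename_axis(columns=None)
      .reset_index()
)

# Wide -> long: melt() stacks the indicator columns back into rows.
long_df = wide.melt(
    id_vars=["Country", "Year"],
    var_name="Indicator",
    value_name="Value",
)

# melt() groups rows by indicator, so sort to restore the input order.
long_df = long_df.sort_values(["Year", "Indicator"]).reset_index(drop=True)
print(long_df)
```

The melted result has the columns in a different order (Country, Year, Indicator, Value) than the input, so reorder them if an exact match matters.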
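If your data contains duplicate combinations of the index and column labels (here Country, Year, and Indicator), .pivot raises a ValueError because it cannot decide which value to keep. Use .pivot_table instead: it aggregates the duplicates with the function given by its aggfunc argument, which is the mean by default.
out = df.pivot_table(
    index=['Country', 'Year'],
    columns='Indicator',
    values='Value')
print(out.rename_axis(columns=None).reset_index())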
This outputs a flat table with the same layout as before, except that each cell holds the mean of any duplicate combinations.
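The difference between the two methods is easiest to see on data that really does contain a duplicate. The sketch below uses a small hypothetical dataframe (dup, not part of the example above) in which indicator 1 is recorded twice for Angola in 2005; the variable names are illustrative.

```python
import pandas as pd

# Hypothetical data: indicator 1 appears twice for Angola in 2005.
dup = pd.DataFrame({
    "Indicator": [1, 1, 2],
    "Country": ["Angola", "Angola", "Angola"],
    "Year": [2005, 2005, 2005],
    "Value": [6, 8, 13],
})

# .pivot has no rule for choosing between 6 and 8, so it raises ValueError.
try:
    dup.pivot(index=["Country", "Year"], columns="Indicator", values="Value")
    pivot_failed = False
except ValueError as err:
    pivot_failed = True
    print("pivot failed:", err)

# .pivot_table aggregates the duplicates; aggfunc defaults to the mean.
mean_out = dup.pivot_table(
    index=["Country", "Year"], columns="Indicator", values="Value"
)

# A different aggregation can be requested explicitly.
sum_out = dup.pivot_table(
    index=["Country", "Year"], columns="Indicator", values="Value", aggfunc="sum"
)

print(mean_out)
print(sum_out)
```

Here the duplicated cell becomes 7 under the default mean and 14 with aggfunc="sum", while the unduplicated indicator 2 keeps its value of 13 in both cases.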
For a more detailed overview, refer to the Pandas user guide on Reshaping and pivot tables.