How to Pivot a Pandas DataFrame: A Comprehensive Guide to Reshaping Data?

DDD
Release: 2024-12-25 10:25:09

How can I pivot a dataframe?

What is pivot?

  • Reshaping a DataFrame from long to wide format
  • Produces a new DataFrame whose index and column labels are taken from the values of one or more columns, optionally aggregating the values that land in each cell

How do I pivot?

  • Several methods to pivot a DataFrame:

    • pd.DataFrame.pivot_table
    • pd.DataFrame.groupby + pd.DataFrame.unstack
    • pd.DataFrame.set_index + pd.DataFrame.unstack
    • pd.DataFrame.pivot (less flexible)
    • pd.crosstab (for cross tabulation)
    • pd.factorize + np.bincount (advanced, high performance)
    • pd.get_dummies + pd.DataFrame.dot (cross tabulation)
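
As a rough illustration of how the first four methods relate, the sketch below builds the same mean table four ways. The sample data is hypothetical; only the column names (row, col, val0) follow the ones used in this article.

```python
import pandas as pd

# Hypothetical sample data; (r1, c1) appears twice on purpose.
df = pd.DataFrame({
    "row": ["r0", "r0", "r1", "r1", "r1"],
    "col": ["c0", "c1", "c0", "c1", "c1"],
    "val0": [1.0, 2.0, 3.0, 4.0, 6.0],
})

# pivot_table aggregates duplicate (row, col) pairs itself.
a = df.pivot_table(index="row", columns="col", values="val0", aggfunc="mean")

# groupby + unstack: aggregate per key pair, then move 'col' into the columns.
b = df.groupby(["row", "col"])["val0"].mean().unstack("col")

# set_index + unstack and pivot both need unique key pairs,
# so the duplicates are aggregated first.
unique = df.groupby(["row", "col"], as_index=False)["val0"].mean()
c = unique.set_index(["row", "col"])["val0"].unstack("col")
d = unique.pivot(index="row", columns="col", values="val0")

print(a)
```

pivot_table and groupby handle duplicate keys through aggregation, while set_index + unstack and pivot expect the keys to be unique already, which is why pivot is listed as less flexible.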

Long format to wide format?

  • Long format:

    • Each row holds a single measurement for one subject
    • One column names the attribute being measured and another holds its value, so a subject spans several rows
  • Wide format:

    • Each subject occupies a single row
    • Each attribute/measurement gets its own column
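
A small sketch may make the two layouts concrete. The frame below (city, metric, value) is invented for illustration and is not part of the article's data; pivot goes from long to wide and melt goes back.

```python
import pandas as pd

# Hypothetical long-format data: one row per (city, metric) measurement.
long_df = pd.DataFrame({
    "city": ["Oslo", "Oslo", "Rome", "Rome"],
    "metric": ["temp", "rain", "temp", "rain"],
    "value": [5.0, 80.0, 15.0, 60.0],
})

# Long -> wide: one row per city, one column per metric.
wide_df = long_df.pivot(index="city", columns="metric", values="value")

# Wide -> long again: melt is the inverse reshaping step.
back = wide_df.reset_index().melt(
    id_vars="city", var_name="metric", value_name="value"
)

print(wide_df)
print(back)
```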

Examples

Question 1: Why do I get ValueError: Index contains duplicate entries, cannot reshape?

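  • pd.DataFrame.pivot raises this error when the same combination of index and columns values appears in more than one row, because it cannot decide which value belongs in the cell
  • Example: If df has two or more rows with the same row and col values, df.pivot(index='row', columns='col') raises the error. Aggregate the duplicates with pd.DataFrame.pivot_table, or remove them before pivoting.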
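
The error can be reproduced on a tiny frame. The data below is hypothetical; the sketch catches the ValueError, checks for duplicate key pairs with duplicated, and then falls back to pivot_table as one possible fix.

```python
import pandas as pd

# Hypothetical sample data; the pair (r0, c0) appears twice.
df = pd.DataFrame({
    "row": ["r0", "r0", "r1"],
    "col": ["c0", "c0", "c1"],
    "val0": [1.0, 3.0, 5.0],
})

try:
    df.pivot(index="row", columns="col", values="val0")
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Check up front whether any (row, col) pair repeats.
has_duplicates = bool(df.duplicated(subset=["row", "col"]).any())

# One way out: let pivot_table aggregate the duplicates.
fixed = df.pivot_table(index="row", columns="col", values="val0", aggfunc="mean")

print(error_message)
print(fixed)
```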

Question 2: How do I pivot df such that the col values are columns, row values are the index, and mean of val0 are the values?

  • Use pd.DataFrame.pivot_table:

    df.pivot_table(values='val0', index='row', columns='col', aggfunc='mean')

Question 3: How do I make it so that missing values are 0?

  • Use fill_value argument in pd.DataFrame.pivot_table:

    df.pivot_table(values='val0', index='row', columns='col', fill_value=0, aggfunc='mean')

Question 4: Can I get something other than mean, like maybe sum?

  • Use a different aggfunc argument in pd.DataFrame.pivot_table:

    df.pivot_table(values='val0', index='row', columns='col', fill_value=0, aggfunc='sum')
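
The calls in Questions 2 to 4 differ only in fill_value and aggfunc. A minimal side-by-side sketch on hypothetical data (the values are made up; only the column names follow the article):

```python
import pandas as pd

# Hypothetical sample data: (r0, c0) appears twice, (r1, c1) never appears.
df = pd.DataFrame({
    "row": ["r0", "r0", "r0", "r1"],
    "col": ["c0", "c0", "c1", "c0"],
    "val0": [1.0, 3.0, 2.0, 4.0],
})

# Question 2: mean per cell; the missing (r1, c1) cell is NaN.
mean_nan = df.pivot_table(values="val0", index="row", columns="col", aggfunc="mean")

# Question 3: same table, missing cells replaced by 0.
mean_zero = df.pivot_table(
    values="val0", index="row", columns="col", fill_value=0, aggfunc="mean"
)

# Question 4: sum instead of mean.
sum_zero = df.pivot_table(
    values="val0", index="row", columns="col", fill_value=0, aggfunc="sum"
)

print(mean_nan, mean_zero, sum_zero, sep="\n\n")
```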

Question 5: Can I do more than one aggregation at a time?

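  • Provide a list of function names (or callables) to the aggfunc argument in pd.DataFrame.pivot_table. String names such as 'size' and 'mean' avoid the extra numpy import and the deprecation warning that recent pandas versions emit for np.mean:

    df.pivot_table(values='val0', index='row', columns='col', fill_value=0, aggfunc=['size', 'mean'])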
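
With a list of aggregations the result gains an extra column level, one block per function. The sketch below uses hypothetical data and string function names, and shows how a single block might be pulled out.

```python
import pandas as pd

# Hypothetical sample data: (r0, c0) appears twice, (r1, c1) never appears.
df = pd.DataFrame({
    "row": ["r0", "r0", "r0", "r1"],
    "col": ["c0", "c0", "c1", "c0"],
    "val0": [1.0, 3.0, 2.0, 4.0],
})

multi = df.pivot_table(
    values="val0", index="row", columns="col",
    fill_value=0, aggfunc=["size", "mean"],
)

# The outer column level is the aggregation name, the inner one is 'col'.
sizes = multi["size"]
means = multi["mean"]

print(multi)
```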

Question 6: Can I aggregate over multiple value columns?

  • Pass multiple column names as a list to values in pd.DataFrame.pivot_table:

    df.pivot_table(values=['val0', 'val1'], index='row', columns='col', fill_value=0, aggfunc='mean')
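
Passing several value columns also produces two column levels, this time with the value column name on the outside. A short sketch on made-up data:

```python
import pandas as pd

# Hypothetical sample data with two value columns.
df = pd.DataFrame({
    "row": ["r0", "r0", "r1"],
    "col": ["c0", "c0", "c1"],
    "val0": [1.0, 3.0, 5.0],
    "val1": [10.0, 30.0, 50.0],
})

both = df.pivot_table(
    values=["val0", "val1"], index="row", columns="col",
    fill_value=0, aggfunc="mean",
)

# Select the sub-table for one value column via the outer level.
val1_only = both["val1"]

print(both)
```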

Question 7: Can I subdivide by multiple columns?

  • Pass multiple column names as a list to index or columns in pd.DataFrame.pivot_table:

    df.pivot_table(values='val0', index=['row', 'item'], columns='col', fill_value=0, aggfunc='mean')

Question 8: Can I subdivide by multiple columns in both the index and the columns?

  • Pass lists of column names to both index and columns in pd.DataFrame.pivot_table:

    df.pivot_table(values='val0', index=['key', 'row'], columns=['item', 'col'], fill_value=0, aggfunc='mean')
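
When lists are passed to both index and columns, the result carries a MultiIndex on each axis. The sketch below uses hypothetical data with the article's column names and shows one way to look up a cell and to slice out one item with xs.

```python
import pandas as pd

# Hypothetical sample data using the column names from this article.
df = pd.DataFrame({
    "key": ["k0", "k0", "k1", "k1"],
    "row": ["r0", "r0", "r0", "r1"],
    "item": ["i0", "i1", "i0", "i0"],
    "col": ["c0", "c0", "c1", "c0"],
    "val0": [1.0, 2.0, 3.0, 4.0],
})

nested = df.pivot_table(
    values="val0", index=["key", "row"], columns=["item", "col"],
    fill_value=0, aggfunc="mean",
)

# Cells are addressed with one tuple per axis.
cell = nested.loc[("k0", "r0"), ("i1", "c0")]

# xs takes a cross-section for one value of a column level.
item0 = nested.xs("i0", axis=1, level="item")

print(nested)
```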

Question 9: Can I aggregate the frequency in which the column and rows occur together, aka "cross tabulation"?

  • Use pd.crosstab:

    pd.crosstab(index=df['row'], columns=df['col'])
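
A cross tabulation counts how often each row/col pair occurs. The sketch below, on hypothetical data, compares pd.crosstab with the groupby + unstack route listed earlier; both should give the same counts.

```python
import pandas as pd

# Hypothetical sample data; (r0, c0) occurs twice.
df = pd.DataFrame({
    "row": ["r0", "r0", "r1"],
    "col": ["c0", "c0", "c1"],
    "val0": [1.0, 3.0, 5.0],
})

# Frequency table of row/col co-occurrence.
counts = pd.crosstab(index=df["row"], columns=df["col"])

# Equivalent counts via groupby + unstack.
via_groupby = df.groupby(["row", "col"]).size().unstack(fill_value=0)

print(counts)
```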

Question 10: How do I convert a DataFrame from long to wide by pivoting on ONLY two columns?

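  • Number the repeated col values with groupby(...).cumcount(), then use that counter as the index in pd.DataFrame.pivot:

    df.assign(count=df.groupby('col').cumcount()).pivot(index='count', columns='col', values='val0')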
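
One common approach when only a key column and a value column exist is to number the occurrences of each key with cumcount and pivot on that counter. The sketch below assumes a hypothetical two-column frame (col, val0); the helper column name count is also an assumption.

```python
import pandas as pd

# Hypothetical two-column data: a key column and a value column.
df = pd.DataFrame({
    "col": ["a", "a", "b", "b", "b"],
    "val0": [1, 2, 3, 4, 5],
})

# cumcount numbers the rows within each 'col' group: 0, 1, 0, 1, 2.
numbered = df.assign(count=df.groupby("col").cumcount())

# The counter is unique per key, so pivot has no duplicate entries.
wide = numbered.pivot(index="count", columns="col", values="val0")

print(wide)
```

Keys with fewer occurrences end up with NaN in the trailing rows, since every column is padded to the length of the longest group.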

Question 11: How do I flatten the multiple index to single index after pivot?

  • Join the levels of each column label into a single string, where pivoted is the result of a pivot whose columns are a MultiIndex with string levels:

    pivoted.columns = pivoted.columns.map('|'.join)
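
A minimal sketch of flattening, assuming a pivot with two column levels built from hypothetical data. The name pivoted and the '|' separator are illustrative choices, and the join only works when every level holds strings.

```python
import pandas as pd

# Hypothetical sample data using the column names from this article.
df = pd.DataFrame({
    "row": ["r0", "r0", "r1"],
    "item": ["i0", "i1", "i0"],
    "col": ["c0", "c0", "c1"],
    "val0": [1.0, 2.0, 3.0],
})

pivoted = df.pivot_table(
    values="val0", index="row", columns=["item", "col"],
    fill_value=0, aggfunc="sum",
)

# Each column label is a tuple such as ('i0', 'c0'); join it into 'i0|c0'.
flat = pivoted.copy()
flat.columns = flat.columns.map("|".join)

# For non-string levels, format each part explicitly instead.
labels = [f"{item}|{col}" for item, col in pivoted.columns]

print(flat)
```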

source:php.cn