Sorting a Pandas Dataframe by Multiple Columns
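Sorting a pandas dataframe by multiple columns is a common operation in data analysis. Consider a dataframe with columns 'a', 'b', and 'c'. To sort it by one column in ascending order and by another in descending order, pass the column names as a list together with a matching list of ascending flags. The snippets below sort by 'a' ascending and 'b' descending; the same pattern applies to any other pair, such as 'b' and 'c'.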
Starting with pandas 0.17.0, the sort method was deprecated in favor of sort_values, and in version 0.20.0 sort was removed entirely. The arguments and results are unchanged:
df.sort_values(['a', 'b'], ascending=[True, False])
On pandas versions older than 0.20.0, the equivalent call with the deprecated sort method is:
df.sort(['a', 'b'], ascending=[True, False])
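To tie this back to the case of sorting by 'b' ascending and 'c' descending, here is a minimal sketch. The frame df and its values are made up for illustration; only sort_values and its by and ascending arguments come from pandas.

```python
import pandas as pd

# Hypothetical data, chosen so that 'b' has ties that 'c' must break.
df = pd.DataFrame({
    "a": [10, 20, 30, 40, 50],
    "b": [2, 1, 2, 1, 3],
    "c": [5, 7, 9, 8, 6],
})

# Sort by 'b' ascending, then by 'c' descending within equal 'b' values.
result = df.sort_values(by=["b", "c"], ascending=[True, False])

print(result)
```

The ascending list is matched to the column list position by position, so the two lists should have the same length; a single boolean applies to every column.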
For example, consider a dataframe df1 with random integer values in columns 'a' and 'b':
import pandas as pd
import numpy as np
df1 = pd.DataFrame(np.random.randint(1, 5, (10, 2)), columns=['a', 'b'])
Sorting this dataframe by 'a' in ascending order and 'b' in descending order gives:
df1.sort_values(['a', 'b'], ascending=[True, False])
   a  b
2  1  4
7  1  3
1  1  2
3  1  2
4  3  2
6  4  4
0  4  3
9  4  3
5  4  1
8  4  1
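Because df1 is filled with random integers, each run produces different rows. For a reproducible check, the sketch below rebuilds the ten rows from the output above as a fixed frame and sorts it with sort_values. The names fixed, ordered, and renumbered are hypothetical, and the reset_index step is an optional extra that is not part of the original example.

```python
import pandas as pd

# The ten rows from the output above, listed in original index order 0..9.
fixed = pd.DataFrame({
    "a": [4, 1, 1, 1, 3, 4, 4, 1, 4, 4],
    "b": [3, 2, 4, 2, 2, 1, 4, 3, 1, 3],
})

# 'a' ascending, then 'b' descending within each value of 'a'.
ordered = fixed.sort_values(["a", "b"], ascending=[True, False])

# The original index labels travel with their rows; drop them to renumber 0..9.
renumbered = ordered.reset_index(drop=True)

print(ordered)
print(renumbered)
```

The left-hand numbers in the sorted output are the original row labels, which is why they appear out of order after sorting.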
Remember that sort_values does not sort in place by default. To update df1 with the sorted values, assign the result back to df1 or pass inplace=True in the method call:
df1 = df1.sort_values(['a', 'b'], ascending=[True, False])
or
df1.sort_values(['a', 'b'], ascending=[True, False], inplace=True)
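One difference between the two forms is easy to miss: with inplace=True the call reorders the frame itself and returns None, so its result should not be assigned back. A small sketch on current pandas using sort_values; the frame scores and the other variable names are hypothetical.

```python
import pandas as pd

scores = pd.DataFrame({"a": [2, 1, 2, 1], "b": [5, 6, 7, 8]})

# Form 1: sort_values returns a new frame; the original is left untouched.
sorted_copy = scores.sort_values(["a", "b"], ascending=[True, False])
original_order = scores["b"].tolist()  # still the unsorted order

# Form 2: inplace=True reorders 'scores' itself and returns None.
returned = scores.sort_values(["a", "b"], ascending=[True, False], inplace=True)

print(sorted_copy)
print(returned)
print(scores)
```

Writing scores = scores.sort_values(..., inplace=True) would therefore replace the frame with None, which is a common source of confusion.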