How to implement the equivalent of DENSE_RANK in Pandas
When working with Pandas, we often need to rank values, sometimes within groups, so that duplicate values receive the same rank and no rank numbers are skipped. This is what SQL's DENSE_RANK does. To get the same behavior, we can use the pd.Series.rank function with its 'dense' method.
Take the following data frame as an example:
年份 | 数值 |
---|---|
2012 | 10 |
2013 | 20 |
2013 | 25 |
2014 | 30 |
Our goal is to create a new column, 排名 ("Rank"), that holds the dense rank of each row's 年份 ("Year") value. The 数值 ("Value") column is left unchanged, giving the following result:
年份 | 数值 | 排名 |
---|---|---|
2012 | 10 | 1 |
2013 | 20 | 2 |
2013 | 25 | 2 |
2014 | 30 | 3 |
For this we can use the following code:
<code class="language-python">df['排名'] = df.年份.rank(method='dense').astype(int)</code>
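The pd.Series.rank function computes the rank of each element in the Series. By specifying method='dense', we instruct it to assign the same rank to elements with the same value and to increase the rank by exactly 1 from one distinct value to the next, so no rank numbers are skipped. Because rank returns floating-point values, we finally use .astype(int) to convert the result to an integer data type.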
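To make the effect of the 'dense' method concrete, the following minimal sketch rebuilds the example data frame and compares 'dense' with two other methods that pd.Series.rank accepts, 'min' and the default 'average'. The variable names (years, dense, minimum, average) are chosen here for illustration only.

```python
import pandas as pd

# Rebuild the example frame; column names match the table above.
df = pd.DataFrame({
    "年份": [2012, 2013, 2013, 2014],  # Year
    "数值": [10, 20, 25, 30],          # Value
})

years = df["年份"]

# 'dense': ties share a rank, and the next distinct value gets rank + 1.
dense = years.rank(method="dense").astype(int)

# 'min': ties share the lowest rank, but the following rank is skipped.
minimum = years.rank(method="min").astype(int)

# Default ('average'): ties get the mean of their positions, as floats.
average = years.rank()

print(dense.tolist())    # [1, 2, 2, 3]
print(minimum.tolist())  # [1, 2, 2, 4]
print(average.tolist())  # [1.0, 2.5, 2.5, 4.0]
```

Only 'dense' produces the gap-free sequence 1, 2, 2, 3 that matches DENSE_RANK.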
Running the code adds the new 排名 ("Rank") column to the data frame, as shown below:
<code>     年份  数值  排名
0  2012  10   1
1  2013  20   2
2  2013  25   2
3  2014  30   3</code>
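When the ranking should restart inside each group, which corresponds to DENSE_RANK with a PARTITION BY clause in SQL, the same 'dense' method can be applied through groupby. The sketch below is one possible approach and uses hypothetical data: the names scores, team, score, and rank_in_team are invented for illustration and do not come from the example above.

```python
import pandas as pd

# Hypothetical data; "team" and "score" are illustrative column names.
scores = pd.DataFrame({
    "team": ["a", "a", "a", "b", "b"],
    "score": [5, 7, 5, 9, 9],
})

# Dense-rank the scores separately within each team.
scores["rank_in_team"] = (
    scores.groupby("team")["score"].rank(method="dense").astype(int)
)

print(scores["rank_in_team"].tolist())  # [1, 2, 1, 1, 1]
```

Each team starts again at rank 1, and equal scores within a team still share a rank.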