How to Perform SQL 'count(distinct)' Equivalent in Pandas using 'nunique()'?

Barbara Streisand

Release: 2024-10-23 13:28:29

Pandas Equivalent of SQL 'count(distinct)'

In SQL, distinct values in a column are counted with the 'count(distinct ...)' aggregate. For example, to count unique client codes per year-month:

<code class="sql">SELECT YEARMONTH, COUNT(DISTINCT CLIENTCODE) FROM table GROUP BY YEARMONTH;</code>

The same operation can be performed in Pandas by calling 'nunique()' on a grouped column. Grouping the data by 'YEARMONTH' and then calling 'nunique()' on 'CLIENTCODE' returns the number of unique clients per year-month. Like 'count(distinct)', 'nunique()' ignores missing values by default.

<code class="python">table.groupby('YEARMONTH').CLIENTCODE.nunique()</code>
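The handling of missing values is worth a quick check, since SQL's 'count(distinct)' skips NULLs. The sketch below uses a small made-up DataFrame (the name `df` and its values are assumptions for illustration, not data from this article) to show how the `dropna` parameter of 'nunique()' changes the count.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one client code is missing in 201301.
df = pd.DataFrame({
    "YEARMONTH": [201301, 201301, 201301, 201302],
    "CLIENTCODE": [1.0, 1.0, np.nan, 2.0],
})

# Default (dropna=True): missing values are not counted,
# which mirrors how COUNT(DISTINCT col) skips NULLs.
distinct_default = df.groupby("YEARMONTH")["CLIENTCODE"].nunique()

# dropna=False: NaN is counted as one additional distinct value.
distinct_with_nan = df.groupby("YEARMONTH")["CLIENTCODE"].nunique(dropna=False)

print(distinct_default)
print(distinct_with_nan)
```

With the default setting, the missing code in 201301 does not contribute to the count; passing `dropna=False` makes it count as one more distinct value.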

Example:

Consider a DataFrame 'table' with two columns, 'CLIENTCODE' and 'YEARMONTH', and the following rows:

CLIENTCODE	YEARMONTH
1	201301
1	201301
2	201301
1	201302
2	201302
2	201302
3	201302

Running the 'groupby' expression above on this DataFrame yields:

<code class="python">Out[3]: 
YEARMONTH
201301       2
201302       3</code>

This matches the expected result: 201301 contains two distinct clients (1 and 2), and 201302 contains three (1, 2 and 3).
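To try the example end to end, the following sketch rebuilds the sample rows as a DataFrame and runs the same 'groupby' call. The final step, which uses `reset_index` to get a two-column result shaped like a SQL result set, is an optional addition, and the column name `UNIQUE_CLIENTS` is an arbitrary choice made here.

```python
import pandas as pd

# Rebuild the sample rows from the example.
table = pd.DataFrame({
    "CLIENTCODE": [1, 1, 2, 1, 2, 2, 3],
    "YEARMONTH": [201301, 201301, 201301, 201302, 201302, 201302, 201302],
})

# Count distinct client codes within each year-month group.
unique_clients = table.groupby("YEARMONTH")["CLIENTCODE"].nunique()
print(unique_clients)

# Optional: turn the Series into a DataFrame with YEARMONTH as a regular
# column. UNIQUE_CLIENTS is a hypothetical name chosen for this sketch.
result = unique_clients.reset_index(name="UNIQUE_CLIENTS")
print(result)
```

The bracket form `["CLIENTCODE"]` is interchangeable with the attribute form `.CLIENTCODE` used earlier; it is used here because it also works for column names that are not valid Python identifiers.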
