## How to Efficiently Calculate Frequency Counts for Distinct Values in NumPy Arrays?

Susan Sarandon

Released: 2024-10-27 10:55:30

Calculating Frequency Counts for Distinct Values in NumPy Arrays

Finding the frequency of occurrence for individual values within a NumPy array is a common task in data analysis. This article outlines an efficient approach to obtain these frequency counts.

Method:

The primary way to obtain frequency counts in NumPy is the np.unique function called with return_counts=True (available since NumPy 1.9). It returns the sorted distinct values together with the number of times each one occurs. For instance, consider the following array:

<code class="python">import numpy as np
x = np.array([1, 1, 1, 2, 2, 2, 5, 25, 1, 1])</code>

To compute the frequency counts of these elements:

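<code class="python">import numpy as np

# unique: sorted distinct values; counts: occurrences of each value
unique, counts = np.unique(x, return_counts=True)

# Stack both arrays and transpose to get one (value, count) row per distinct value
print(np.asarray((unique, counts)).T)</code>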

This will output:

[[ 1  5]
 [ 2  3]
 [ 5  1]
 [25  1]]

Each row of the resulting array holds a unique value (first column) and its frequency (second column). For example, the value 1 occurs 5 times and the value 25 occurs once.
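If a lookup table is more convenient than a two-column array, the same result can be turned into a dictionary. The snippet below is a minimal sketch of that idea; the names freq and most_common are illustrative and do not come from the original example.

```python
import numpy as np

x = np.array([1, 1, 1, 2, 2, 2, 5, 25, 1, 1])
unique, counts = np.unique(x, return_counts=True)

# Map each distinct value to its count, converting to plain Python ints
freq = {int(v): int(c) for v, c in zip(unique, counts)}
print(freq)  # {1: 5, 2: 3, 5: 1, 25: 1}

# The most frequent value sits at the index of the largest count
most_common = int(unique[np.argmax(counts)])
print(most_common)  # 1
```

Because unique and counts are aligned by position, any index computed on counts (such as the argmax above) can be used directly on unique.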

Comparison and Performance:

np.unique with return_counts=True is faster than older alternatives such as scipy.stats.itemfreq, which has been deprecated since SciPy 1.0 and is absent from current releases. On an array of one million integers, np.unique took 31.5 ms per loop against 170 ms for itemfreq in the following benchmark:

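<code class="python">import numpy as np
import scipy.stats  # itemfreq needs an old SciPy release that still ships it

# 1,000,000 integers in [0, 100]; random_integers is deprecated and size must be an int
x = np.random.randint(0, 101, size=1_000_000)

# %timeit is an IPython magic; the timings shown are from the original run
%timeit unique, counts = np.unique(x, return_counts=True) # 31.5 ms per loop
%timeit scipy.stats.itemfreq(x) # 170 ms per loop</code>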
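Since scipy.stats.itemfreq cannot be run on a current SciPy install, the measurement can be repeated outside IPython with the standard timeit module. The sketch below is an assumption-laden variant, not the original benchmark: it swaps in np.bincount as the comparison point (a technique not mentioned in the article, and one that only works for non-negative integers), and the helper names with_unique and with_bincount are hypothetical. Timings depend on the machine, so none are quoted here.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
x = rng.integers(0, 101, size=1_000_000)  # integers in [0, 100]

def with_unique(a):
    # Approach from the article: sorted distinct values plus their counts
    return np.unique(a, return_counts=True)

def with_bincount(a):
    # Assumed alternative: count every integer from 0 to a.max(),
    # then keep only the values that actually occur
    counts = np.bincount(a)
    values = np.nonzero(counts)[0]
    return values, counts[values]

# Check that both approaches agree before timing them
u1, c1 = with_unique(x)
u2, c2 = with_bincount(x)
same = np.array_equal(u1, u2) and np.array_equal(c1, c2)
print("results match:", same)

for fn in (with_unique, with_bincount):
    seconds = timeit.timeit(lambda: fn(x), number=5) / 5
    print(f"{fn.__name__}: {seconds * 1e3:.1f} ms per call")
```

np.unique remains the general-purpose choice because it also handles floats, strings, and negative values, which np.bincount does not.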

Conclusion:

The np.unique function in NumPy, called with return_counts=True, provides an efficient way to obtain the frequency counts of unique values in an array. Its speed advantage over scipy.stats.itemfreq in the benchmark above makes it the preferred choice for large datasets.
