How Can I Fit an Empirical Distribution to a Theoretical One Using SciPy in Python?

Mary-Kate Olsen

Released: 2024-11-24 09:58:11

Fitting an Empirical Distribution to a Theoretical One Using SciPy (Python)

In statistics, it is often useful to fit a theoretical distribution to an empirical distribution obtained from observed data. Once a well-fitting distribution has been found, it can be used to calculate probabilities and to make other statistical inferences.

Implementation in Python (SciPy)

SciPy provides many continuous distributions in scipy.stats, each with a fit method that estimates its parameters from data (by maximum likelihood by default). To choose among several candidates, one common approach is to compare each fitted probability density function (PDF) against a density-normalized histogram of the data and to pick the candidate with the lowest sum of squared errors (SSE).
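The selection criterion can be written out explicitly. The notation below is an assumption introduced for illustration and does not come from the original text: h_i is the height of the density-normalized histogram in bin i, c_i is the center of that bin, k is the number of bins, and f(x; θ̂) is the candidate PDF evaluated with its fitted parameters θ̂.

```latex
\mathrm{SSE} = \sum_{i=1}^{k} \left( h_i - f(c_i;\, \hat{\theta}) \right)^2
```

Evaluating the PDF at the bin centers is one reasonable choice; it keeps the two arrays the same length, so the difference is well defined term by term.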

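import warnings

import numpy as np
import scipy.stats as st

# Data points
data = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=float)

# Empirical density: histogram heights and the centers of the bins
hist, bin_edges = np.histogram(data, bins=10, density=True)
bin_centers = (bin_edges[:-1] + bin_edges[1:]) / 2

# Candidate theoretical distributions
distributions = ['norm', 'beta', 'gamma']

# Iterate over distributions and find best fit
best_dist = None
best_params = None
lowest_sse = float('inf')
for dist_name in distributions:
    dist = getattr(st, dist_name)

    # Fit distribution to data; skip candidates whose fit fails
    try:
        with warnings.catch_warnings():
            warnings.simplefilter('ignore')
            params = dist.fit(data)
    except Exception:
        continue

    # Evaluate SSE between the histogram and the fitted PDF at the bin centers
    sse = np.sum((hist - dist.pdf(bin_centers, *params))**2)

    # Update best distribution (and its parameters) if lower SSE found
    if sse < lowest_sse:
        lowest_sse = sse
        best_dist = dist
        best_params = params

# Cumulative probability P(X <= value) under the best-fitting distribution
value = 5
prob = best_dist.cdf(value, *best_params)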

Example

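In the example above, three theoretical distributions (normal, beta, and gamma) are fitted to the data, and the candidate with the lowest SSE is kept as the best fit. Which candidate wins depends on the data and on the number of histogram bins. The cumulative distribution function (CDF) of the best-fitting distribution is then evaluated at 5, which gives the probability that a value is less than or equal to 5. This is a cumulative probability rather than a p-value, and the fitted parameters must be passed to cdf along with the value for the result to reflect the fit.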
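As a small illustration of how a fitted distribution can be used to calculate probabilities, the sketch below fits only a normal distribution to the same sample and "freezes" it so that the fitted parameters travel with the object. The variable names (fitted, p_below, p_above, median) are chosen here for illustration, and the normal distribution is used only because its two parameters are easy to interpret, not because it is necessarily the best fit.

```python
import numpy as np
import scipy.stats as st

# Same illustrative data as in the article
data = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=float)

# For 'norm', fit returns the estimated (loc, scale)
loc, scale = st.norm.fit(data)

# Freeze the distribution so later calls do not need the parameters again
fitted = st.norm(loc=loc, scale=scale)

value = 5
p_below = fitted.cdf(value)  # P(X <= value)
p_above = fitted.sf(value)   # P(X > value), the survival function
median = fitted.ppf(0.5)     # point below which half the probability lies

print(f"loc={loc:.3f}, scale={scale:.3f}")
print(f"P(X <= {value}) = {p_below:.3f}")
print(f"P(X > {value}) = {p_above:.3f}")
print(f"median = {median:.3f}")
```

Freezing is optional; calling st.norm.cdf(value, loc, scale) directly gives the same result, but a frozen object makes it harder to forget the fitted parameters.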
