Replacing NaN Values with Column Averages in Pandas DataFrames
When working with pandas DataFrames, NaN (missing) values are common. One simple way to handle them is to replace each NaN with the average of the column it belongs to.
Solution Using DataFrame.fillna
The DataFrame.fillna method provides a straightforward way to fill NaN values. Passing it the column means from df.mean() fills every column in a single call:
<code class="python">df.fillna(df.mean())</code>
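Detailed Explanation:
df.mean() computes the mean of each column, skipping NaN values by default, and returns a Series indexed by column name. fillna then matches that Series against the DataFrame's column labels, so each NaN is replaced by the mean of its own column. The call returns a new DataFrame; df itself is unchanged unless the result is assigned back.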
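To make the mechanics concrete, here is a minimal sketch on a small made-up DataFrame. The column names and values are illustrative assumptions, chosen so the means are easy to check by hand. It shows that df.mean() yields one value per column and that fillna uses the column labels to pick the fill value.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data (not from the article), easy to verify by hand.
df = pd.DataFrame({"A": [1.0, np.nan, 3.0], "B": [np.nan, 4.0, 8.0]})

# mean() skips NaN by default and returns a Series indexed by column name:
# A -> (1 + 3) / 2 = 2.0, B -> (4 + 8) / 2 = 6.0
col_means = df.mean()

# fillna aligns the Series index with the column labels and returns
# a new DataFrame; the original df still contains its NaN values.
filled = df.fillna(col_means)
```

To keep the filled values, assign the result back, for example df = df.fillna(df.mean()).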
Example:
Let's consider the following DataFrame:
          A         B         C
0 -0.166919  0.979728 -0.632955
1 -0.297953 -0.912674 -1.365463
2 -0.120211 -0.540679 -0.680481
3       NaN -2.027325  1.533582
4       NaN       NaN  0.461821
5 -0.788073       NaN       NaN
6 -0.916080 -0.612343       NaN
7 -0.887858  1.033826       NaN
8  1.948430  1.025011 -2.982224
9  0.019698 -0.795876 -0.046431
After applying the fillna method with averages:
          A         B         C
0 -0.166919  0.979728 -0.632955
1 -0.297953 -0.912674 -1.365463
2 -0.120211 -0.540679 -0.680481
3 -0.151121 -2.027325  1.533582
4 -0.151121 -0.231291  0.461821
5 -0.788073 -0.231291 -0.530307
6 -0.916080 -0.612343 -0.530307
7 -0.887858  1.033826 -0.530307
8  1.948430  1.025011 -2.982224
9  0.019698 -0.795876 -0.046431
As demonstrated, the NaN values have been replaced with the corresponding column averages.
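The example can be reproduced with a short script. This is a sketch that assumes the printed values are the full data; the original DataFrame was presumably generated randomly, so the numbers are typed in by hand at the six-decimal precision shown.

```python
import numpy as np
import pandas as pd

nan = np.nan

# Values copied from the example table, rounded to six decimals as printed.
df = pd.DataFrame({
    "A": [-0.166919, -0.297953, -0.120211, nan, nan,
          -0.788073, -0.916080, -0.887858, 1.948430, 0.019698],
    "B": [0.979728, -0.912674, -0.540679, -2.027325, nan,
          nan, -0.612343, 1.033826, 1.025011, -0.795876],
    "C": [-0.632955, -1.365463, -0.680481, 1.533582, 0.461821,
          nan, nan, nan, -2.982224, -0.046431],
})

# Replace every NaN with the mean of its own column.
filled = df.fillna(df.mean())

print(filled)
```

If a DataFrame also contains non-numeric columns, df.mean(numeric_only=True) restricts the calculation to the numeric ones; fillna then leaves the remaining columns untouched.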