How to efficiently convert a Pandas DataFrame with missing values into a NumPy array?

Mary-Kate Olsen
Release: 2024-11-05 02:42:02

Convert a Pandas DataFrame with Missing Values to a NumPy Array

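The recommended way to convert a Pandas DataFrame with missing values to a NumPy array is df.to_numpy(), available since pandas 0.24. Missing values in float columns come through as np.nan. It offers several advantages over the older df.values attribute, including:

  • Returns a view of the underlying data when possible (copy=False is the default), which limits memory use. A copy is still made whenever columns have to be combined or converted, so no-copy behavior is not guaranteed.
  • Handles extension types more consistently, converting them to a NumPy dtype (object when no closer match exists) and accepting an na_value argument to choose how missing entries are represented.
  • Keeps the DataFrame's dtype when all columns share one. Columns with different dtypes are upcast to a single common dtype unless the dtype argument is given.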

Example:

<code class="python">import pandas as pd
import numpy as np

# Create a DataFrame with missing values
df = pd.DataFrame({'A': [np.nan, np.nan, 0.1, 0.1, 0.1, 0.1],
                   'B': [0.2, np.nan, 0.2, 0.2, np.nan, np.nan],
                   'C': [np.nan, 0.5, 0.5, np.nan, 0.5, np.nan]})

# Convert to a NumPy array with missing values represented as `np.nan`
array = df.to_numpy()

# Result:
# array([[ nan,  0.2,  nan],
#        [ nan,  nan,  0.5],
#        [ 0.1,  0.2,  0.5],
#        [ 0.1,  0.2,  nan],
#        [ 0.1,  nan,  0.5],
#        [ 0.1,  nan,  nan]])</code>
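The dtype and na_value parameters matter most when the missing values live in extension-typed columns, where they are stored as pd.NA rather than np.nan. The following is a minimal sketch, assuming pandas 1.1 or later; the frame nullable_df and its contents are made up for illustration and are not part of the original example.

```python
import numpy as np
import pandas as pd

# Hypothetical DataFrame using the nullable Int64 extension dtype
nullable_df = pd.DataFrame({
    "A": pd.array([1, None, 3], dtype="Int64"),
    "B": pd.array([None, 5, 6], dtype="Int64"),
})

# Without arguments, missing entries stay as pd.NA, so the result
# typically falls back to a generic object array
as_object = nullable_df.to_numpy()

# Request a float array and map missing entries to np.nan explicitly
as_float = nullable_df.to_numpy(dtype="float64", na_value=np.nan)

print(as_float.dtype)            # float64
print(np.isnan(as_float).sum())  # number of missing entries
```

A float result is usually what numerical code expects, since functions such as np.nanmean understand np.nan but not pd.NA.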

Preserving Dtypes:

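to_numpy() returns a single homogeneous array, so columns with different dtypes are upcast to one common dtype (for example, integer and float columns together become float64). To keep a separate dtype per column, build a structured (record) array with np.rec.fromrecords instead.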

<code class="python"># Create a DataFrame with mixed data types
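df = pd.DataFrame({'A': [1, 2, 3],
                   'B': [4, 5, 6],
                   'C': [7.2, 8.1, 9.3]},
                  index=['a', 'b', 'c'])

# Convert to a structured array with preserved Dtypes.
# reset_index() moves the index into the first column, so the field
# names must list 'index' first, followed by the data columns.
records = df.reset_index()
struct_array = np.rec.fromrecords(
    records,
    names=records.columns.tolist()
)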

# Result:
# rec.array([('a', 1, 4, 7.2), ('b', 2, 5, 8.1), ('c', 3, 6, 9.3)],
#           dtype=[('index', '<U1'), ('A', '<i8'), ('B', '<i8'), ('C', '<f8')])</code>
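pandas also provides DataFrame.to_records(), which builds a record array directly from a DataFrame. The sketch below is an alternative to the np.rec.fromrecords call rather than the method used above, and the variable names are chosen for illustration.

```python
import pandas as pd

# Same kind of mixed-dtype frame: integer columns plus one float column
df = pd.DataFrame({"A": [1, 2, 3],
                   "B": [4, 5, 6],
                   "C": [7.2, 8.1, 9.3]},
                  index=["a", "b", "c"])

# to_records() includes the index as the first field by default
records = df.to_records()

print(records.dtype.names)  # ('index', 'A', 'B', 'C')
print(records["A"].dtype)   # integer dtype, kept per column
print(records["C"].dtype)   # float dtype, kept per column
```

Each field can then be read by name, for example records["C"], and keeps its own dtype instead of being upcast to a shared one.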
