How to Split Pandas DataFrame into Subsets Based on Column Value?

Barbara Streisand
Release: 2024-10-19 22:33:02

Splitting Pandas DataFrame Based on Column Value

A common task in Pandas is splitting a DataFrame into subsets according to the values in one column, so that each subset can be analyzed or modified separately.

Boolean indexing handles this directly. Consider a DataFrame with a column named "Sales" that we want to split in two: rows whose "Sales" value is at or above a threshold 's', and rows whose value is below it.

Solution:

<code class="python">import pandas as pd

# Create a DataFrame with a "Sales" column
df = pd.DataFrame({'A': [3, 4, 7, 6, 1], 'Sales': [10, 20, 30, 40, 50]})
print(df)</code>
   A  Sales
0  3     10
1  4     20
2  7     30
3  6     40
4  1     50
<code class="python"># Split the DataFrame based on "Sales" values
s = 30

df1 = df[df['Sales'] >= s]
print(df1)</code>
   A  Sales
2  7     30
3  6     40
4  1     50

This creates a new DataFrame, df1, that contains the rows where the "Sales" value is greater than or equal to 's'.
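
To make the mechanism concrete, the following minimal sketch prints the intermediate boolean Series that the comparison produces before it is used as an index. It reuses the article's example data; the name `mask` is introduced here only for illustration.

```python
import pandas as pd

# Same illustrative data as in the article
df = pd.DataFrame({'A': [3, 4, 7, 6, 1], 'Sales': [10, 20, 30, 40, 50]})
s = 30

# The comparison is evaluated element-wise and returns a boolean Series
mask = df['Sales'] >= s
print(mask.tolist())  # [False, False, True, True, True]

# Indexing with that Series keeps only the rows where it is True
df1 = df[mask]
print(df1.index.tolist())  # [2, 3, 4]
```

Note that the selected rows keep their original index labels (2, 3, 4) rather than being renumbered from zero.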

<code class="python">df2 = df[df['Sales'] < s]
print(df2)</code>
   A  Sales
0  3     10
1  4     20

df2 comprises the rows where the "Sales" value is less than 's'.
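
As a quick sanity check, one might confirm that the two subsets together account for every row of the original. The sketch below is one possible way to do that, assuming the column has no missing values; the names `rows_accounted_for` and `rebuilt` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [3, 4, 7, 6, 1], 'Sales': [10, 20, 30, 40, 50]})
s = 30

df1 = df[df['Sales'] >= s]
df2 = df[df['Sales'] < s]

# Every row should land in exactly one of the two subsets
rows_accounted_for = len(df1) + len(df2) == len(df)

# Stacking the subsets and restoring the index order should rebuild df
rebuilt = pd.concat([df2, df1]).sort_index()

print(rows_accounted_for)   # True
print(rebuilt.equals(df))   # True
```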

Alternative Approach Using Bitwise Negation:

Instead of writing a second comparison for the remaining rows, we can store the first comparison as a mask and invert it with the ~ operator, which acts as an element-wise logical NOT on a boolean Series:

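<code class="python"># Evaluate the comparison once and reuse it
mask = df['Sales'] >= s
df1 = df[mask]   # rows where the mask is True
df2 = df[~mask]  # rows where the mask is False
print(df1)</code>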
   A  Sales
2  7     30
3  6     40
4  1     50
<code class="python">print(df2)</code>
   A  Sales
0  3     10
1  4     20

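On this data the result is the same split. The comparison is evaluated only once, and the two subsets are complementary by construction, so the two conditions cannot drift apart if the threshold logic changes later.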
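
One difference between the two approaches is worth knowing: they can disagree when the column contains missing values. The sketch below uses hypothetical data with one NaN in "Sales" (not part of the original example) to show that an explicit `<` comparison drops the NaN row, while the inverted mask keeps it.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing "Sales" value
df = pd.DataFrame({'A': [3, 4, 7], 'Sales': [10, np.nan, 50]})
s = 30

mask = df['Sales'] >= s        # NaN >= 30 evaluates to False
below = df[df['Sales'] < s]    # NaN < 30 is also False, so the row is dropped
inverted = df[~mask]           # ~False is True, so the NaN row is kept

print(below.index.tolist())     # [0]
print(inverted.index.tolist())  # [0, 1]
```

If rows with missing values should be excluded from both subsets, one option is to filter them out first, for example with `df.dropna(subset=['Sales'])`.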
