Select DataFrame Rows Between Two Dates
Introduction
When working with time-series data, it is often necessary to select specific rows based on date ranges. This article explores two methods for achieving this in pandas DataFrames.
Method 1: Boolean Mask
Ensure the date column is a Series with dtype datetime64[ns]:
df['date'] = pd.to_datetime(df['date'])
Create a boolean mask by comparing the column with the start and end dates. As written, > excludes start_date itself, while <= includes end_date:
mask = (df['date'] > start_date) & (df['date'] <= end_date)
Select the sub-DataFrame using the mask:
df.loc[mask]
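The three steps above can be put together as a minimal sketch. The sample data, the column values, and the chosen dates are assumptions made for illustration; they do not come from the original article.

```python
import pandas as pd

# Hypothetical sample data: dates stored as strings, as they often are after reading a CSV.
df = pd.DataFrame({
    "date": ["2021-03-01", "2021-03-02", "2021-03-03", "2021-03-04", "2021-03-05"],
    "value": [10, 20, 30, 40, 50],
})

# Step 1: convert the column to a datetime dtype.
df["date"] = pd.to_datetime(df["date"])

# Step 2: build the mask. The strict ">" excludes the start date itself.
start_date = "2021-03-02"
end_date = "2021-03-04"
mask = (df["date"] > start_date) & (df["date"] <= end_date)

# Step 3: select the matching rows.
selected = df.loc[mask]
print(selected)
```

With these assumed values, only the rows for March 3 and March 4 are selected, because March 2 fails the strict comparison.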
Method 2: DatetimeIndex
Set the date column as the index:
df = df.set_index(['date'])
Slice the DataFrame by label with .loc. On a DatetimeIndex, the slice includes both start_date and end_date:
df.loc[start_date:end_date]
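A short sketch of the index-based approach follows. The sample data is hypothetical, and the sort_index() call is an added precaution that the original steps do not mention: label slicing on a DatetimeIndex is most predictable when the index is sorted.

```python
import pandas as pd

# Hypothetical sample data with one row per day.
df = pd.DataFrame({
    "date": pd.date_range("2021-03-01", periods=5, freq="D"),
    "value": [10, 20, 30, 40, 50],
})

# Move the date column into the index; sorting keeps label slicing well-defined.
df = df.set_index("date").sort_index()

# Label-based slicing on a DatetimeIndex keeps both endpoints.
selected = df.loc["2021-03-02":"2021-03-04"]
print(selected)
```

Here the rows for March 2, 3, and 4 are all returned, since both ends of the slice are inclusive.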
Example
Consider a DataFrame with a date column holding 200 consecutive days starting on 2000-01-01. The following code uses the boolean mask method to select the rows after '2000-06-01', up to and including '2000-06-10':
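import numpy as np
import pandas as pd

# 200 daily rows starting on 2000-01-01, with random values
df = pd.DataFrame({
    'date': pd.date_range('2000-1-1', periods=200, freq='D'),
    'value': np.random.rand(200)
})

# True for dates after June 1 and on or before June 10
mask = (df['date'] > '2000-06-01') & (df['date'] <= '2000-06-10')
result_df = df[mask]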
The result contains the nine rows for June 2 through June 10, 2000. June 1 is left out because the mask compares with a strict >; use >= to include it.
Comparison
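One practical difference between the two methods is how they treat the start date. The sketch below runs both on the same kind of data as the example (200 daily rows from 2000-01-01); the integer value column and the variable names by_mask and by_index are assumptions chosen for illustration.

```python
import pandas as pd

# Same shape of data as the example: 200 consecutive days starting 2000-01-01.
df = pd.DataFrame({
    "date": pd.date_range("2000-01-01", periods=200, freq="D"),
    "value": range(200),
})

start_date, end_date = "2000-06-01", "2000-06-10"

# Method 1: boolean mask; the strict ">" drops the start date.
by_mask = df.loc[(df["date"] > start_date) & (df["date"] <= end_date)]

# Method 2: label slice on a DatetimeIndex; both endpoints are kept.
by_index = df.set_index("date").loc[start_date:end_date]

print(len(by_mask), by_mask["date"].min().date())
print(len(by_index), by_index.index.min().date())
```

The mask returns 9 rows starting on June 2, while the index slice returns 10 rows starting on June 1. The mask leaves the original columns and index untouched and lets you choose each boundary explicitly; the index slice is shorter to write once the dates are in the index.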