Joining Dataframes Based on Value Ranges
Given two dataframes, df_1 and df_2, it is possible to join them such that the datetime column timestamp in df_1 falls within a specified range defined by the columns start and end in df_2.
To accomplish this task, one approach is to create an interval index from the start and end columns, setting the closed option to both to ensure inclusive boundaries. Using this interval index, we can utilize get_loc to obtain the corresponding event for each timestamp in df_1.
Example:
import pandas as pd # Input dataframes df_1 = pd.DataFrame({ 'timestamp': ['2016-05-14 10:54:33', '2016-05-14 10:54:34', '2016-05-14 10:54:35', '2016-05-14 10:54:36', '2016-05-14 10:54:39'], 'A': [0.020228, 0.057780, 0.098808, 0.158789, 0.038129], 'B': [0.026572, 0.175499, 0.620986, 1.014819, 2.384590] }) df_2 = pd.DataFrame({ 'start': ['2016-05-14 10:54:31', '2016-05-14 10:54:34', '2016-05-14 10:54:38'], 'end': ['2016-05-14 10:54:33', '2016-05-14 10:54:37', '2016-05-14 10:54:42'], 'event': ['E1', 'E2', 'E3'] }) # Create interval index df_2.index = pd.IntervalIndex.from_arrays(df_2['start'], df_2['end'], closed='both') # Join dataframes using get_loc df_1['event'] = df_1['timestamp'].apply(lambda x: df_2.iloc[df_2.index.get_loc(x)]['event']) # Output joined dataframe print(df_1)
Output:
timestamp A B event 0 2016-05-14 10:54:33 0.020228 0.026572 E1 1 2016-05-14 10:54:34 0.057780 0.175499 E2 2 2016-05-14 10:54:35 0.098808 0.620986 E2 3 2016-05-14 10:54:36 0.158789 1.014819 E2 4 2016-05-14 10:54:39 0.038129 2.384590 E3
The above is the detailed content of How to Join Pandas DataFrames Based on Timestamp Ranges?. For more information, please follow other related articles on the PHP Chinese website!