In data manipulation tasks, it is often necessary to combine rows from multiple dataframes into a single dataframe. One way to achieve this is a Cartesian product (cross join), which pairs every row of one dataframe with every row of the other.
For Pandas versions >= 1.2, the merge function supports this directly through the how='cross' argument. The following code demonstrates its usage:
import pandas as pd
df1 = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})
df2 = pd.DataFrame({'col3': [5, 6]})
df1.merge(df2, how='cross')
Output:
   col1  col2  col3
0     1     3     5
1     1     3     6
2     2     4     5
3     2     4     6
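As a quick sanity check, a cross merge should return len(df1) * len(df2) rows. The sketch below verifies this for the example dataframes and also shows how overlapping column names can be told apart with the suffixes argument of merge. It assumes Pandas >= 1.2; the suffix strings '_left' and '_right' are arbitrary choices for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})
df2 = pd.DataFrame({'col3': [5, 6]})

# Every row of df1 is paired with every row of df2.
result = df1.merge(df2, how='cross')
assert len(result) == len(df1) * len(df2)

# Crossing a dataframe with itself produces duplicate column names,
# so merge appends suffixes to keep them distinct.
self_cross = df1.merge(df1, how='cross', suffixes=('_left', '_right'))
print(list(self_cross.columns))
```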
For Pandas versions < 1.2, how='cross' is not available, but the same result can be obtained with an ordinary merge. A key column holding the same constant value is added to each dataframe, the dataframes are merged on that key, and only the original columns are kept:
import pandas as pd
df1 = pd.DataFrame({'key': [1, 1], 'col1': [1, 2], 'col2': [3, 4]})
df2 = pd.DataFrame({'key': [1, 1], 'col3': [5, 6]})
pd.merge(df1, df2, on='key')[['col1', 'col2', 'col3']]
Output:
   col1  col2  col3
0     1     3     5
1     1     3     6
2     2     4     5
3     2     4     6
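When the dataframes already exist without a key column, the constant-key approach can be wrapped in a small helper that adds the key temporarily and drops it afterwards. This is a minimal sketch rather than a built-in Pandas feature: the function name cross_join and the default column name '_cross_key' are hypothetical, chosen only for this example.

```python
import pandas as pd

def cross_join(left, right, key='_cross_key'):
    """Cartesian product of two dataframes via a temporary constant key.

    `cross_join` and `_cross_key` are illustrative names, not Pandas API.
    """
    if key in left.columns or key in right.columns:
        raise ValueError(f"temporary key column {key!r} already exists")
    # assign() returns copies, so the input dataframes are not modified.
    return (
        left.assign(**{key: 1})
            .merge(right.assign(**{key: 1}), on=key)
            .drop(columns=key)
    )

df1 = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})
df2 = pd.DataFrame({'col3': [5, 6]})
product = cross_join(df1, df2)
print(product)
```

Because the key is added with assign, df1 and df2 keep their original columns, and the result does not need a manual column selection.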