Applying a Function to Multiple Columns of a Pandas Dataframe
The situation is as follows: a function and a dataframe are defined, and the goal is to apply the function to two specific columns of the dataframe to generate a new column. Passing the function directly to the apply method results in an error, typically because with axis=1 apply hands the function each row as a single Series rather than as two separate values.
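The text does not show the failing call, so the following is only a plausible reconstruction. It assumes that get_sublist takes two arguments and returns an inclusive slice of a list named mylist; that definition and the sample data are assumptions for illustration.

```python
import pandas as pd

mylist = ['a', 'b', 'c', 'd', 'e', 'f']

def get_sublist(sta, end):
    # Assumed definition (not given in the text): inclusive slice of mylist
    return mylist[sta:end + 1]

df = pd.DataFrame({'col_1': [0, 2, 3], 'col_2': [1, 4, 5]})

error = None
try:
    # With axis=1, apply calls get_sublist(row) with a single Series argument,
    # so the second parameter `end` is never supplied.
    df.apply(get_sublist, axis=1)
except TypeError as exc:
    error = exc
    print(type(exc).__name__)
```

The function itself is fine; the mismatch is between its two-argument signature and the one-argument call that apply makes per row.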
To address this issue, there are multiple approaches:
Lambda Expression with Column Names
A concise and readable solution is to use a lambda expression within the apply method:
df['col_3'] = df.apply(lambda x: get_sublist(x.col_1, x.col_2), axis=1)
This approach refers to the columns by name rather than by numerical position, so it is less prone to errors and keeps working if the column order changes.
Example with Example Data
Consider the example data:
df = pd.DataFrame({'ID': ['1', '2', '3'], 'col_1': [0, 2, 3], 'col_2': [1, 4, 5]})
mylist = ['a', 'b', 'c', 'd', 'e', 'f']
Applying the lambda expression shown earlier to this data generates a new column, col_3, containing the desired result:
  ID  col_1  col_2      col_3
0  1      0      1     [a, b]
1  2      2      4  [c, d, e]
2  3      3      5  [d, e, f]
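For reference, here is a self-contained sketch of the whole example. The article does not show the body of get_sublist; the definition below is an assumption inferred from the output above, where col_1 and col_2 act as inclusive start and end positions in mylist.

```python
import pandas as pd

mylist = ['a', 'b', 'c', 'd', 'e', 'f']

def get_sublist(sta, end):
    # Assumed definition: return the elements from position sta to end, inclusive
    return mylist[sta:end + 1]

df = pd.DataFrame({'ID': ['1', '2', '3'], 'col_1': [0, 2, 3], 'col_2': [1, 4, 5]})

# axis=1 makes apply pass each row to the lambda as a Series,
# so the two column values can be pulled out by name.
df['col_3'] = df.apply(lambda x: get_sublist(x.col_1, x.col_2), axis=1)

print(df)
```

Each cell of col_3 holds a Python list, so the column has object dtype.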
Square Brackets for Non-Standard Column Names
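If the column names contain spaces or collide with existing attributes or methods of the row object (each row is passed as a Series, which already has names such as count and index), attribute access does not work and square brackets can be used instead. Here f stands for any two-argument function: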
df['col_3'] = df.apply(lambda x: f(x['col 1'], x['col 2']), axis=1)
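A short sketch of why the brackets matter, under assumed data: the function f and the column named count are hypothetical and chosen only to illustrate both cases (a name with a space, and a name that collides with a Series method).

```python
import pandas as pd

def f(a, b):
    # Hypothetical two-argument function used only for illustration
    return a + b

df = pd.DataFrame({'col 1': [0, 2, 3], 'col 2': [1, 4, 5], 'count': [10, 20, 30]})

# 'col 1' cannot be written as x.col 1, so bracket access is required
df['col_3'] = df.apply(lambda x: f(x['col 1'], x['col 2']), axis=1)

first_row = df.iloc[0]
# Attribute access finds the Series.count method, not the column value
method_found = callable(first_row.count)
# Bracket access always returns the value stored in the 'count' column
value_found = first_row['count']

print(df['col_3'].tolist(), method_found, value_found)
```

Bracket access works for every column name, so it is the safer default when the names are not known in advance.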