How to split a dataframe string column into two columns?
Splitting a string column of a pandas DataFrame into two columns is a common data-analysis task. It raises two questions: how to split each string, and how to turn the resulting pieces into new columns.
The solution lies in the versatile str attribute of a pandas Series, specifically its indexing interface:
df['AB'].str[0]  # accesses the first element of each string
df['AB'].str[1]  # accesses the second element of each string
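As a small illustration of the indexing interface, the sketch below applies it both to plain strings and to the lists returned by str.split(). The sample data is hypothetical; only the column name 'AB' comes from the article.

```python
import pandas as pd

# Hypothetical sample data; the column name 'AB' follows the article.
df = pd.DataFrame({'AB': ['a1-b1', 'a2-b2']})

# On plain strings, .str[i] picks the i-th character of each value.
first_chars = df['AB'].str[0]

# On a Series of lists (the result of str.split), .str[i] picks the i-th list item.
parts = df['AB'].str.split('-')
first_items = parts.str[0]
second_items = parts.str[1]

print(first_chars.tolist())   # ['a', 'a']
print(first_items.tolist())   # ['a1', 'a2']
print(second_items.tolist())  # ['b1', 'b2']
```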
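Because this indexing also works on the lists produced by str.split(), the pieces can be assigned to new columns with tuple unpacking. Older answers unpack the accessor itself (df['A'], df['B'] = df['AB'].str.split('-').str), but iterating over .str was deprecated and no longer works in current pandas releases, so index the pieces explicitly:
parts = df['AB'].str.split('-')
df['A'], df['B'] = parts.str[0], parts.str[1]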
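A runnable sketch of this assignment that indexes the split result explicitly, so it does not rely on iterating over the str accessor. The sample rows are hypothetical and include one value without a delimiter to show what happens when a piece is missing.

```python
import pandas as pd

# Hypothetical sample data; the last row has no '-' delimiter.
df = pd.DataFrame({'AB': ['a1-b1', 'a2-b2', 'a3']})

# Split once, then pick each piece by position.
parts = df['AB'].str.split('-')
df['A'], df['B'] = parts.str[0], parts.str[1]

# Rows whose list is too short get a missing value in 'B'.
print(df)
```

Indexing past the end of a short list yields a missing value rather than an error, so the assignment still succeeds for irregular rows.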
Note that str.split() by itself returns a Series of lists, one list per row, which can be stored as a single column:
df['AB_split'] = df['AB'].str.split('-')
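To expand these lists into separate columns directly, pass expand=True, which makes str.split() return a DataFrame instead. Here n=1 limits the split to the first delimiter, so each row yields at most two pieces:
df[['A', 'B']] = df['AB'].str.split('-', n=1, expand=True)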
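The effect of n=1 together with expand=True can be seen in a minimal sketch, assuming a hyphen delimiter as in the earlier examples and hypothetical sample values:

```python
import pandas as pd

# Hypothetical sample data; the second value contains two delimiters.
df = pd.DataFrame({'AB': ['a1-b1', 'a2-b2-c2']})

# n=1 splits only at the first '-', so everything after it stays in 'B'.
df[['A', 'B']] = df['AB'].str.split('-', n=1, expand=True)

print(df['A'].tolist())  # ['a1', 'a2']
print(df['B'].tolist())  # ['b1', 'b2-c2']
```

Without n=1, the second row would produce three pieces and the result would no longer fit into two target columns.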
When rows split into different numbers of pieces, expand=True pads the shorter rows with missing values, so every row ends up with the same number of columns:
df.join(df['AB'].str.split('-', expand=True).rename(columns={0:'A', 1:'B', 2:'C'}))
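A short sketch of the uneven case, using hypothetical rows that split into two, three, and one piece. The name wide is introduced here only for illustration.

```python
import pandas as pd

# Hypothetical sample data with a different number of pieces per row.
df = pd.DataFrame({'AB': ['a1-b1', 'a2-b2-c2', 'a3']})

# expand=True creates as many columns as the longest split needs.
wide = df['AB'].str.split('-', expand=True).rename(columns={0: 'A', 1: 'B', 2: 'C'})

# join() returns a new DataFrame; df itself is left unchanged.
result = df.join(wide)
print(result)
```

Because join() does not modify df in place, the result has to be assigned to a name if it is needed later.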
With these techniques, pandas lets you split string columns efficiently and reshape a DataFrame to suit your analysis.