Master the basics of reading Excel files with pandas
Excel files are a common data source in data analysis and processing, and Pandas is a powerful data analysis and processing library for Python. It can read Excel files quickly and efficiently and then clean, process, and analyze the data. This article introduces the basic operations for reading Excel files with Pandas and provides concrete code examples so that readers can pick them up quickly.
First, install the Pandas library with pip from the command line. Reading .xlsx files also requires the openpyxl package, which Pandas uses as its engine for that format:
pip install pandas openpyxl
The core tool for reading Excel files in Pandas is the read_excel() function. It can read one or more worksheets from a workbook and supports multiple file formats, such as xls and xlsx.
The following is a simple example of reading an Excel file:
import pandas as pd

# Read the Excel file (the first worksheet by default)
data = pd.read_excel('data.xlsx')

# Print the data
print(data)
The code above reads the Excel file named "data.xlsx" into a DataFrame object and prints the data.
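Because read_excel() can return more than one worksheet at a time, a short sketch of that case may help. It assumes openpyxl is installed, and it builds a small workbook in memory so that it runs without any file on disk. The sheet names "People" and "Cities" and the sample values are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for a real workbook.
people = pd.DataFrame({"Name": ["Ann", "Bob"], "Age": [25, 17]})
cities = pd.DataFrame({"City": ["Oslo", "Lima"], "Country": ["Norway", "Peru"]})

# Write both frames to an in-memory .xlsx workbook (requires openpyxl).
buffer = io.BytesIO()
with pd.ExcelWriter(buffer, engine="openpyxl") as writer:
    people.to_excel(writer, sheet_name="People", index=False)
    cities.to_excel(writer, sheet_name="Cities", index=False)
buffer.seek(0)

# sheet_name=None asks read_excel() for every worksheet;
# the result is a dict that maps sheet names to DataFrames.
sheets = pd.read_excel(buffer, sheet_name=None, engine="openpyxl")

for name, frame in sheets.items():
    print(name, frame.shape)
```

With a real file, the buffer would simply be replaced by the file path. Passing a list such as sheet_name=['People', 'Cities'] returns the same kind of dict restricted to those sheets.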
After reading the Excel file, we can select the worksheets and columns we need for further analysis and processing. Pandas provides several ways to select data: by worksheet name, by column name, or by row and column index.
The following is an example of selecting a worksheet and columns:
import pandas as pd

# Read the worksheet named "Sheet1" from the Excel file
data = pd.read_excel('data.xlsx', sheet_name='Sheet1')

# Select the columns of interest
selected_data = data[['Name', 'Age', 'Gender']]

# Print the data
print(selected_data)
The code above reads the worksheet named "Sheet1" from the Excel file, selects the three columns "Name", "Age", and "Gender", and prints the result.
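Selection by row and column index, mentioned earlier, can be sketched with loc (label-based) and iloc (position-based). The DataFrame below is hypothetical data in the same shape as the "Sheet1" example, built in memory so the snippet runs without an Excel file.

```python
import pandas as pd

# Hypothetical data with the same columns as the "Sheet1" example.
data = pd.DataFrame(
    {
        "Name": ["Ann", "Bob", "Cid"],
        "Age": [25, 17, 40],
        "Gender": ["Female", "Male", "Male"],
    }
)

# Label-based selection: all rows, two columns chosen by name.
by_label = data.loc[:, ["Name", "Age"]]

# Position-based selection: first two rows, first two columns.
by_position = data.iloc[:2, :2]

print(by_label)
print(by_position)
```

When only a few columns are needed, read_excel() also accepts a usecols argument, which avoids loading the other columns in the first place.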
Filtering data is a common operation in data analysis. Pandas provides several ways to filter data, such as Boolean indexing or the query() method.
The following is an example of filtering data:
import pandas as pd

# Read the worksheet named "Sheet1" from the Excel file
data = pd.read_excel('data.xlsx', sheet_name='Sheet1')

# Keep rows where Age is over 18 and Gender is 'Male'
filtered_data = data[(data['Age'] > 18) & (data['Gender'] == 'Male')]

# Print the data
print(filtered_data)
The code above reads the worksheet named "Sheet1" from the Excel file, keeps the rows where the age is greater than 18 and the gender is male, and prints the result.
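The same filter can also be written with query(), the other approach mentioned above. This is a minimal sketch on hypothetical in-memory data rather than on the contents of data.xlsx.

```python
import pandas as pd

# Hypothetical data with the same columns as the "Sheet1" example.
data = pd.DataFrame(
    {
        "Name": ["Ann", "Bob", "Cid", "Dan"],
        "Age": [25, 17, 40, 30],
        "Gender": ["Female", "Male", "Male", "Male"],
    }
)

# Same condition as the Boolean-index example, written as a query string.
filtered = data.query("Age > 18 and Gender == 'Male'")

# Boolean indexing for comparison; both approaches select the same rows.
boolean_filtered = data[(data["Age"] > 18) & (data["Gender"] == "Male")]

print(filtered)
print(filtered.equals(boolean_filtered))
```

query() tends to read more cleanly when the condition is long, while Boolean indexing works with any column name, including names that are not valid Python identifiers.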
Once the required data is selected, you can run calculations and analyses on it, such as the sum, mean, and standard deviation. Pandas provides built-in methods for these operations, such as sum(), mean(), and std().
The following is an example of data calculation and analysis:
import pandas as pd

# Read the worksheet named "Sheet1" from the Excel file
data = pd.read_excel('data.xlsx', sheet_name='Sheet1')

# Keep rows where Age is over 18 and Gender is 'Male'
filtered_data = data[(data['Age'] > 18) & (data['Gender'] == 'Male')]

# Compute the mean and standard deviation of Age
age_mean = filtered_data['Age'].mean()
age_std = filtered_data['Age'].std()

# Print the results
print('Average Age:', age_mean)
print('Standard Deviation of Age:', age_std)
The code above reads the worksheet named "Sheet1" from the Excel file, keeps the rows where the age is greater than 18 and the gender is male, calculates the mean and standard deviation of the age, and prints the results.
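Several statistics can also be computed in one call with agg(). The sketch below uses hypothetical in-memory data. Note that std() in Pandas returns the sample standard deviation (ddof=1) by default.

```python
import pandas as pd

# Hypothetical data with the same columns as the "Sheet1" example.
data = pd.DataFrame(
    {
        "Name": ["Ann", "Bob", "Cid", "Dan"],
        "Age": [25, 17, 40, 30],
        "Gender": ["Female", "Male", "Male", "Male"],
    }
)

# Same filter as in the example above.
males = data[(data["Age"] > 18) & (data["Gender"] == "Male")]

# agg() applies each named statistic to the column and returns
# a Series indexed by the statistic names.
summary = males["Age"].agg(["sum", "mean", "std"])

print(summary)
```

For a quick overview of a numeric column, describe() reports the count, mean, standard deviation, minimum, quartiles, and maximum in the same way.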
This article introduced the basic operations for reading Excel files with Pandas and provided a code example for each step. With these basics, readers can load Excel data and go on to clean, analyze, and process it in practical applications.