
Pandas.
pandas (often expanded as Python Data Analysis Library) is a scientific computing tool built on NumPy. Its biggest feature is that it lets us work with structured data much as we would with tables in a database, so it supports many complex, high-level operations and can be thought of as an enhanced version of NumPy. It can easily build a complete dataset from a CSV or Excel file, and it offers many interfaces for batch computation at the table level.
Installation and usage
Like almost all Python packages, pandas can be installed with pip. If you have installed the Anaconda suite, libraries such as NumPy and pandas are already included. If not, a single command completes the installation:
pip install pandas
Like Numpy, we usually give it an alias when using pandas. The alias of pandas is pd. Therefore, the convention for using pandas is:
import pandas as pd
If this line runs without an error, pandas is installed. Two other packages are commonly used together with pandas. One is SciPy, another scientific computing package, and the other is Matplotlib, a toolkit for visualizing data. We can install both with pip as well. Later articles will briefly introduce their usage when they come up.
pip install scipy matplotlib
Series index
pandas has two main data structures: Series and DataFrame. A Series is a one-dimensional data structure, which can be understood simply as a one-dimensional array or vector. A DataFrame is a two-dimensional data structure, which can be understood as a table or a two-dimensional array.
Let's look at Series first. A Series stores two things: an array of data, and the index (or labels) for that data. We can create a simple Series and print it to see this.
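As a minimal sketch of this step, the snippet below builds a Series from a plain Python list and prints it. The values are made up for illustration.

```python
import pandas as pd

# Build a Series from a list; the numbers are arbitrary sample data.
s = pd.Series([1, 3, 5, 7])

# Printing shows the index labels on the left and the values on the right.
print(s)
```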
In the printed output, the first column is the index and the second column is the values.
The values attribute of a Series is a NumPy array. This is not surprising: as mentioned earlier, pandas is a scientific computing library built on NumPy, which serves as its underlying layer. The printed index shows that it is a RangeIndex, along with its start, stop, and step.
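The two parts of a Series can be inspected separately. The sketch below, using the same made-up data, reads the values and index attributes.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 3, 5, 7])

values = s.values  # the underlying data as a NumPy array
index = s.index    # the default index generated by pandas

print(type(values))
print(index)  # RangeIndex(start=0, stop=4, step=1)
```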
index is an optional parameter of the Series constructor. If we leave it out, pandas generates a RangeIndex by default, which is simply the row number of each item. We can also specify the index ourselves by passing the index parameter when creating the Series.
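A short sketch of passing a custom index follows. The string labels are arbitrary and chosen only for illustration.

```python
import pandas as pd

# Pass explicit labels through the index parameter.
s = pd.Series([1, 3, 5, 7], index=["a", "b", "c", "d"])

print(s)
print(s.index)  # an Index of labels rather than a RangeIndex
```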

When we specify the index of the character type, the result returned by index is no longer RangeIndex but Index. Note that pandas internally distinguishes between numeric indexes and character indexes.
The natural use of an index is to look up elements. We can use an index label directly as a subscript, just like a position in an array. We can also pass a list of labels to query several values at once.
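The lookups described above might look like this, again with made-up labels and values.

```python
import pandas as pd

s = pd.Series([1, 3, 5, 7], index=["a", "b", "c", "d"])

# A single label returns a single value.
single = s["b"]

# A list of labels returns a new Series with just those entries.
several = s[["a", "d"]]

print(single)
print(several)
```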

Duplicate index labels are also allowed. When we query by a duplicated label, we get multiple results.
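A small sketch with a repeated label shows this behavior. The label "a" is deliberately duplicated.

```python
import pandas as pd

# The label "a" appears twice.
s = pd.Series([1, 3, 5, 7], index=["a", "a", "b", "c"])

# Querying a duplicated label returns a Series holding every match.
dup = s["a"]
print(dup)
```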


Series calculation
A Series supports addition, subtraction, multiplication, and division, and each operation is applied to the entire Series at once.
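For example, arithmetic with a scalar applies to every element. The data below is made up.

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0])

# Each operator is applied element by element and returns a new Series.
doubled = s * 2
shifted = s + 10
halved = s / 2

print(doubled)
print(shifted)
print(halved)
```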
We can also use NumPy's functions for more complex mathematical operations. Applied to the Series itself, a NumPy universal function returns a Series with the index preserved; applied to its values attribute, the result is a plain NumPy array.
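The sketch below applies np.sqrt to a small made-up Series. In recent pandas versions, calling a NumPy universal function on the Series returns a Series with its index kept, while calling it on s.values returns a NumPy array.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, 4.0, 9.0], index=["a", "b", "c"])

# Applied to the Series: the result is a Series and the labels are kept.
roots = np.sqrt(s)

# Applied to the underlying array: the result is a plain NumPy array.
raw = np.sqrt(s.values)

print(roots)
print(raw)
```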

Because a Series has an index, we can also check whether a label is in the Series with the in operator, just as we would with a dict:
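A minimal sketch of this membership check follows. Note that, like a dict, the check looks at the labels and not at the values.

```python
import pandas as pd

s = pd.Series([1, 3, 5, 7], index=["a", "b", "c", "d"])

has_a = "a" in s  # label present
has_z = "z" in s  # label absent
has_3 = 3 in s    # 3 is a value, not a label, so this is False

print(has_a, has_z, has_3)
```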

A Series has an index and values, a structure much like the keys and values of a dict, so Series also supports initialization from a dict:
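For instance, with a hypothetical dict of scores (the names and numbers are invented for illustration):

```python
import pandas as pd

# Keys become index labels and dict values become Series values.
scores = {"alice": 90, "bob": 85, "carol": 77}
s = pd.Series(scores)

print(s)
```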

A Series created this way keeps the order in which the keys are stored in the dict. We can also specify index when creating it, which lets us control the order.
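The sketch below passes both a dict and an index. The dict is the same invented example, and the index deliberately includes a key, "dave", that is not in the dict.

```python
import pandas as pd

scores = {"alice": 90, "bob": 85, "carol": 77}

# The index decides which keys are kept and in what order.
# "bob" is left out, and "dave" has no value in the dict.
s = pd.Series(scores, index=["carol", "alice", "dave"])

print(s)
```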

If the index we specify includes a key that does not appear in the dict, no corresponding value can be found, so the Series records it as NaN (Not a Number), which can be understood as an invalid or missing value. When processing features or training data, we often find that some entries are missing a certain feature. We can use the pandas isnull and notnull functions to check for missing values.
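A short sketch of the two top-level functions, using the same invented data with one missing entry:

```python
import pandas as pd

scores = {"alice": 90, "bob": 85, "carol": 77}
s = pd.Series(scores, index=["carol", "alice", "dave"])

# Both functions return a boolean Series with the same index.
missing = pd.isnull(s)
present = pd.notnull(s)

print(missing)
print(present)
```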

A Series also has isnull and notnull methods of its own, which we can call directly.
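The method form gives the same result as the top-level functions. As one possible use, the boolean result can filter out the missing entries. The data is again invented.

```python
import pandas as pd

scores = {"alice": 90, "bob": 85, "carol": 77}
s = pd.Series(scores, index=["carol", "alice", "dave"])

# Method form of the null check.
missing = s.isnull()

# Keep only the entries that have a value.
cleaned = s[s.notnull()]

print(missing)
print(cleaned)
```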

Finally, the index of a Series can also be modified. We can assign a new value to it directly:
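A minimal sketch of replacing the index in place follows. The new labels are arbitrary, and the new index must have the same length as the Series.

```python
import pandas as pd

s = pd.Series([1, 3, 5, 7])

# Replace the default RangeIndex with custom labels of the same length.
s.index = ["w", "x", "y", "z"]

print(s)
```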

Summary
At its core, a Series in pandas is a layer of encapsulation over a NumPy one-dimensional array, with added features such as indexing. We can therefore think of a DataFrame as an encapsulation of a collection of Series, with more data-processing functions added. Once we grasp this core structure, understanding pandas as a whole is much more useful than memorizing its APIs one by one.
pandas is a great tool for data processing in Python. For an algorithm engineer it is almost a must-know, and it is the foundation for using Python for machine learning and deep learning. According to survey data, about 70% of an algorithm engineer's daily work goes into data processing, and less than 30% is spent actually implementing and training models, which shows how important data processing is. If you want to work in the industry, learning models alone is not enough.







