Operations and maintenance engineers have a hard time in the early stage. During this period you may be repairing computers, plugging in network cables, and moving machines, which feels like work with no status. Your time is fragmented too, with all kinds of trivial matters surrounding you, making it difficult to show your personal value. Gradually you become confused about the industry and feel it offers no future.
These dull, tedious jobs can indeed be discouraging. From a technical perspective, though, they are basic skills that will invisibly support your later operations work. I have been through this stage myself, so I understand it deeply. Keep a positive attitude and keep learning during this period; one day, I believe, it will pay you back!
Okay, let's get to the point. Based on my many years of operations experience, I will share the learning route of a senior operations engineer.
To begin with, you need to be familiar with installing the Linux and Windows operating systems, their directory structures, their boot processes, and so on.
Focus on the Linux system. In a production environment, work is done almost entirely at the command line, so you must master the dozens of commonly used basic management commands, covering user management, disk partitioning, package management, file permissions, text processing, process management, performance-analysis tools, and so on.
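As a minimal sketch of a few of those basics (file permissions, text processing, process and disk inspection), run in a scratch directory:

```shell
# Scratch directory keeps the demo self-contained
tmpdir=$(mktemp -d)
echo "hello ops" > "$tmpdir/note.txt"

chmod 640 "$tmpdir/note.txt"                      # file permissions: rw-r-----
perms=$(stat -c '%a' "$tmpdir/note.txt")          # numeric mode, e.g. 640

count=$(grep -c "ops" "$tmpdir/note.txt")         # text processing: count matching lines
procs=$(ps -e | wc -l)                            # process management: rough process count
disk=$(df -P "$tmpdir" | awk 'NR==2 {print $5}')  # disk usage of this filesystem

echo "mode=$perms matches=$count processes=$procs used=$disk"
rm -rf "$tmpdir"
```

(`stat -c` assumes GNU coreutils, i.e. a typical Linux box.)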
You must be familiar with the OSI and TCP/IP models, and you need to know the basic concepts and working principles of switches and routers.
Master the basic syntax of the shell and be able to write simple scripts.
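For instance, a simple script might look like the following sketch, which reports root-filesystem usage against a threshold (the 80% limit is an arbitrary example):

```shell
#!/bin/bash
# Report filesystem usage against a threshold (example value)
threshold=80

check_usage() {
    # df -P gives stable POSIX output; awk strips the trailing '%'
    df -P "$1" | awk 'NR==2 {gsub(/%/, "", $5); print $5}'
}

usage=$(check_usage /)
if [ "$usage" -ge "$threshold" ]; then
    echo "WARNING: / is ${usage}% full"
else
    echo "OK: / is ${usage}% full"
fi
```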
You must be able to deploy the most commonly used network services, such as vsftpd, NFS, Samba, BIND, and DHCP.
A version control system is indispensable. Learn the mainstream options, SVN and Git, and be able to deploy and use them at a basic level.
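A hedged sketch of the basic Git workflow (the repository and identity below are throwaway examples):

```shell
# Create a throwaway repository and make one commit
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "ops@example.com"   # placeholder identity for the demo
git config user.name "ops"

echo "v1" > app.conf
git add app.conf
git commit -qm "first commit"

commits=$(git rev-list --count HEAD)
echo "commits: $commits"
```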
You will frequently transfer data between servers, so you must be able to use rsync and scp.
Real-time data synchronization: inotify/sersync.
For tasks that repeat on a schedule, write a script and run it periodically, so you must be able to configure the Linux scheduled-task service, crond.
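As an illustration, two crontab entries (edit with `crontab -e`; the script paths are hypothetical):

```
# minute hour day-of-month month day-of-week  command
0 2 * * *   /usr/local/bin/backup.sh >> /var/log/backup.log 2>&1
*/5 * * * * /usr/local/bin/check_service.sh
```

The first runs at 02:00 every day; the second runs every 5 minutes.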
Almost every company has a website, and to keep a website running you need to build a web service platform.
If the site is developed in PHP, you will usually build a LAMP or LNMP platform. Each acronym bundles several technologies; taken separately, you must be able to deploy Apache, Nginx, MySQL, and PHP.
If it is developed in Java, Tomcat usually runs the project. To improve access speed, you can put Nginx in front of Tomcat as a reverse proxy: Nginx handles static pages and Tomcat handles dynamic pages, achieving static/dynamic separation.
It is not as simple as deployment alone: you also need to know how the HTTP protocol works and how to do simple performance tuning.
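A minimal sketch of such a static/dynamic split in Nginx configuration (the domain, paths, and Tomcat address are placeholders):

```nginx
upstream tomcat_backend {
    server 127.0.0.1:8080;            # local Tomcat instance
}
server {
    listen 80;
    server_name www.example.com;      # placeholder domain

    # Static files are served directly by Nginx
    location ~* \.(html|css|js|png|jpg|gif)$ {
        root /usr/share/nginx/html;
        expires 7d;
    }

    # Everything else is proxied to Tomcat (dynamic pages)
    location / {
        proxy_pass http://tomcat_backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```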
The database is MySQL, the most widely used open-source database in the world, and you cannot go wrong learning it. You also need to know basic SQL statements, user management, the common storage engines, and database backup and recovery.
To go deeper, you must know master-slave replication, performance optimization, and the mainstream cluster solutions such as MHA and MGR. NoSQL is also very popular; just learn Redis and MongoDB.
Security is very important. Don't wait until the system has been compromised to write security policies; by then it is too late! Security access-control policies should be applied as soon as a server goes online, for example using iptables to allow access only from trusted source IPs and closing unused services and ports.
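A sketch of such a policy in iptables-restore format (the trusted subnet and ports are placeholders; load with `iptables-restore < /etc/iptables.rules`):

```
*filter
:INPUT DROP [0:0]
:FORWARD DROP [0:0]
:OUTPUT ACCEPT [0:0]
-A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
-A INPUT -i lo -j ACCEPT
-A INPUT -p tcp --dport 22 -s 10.0.0.0/24 -j ACCEPT
-A INPUT -p tcp --dport 80 -j ACCEPT
COMMIT
```

The default INPUT policy is DROP, SSH is allowed only from the trusted subnet, and port 80 stays open for public web traffic.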
You must also know the common attack types, otherwise how can you prescribe the right remedy? Examples include CC, DDoS, and ARP attacks.
Monitoring is essential; it is the lifeline for discovering and tracing problems in time. You can learn the mainstream open-source monitoring system Zabbix, which is rich in features and meets basic monitoring needs. Monitoring targets include basic server resources, interface status, service performance, PV/UV, logs, and so on.
You can also build a dashboard to display a few key real-time metrics, for example with Grafana, which looks very cool.
Shell scripting is Linux's power tool for automating work, and you must write it proficiently, so study it further: functions, arrays, signals, sending email, and so on.
You have to be very good with the three musketeers of text processing (grep, sed, awk); text processing on Linux depends on them.
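A small sketch of the three at work on a fabricated access log:

```shell
# Fabricated access log for the demo
logfile=$(mktemp)
cat > "$logfile" <<'EOF'
10.0.0.1 GET /index.html 200
10.0.0.2 GET /missing 404
10.0.0.1 POST /login 200
EOF

errors=$(grep -c ' 404$' "$logfile")               # grep: filter/count lines
sed -n 's/^10\.0\.0\.1 /client1 /p' "$logfile"     # sed: edit the stream in flight
# awk: per-field aggregation, here "which IP appears most often"
top_ip=$(awk '{n[$1]++} END {for (ip in n) if (n[ip] > max) {max = n[ip]; top = ip}; print top}' "$logfile")

echo "404s: $errors, busiest client: $top_ip"
rm -f "$logfile"
```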
Shell scripts can only handle fairly basic tasks. To handle more complex ones, such as calling APIs or running multiple processes, you need to learn a higher-level language.
Python is the most used language in the operations field; it is simple and easy to use, and you cannot go wrong learning it. At this stage you only need to master the basics: basic syntax, file objects, functions, iterable objects, exception handling, sending email, database programming, and so on.
Users always complain that the website is slow, yet server resources remain largely idle. Slow access is not necessarily caused by saturated server resources; many factors can be involved, such as the network and the number of forwarding layers.
On the network side there is the north-south (cross-ISP) interconnection problem, which makes access between the two networks slow. This can be solved with a CDN, which also caches static pages, intercepting requests as close to the user as possible and reducing back-end requests and response times.
If you do not use a CDN, you can deploy a caching service such as Squid, Varnish, or Nginx at the traffic entrance to cache static pages.
After all, a single server has limited resources and cannot sustain high traffic. The key technology for solving this is the load balancer: scale out multiple web servers horizontally and have them serve requests simultaneously, multiplying capacity. The mainstream open-source load-balancing technologies are LVS, HAProxy, and Nginx. Be sure to get familiar with one or two of them!
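For example, Nginx load balancing across several web servers can be sketched like this (the IPs are placeholders):

```nginx
upstream web_pool {
    server 192.168.1.11:80 weight=2;   # receives twice the share of requests
    server 192.168.1.12:80;
    server 192.168.1.13:80 backup;     # used only when the others are down
}
server {
    listen 80;
    location / {
        proxy_pass http://web_pool;
    }
}
```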
With the web tier's bottleneck solved, the database becomes the critical point, and a cluster serves better than a single instance. Take the MySQL we just learned: use a one-master, multiple-slave architecture with read/write separation on top. The master handles writes; the slaves handle reads and can be scaled horizontally behind a layer-4 load balancer. Such a setup can carry tens of millions of PVs. Perfect!
High-availability software is also required to avoid single points of failure. The mainstream tools are Keepalived, Heartbeat, and the like.
What if the website has a huge number of images, and NFS shared storage cannot keep up and processes them slowly? Easy: use a distributed file system, which offers parallel task processing, no single point of failure, high reliability, and high performance. The mainstream options include FastDFS, MFS, HDFS, Ceph, and GFS. In the early stages I recommend learning FastDFS, which meets the needs of small and medium projects.
Hardware server utilization is often very low, which is wasteful! A server with a lot of idle capacity can be virtualized into many virtual machines, each a complete operating system, greatly improving resource utilization. I recommend learning the open-source KVM together with the OpenStack cloud platform.
A virtual machine is fine as a base platform, but it is too heavyweight for elastic scaling of application workloads: it takes minutes to start, and its image is so large that rapid scale-out is laborious.
Easily solved: use containers. Their main features are rapid deployment and environment isolation. Package a service into an image, and hundreds of containers can be created within minutes.
The mainstream container technology is none other than Docker.
Of course, standalone Docker usually cannot meet business needs in production. You can deploy Kubernetes or Swarm to manage containers as a cluster, forming a large resource pool under centralized management and providing strong support for the infrastructure.
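As a minimal illustration, a Dockerfile that packages a static site into an Nginx image (the paths are placeholders):

```dockerfile
FROM nginx:1.24-alpine
# Copy the site's static files into the image
COPY ./site/ /usr/share/nginx/html/
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]
```

Build and run with `docker build -t mysite .` and `docker run -d -p 80:80 mysite`.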
Repetitive work neither improves efficiency nor reflects your value.
Standardize all operations work: unify environment versions, directory structures, operating systems, and so on. On the basis of standardization, much more can be automated, and a complex task becomes a mouse click or a few commands. So refreshing!
Therefore, automate every operation you can, reducing human error and improving efficiency.
Mainstream centralized server-management tools: Ansible and SaltStack.
Just choose either one of the two.
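Taking Ansible as an example, a minimal playbook might look like this (the `webservers` inventory group and playbook are hypothetical):

```yaml
# Install and start Nginx on every host in the "webservers" group
- hosts: webservers
  become: yes
  tasks:
    - name: Install nginx
      yum:
        name: nginx
        state: present
    - name: Ensure nginx is running and enabled at boot
      service:
        name: nginx
        state: started
        enabled: yes
```

Run it with `ansible-playbook -i hosts nginx.yml`.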
Continuous integration tool: Jenkins
5. Advanced Python development
Go further with Python development and master object-oriented programming.
Ideally, learn a web framework such as Django or Flask and use it to build websites, mainly in order to develop an operations management system: move complex procedures into the platform, then integrate your centralized management tools to create an operations platform all your own.
Logs are also very important. Analyzing them regularly can uncover potential risks and extract valuable information.
An open-source log system: ELK (Elasticsearch, Logstash, Kibana).
Learn to deploy and use it, and provide log-viewing facilities for the development team.
Deployment alone is not enough; performance optimization is what maximizes service capacity.
This area is also relatively difficult and is one of the keys to a high salary. For the money, you have to study hard!
You can think along four dimensions: the hardware layer, the operating-system layer, the software layer, and the architecture layer.
At each stage, set a goal.
For example: first set a small, achievable goal, say, earn 100 million!
Learn to share. The value of technology lies in transferring knowledge effectively to others and letting more people know it.
If everyone brings something to the table, think about what will happen.
If you go in the right direction, you won’t have to worry about the long road ahead!
The GNU Project, publicly launched by Richard Stallman on September 27, 1983, is a free-software collective whose goal is to create a completely free operating system. GNU is also known as the Free Software Engineering Project.
The GPL is the GNU General Public License, an embodiment of "copyleft" (the reverse of copyright) and one of the GNU licenses. Its purpose is to protect the freedom to use, copy, study, modify, and publish GNU software, and it requires that such software be released in source-code form.
Combining the GNU system with the Linux kernel forms a complete operating system: a Linux-based GNU system. This operating system is usually called "GNU/Linux", or Linux for short.
A typical Linux distribution includes: Linux kernel, some GNU program libraries and tools, command line shell, graphical interface X Window system and corresponding desktop environment, such as KDE or GNOME, and contains thousands of applications ranging from office suites, compilers, and text editors to scientific tools.
Mainstream distributions:
Red Hat Enterprise Linux, CentOS, SUSE, Ubuntu, Debian, Fedora, Gentoo
Linux is modeled on Unix and belongs to the Unix family. A Unix-like operating system supports multiple users, multitasking, multithreading, and multiple CPU architectures. Linux inherits Unix's network-centric design philosophy and is a stable, multi-user network operating system.
The swap partition, that is, the swap area, is what the system falls back on when physical memory is insufficient: part of the hard disk is released for use as memory by currently running programs. When a swapped-out program needs to run again, its saved data is restored from the swap partition into memory. The programs whose memory is released are generally those that have been inactive for a long time.
Swap space should generally be greater than or equal to the size of physical memory, with a minimum of no less than 64 MB and a maximum of twice physical memory.
GNU GRUB (GRand Unified Bootloader, "GRUB" for short) is a boot manager for multiple operating systems from the GNU Project.
GRUB is a boot manager that supports multiple operating systems. In a computer with multiple operating systems, GRUB can be used to select the operating system that the user wants to run when the computer starts. At the same time, GRUB can boot different kernels on the Linux system partition, and can also be used to pass startup parameters to the kernel, such as entering single-user mode.
Cache is a small, fast memory that sits between the CPU and main memory; its capacity is much smaller than memory's, but its exchange speed is much faster. By caching data blocks, it resolves the conflict between CPU speed and memory read/write speed and improves the rate of data exchange between CPU and memory. The larger the cache, the faster the CPU can process.
A buffer caches disk (I/O device) data blocks to speed up access to data on disk, reduce I/O, and improve data exchange between memory and the hard disk (or other I/O devices). Put simply, a buffer holds data about to be written to disk, while a cache holds data read from disk.
(1) The requesting end sends a SYN packet (SYN=A) and waits for the responder's confirmation.
(2) The responder receives the SYN and returns ACK=A+1 together with its own SYN packet (SYN=K) to the requester.
(3) The requesting end receives the responder's SYN+ACK packet and sends a final acknowledgment packet (ACK=K+1) back to the responder.
At this point the requesting and responding ends have established a TCP connection, the three-way handshake is complete, and data transmission begins.
The Linux file system adopts a tree directory structure with links, that is, there is only one root directory (usually represented by "/"). It contains information about lower-level subdirectories or files; subdirectories can also contain information about lower-level subdirectories or files.
/: The root of the first hierarchical structure, the root directory of the entire file system hierarchy. That is, the entrance to the file system, the highest level directory.
/boot: Contains the files required for the Linux kernel and system boot program, such as kernel and initrd; the grub system boot manager is also in this directory.
/bin: The commands required by the basic system have similar functions to "/usr/bin". The files in this directory are all executable. Ordinary users can also execute them.
/sbin: Basic system maintenance commands, which can only be used by super users.
/etc: All system configuration files.
/dev: Device file storage directory. Such as terminal, disk, optical drive, etc.
/var: Stores frequently changing data, such as logs, emails, etc.
/home: The default storage directory for ordinary users.
/opt: The storage directory for third-party software. For example, user-defined software packages and compiled software packages are installed in this directory.
/lib: Directory where library files and kernel modules are stored, including all shared library files required by system programs.
Hard link: hard links share the same index node (inode number); that is, multiple file names are allowed to point to the same file inode. Hard links cannot be made to directories and cannot cross partitions. Deleting one hard link affects neither the inode's source file nor the other hard links to it.
Soft link (symbolic link): a symbolic link is created in the form of a path, similar to a Windows shortcut, and multiple file names can be linked to the same source file. If the source file is deleted, all soft links pointing to it become unusable. (Soft links support directories and can cross partitions and file systems.)
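The difference can be verified directly from the shell (a sketch using a scratch directory; `stat -c` assumes GNU coreutils):

```shell
workdir=$(mktemp -d)
echo "data" > "$workdir/source.txt"

ln "$workdir/source.txt" "$workdir/hard.txt"      # hard link: same inode
ln -s "$workdir/source.txt" "$workdir/soft.txt"   # soft link: a path reference

src_inode=$(stat -c '%i' "$workdir/source.txt")
hard_inode=$(stat -c '%i' "$workdir/hard.txt")

rm "$workdir/source.txt"                          # delete the original name
hard_ok=$(cat "$workdir/hard.txt")                # hard link still reads "data"
if cat "$workdir/soft.txt" >/dev/null 2>&1; then  # soft link now dangles
    soft_state=alive
else
    soft_state=broken
fi
echo "inodes match: $src_inode vs $hard_inode; hard=$hard_ok soft=$soft_state"
rm -rf "$workdir"
```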
Disk array (RAID, Redundant Array of Independent Disks): a redundant array of inexpensive (independent) disks.
RAID is a technology that combines multiple independent physical hard disks in different ways into a hard disk group (a logical disk), providing higher storage performance and data backup than a single disk. RAID can combine multiple disks into one logical volume, providing disk spanning; it can split data into blocks and write/read multiple disks in parallel, increasing access speed; and it can provide fault tolerance through mirroring or parity. Which of these functions you get depends on the RAID level.
From the user's perspective, the disk group composed of RAID is like a hard disk, which can be partitioned, formatted and other operations. The storage speed of RAID is much higher than that of a single hard disk, and it can provide automatic data backup and good fault tolerance.
RAID 0: called striping, all disks are read and written fully in parallel. It is the simplest form of disk array, requires only two or more hard disks, is low-cost, and delivers the performance and throughput of the entire set of disks. However, RAID 0 provides no data redundancy or error recovery, so damage to a single disk loses all data. (RAID 0 simply increases disk capacity and performance without any reliability guarantee for the data, and suits environments where data security is not critical.)
RAID 1: mirrored storage. Data on one disk is mirrored to another, providing redundancy; mutual backups are kept on the two disks, and usable capacity equals that of a single disk. When data is written to one disk, a mirror copy is produced on the other, maximizing reliability and repairability without affecting performance; when the original disk is busy, data can be read directly from the mirror (from the faster of the two disks), improving read performance. Conversely, RAID 1 writes more slowly. RAID 1 generally supports hot swapping: disks in the array can be removed or replaced while the system is running, without interruption. RAID 1 has the highest per-disk cost of any RAID level, but it provides high data security, reliability, and availability: when a disk fails, the system automatically switches to the mirror disk for reads and writes, with no need to reorganize the failed data.
RAID 0+1: also known as RAID 10, it is a combination of RAID 0 and RAID 1. Data is split continuously in units of bits or bytes and read/written in parallel across multiple disks, while each disk carries its own physical mirror for redundancy, so one or more disk failures do not affect data availability and read/write speeds stay high. RAID 0+1 requires at least four hard disks to build the mirrored stripe set. It provides both high data reliability and efficient data reads/writes.
RAID 5: a storage solution that balances storage performance, data security, and cost; it can be understood as a compromise between RAID 0 and RAID 1. RAID 5 requires at least three hard disks. It provides data security, though the protection is weaker than mirroring while disk-space utilization is higher. RAID 5 reads at a speed close to RAID 0 but additionally maintains parity information, so writes are slightly slower than to a single disk. Because multiple data blocks share one block of parity, RAID 5 uses disk space more efficiently than RAID 1 and costs relatively little, making it a widely used solution.