#For the full stack, database skills are indispensable, relational database or nosql, memory database or partial disk storage database, object There are many kinds of stored databases or graph databases, but the first essential skill should be MySQL. From the rise of LAMP, to the emergence of Mariadb, and even the arrival of PG, skilled MySQL skills are of great use.
There are many aspects of MySQL database technology. Here we only cover the necessary performance tuning. We recommend bottom-up performance tuning, which mainly includes operating environment, configuration parameters, SQL performance, and system architecture design tuning. .
This is the world of Linux, and the tuning of the MySQL running environment is often completed together with the Linux kernel tuning. Of course, it also has a certain reference role for cloud service RDS.
The overall goal of the IO scheduler is to make the disk head always move in one direction, and then move in the opposite direction. direction, this is exactly the elevator model in real life, so the IO scheduler is also called an elevator, and the corresponding algorithm is also called the elevator algorithm. There are several elevator algorithms for IO scheduling in Linux, one It’s called as (Anticipatory), one is called cfq (Complete Fairness Queueing), one is called deadline, and one is called noop (No Operation).
IO has a greater impact on the database. The default IO scheduling algorithm of Linux is cfq needs to be modified to deadline. If it is an SSD or PCIe-SSD device, it needs to be modified to noop. You can use the following two modification methods.
1. Online dynamic modification will fail after restarting.
echo “deadline” > /sys/block/sda/queue/scheduler
2. Modify /etc/grub.conf and make it permanent.
Modify the /etc/grub.conf configuration file and add a configuration in the kernel line, for example:
elevator=deadline
Mainly focus on the elevator parameter. If you set the kernel, you need to restart the system to take effect.
The NUMA of the new generation architecture is not suitable for running databases. NUMA is for improving memory utilization, but it may cause a CPU to have insufficient memory. The remaining one is not enough, and a swap problem occurs. Therefore, it is generally recommended to turn off or modify NUMA scheduling.
numa=off
2. Modify the /etc/init.d/mysql or mysqld_safe script to set the NUMA scheduling mechanism when starting the mysqld process, such as numactl –interleave=all.
swappiness is a kernel parameter of Linux, used to control the strategy of swapping out physical memory. It allows a percentage value, the minimum is 0 , the maximum is 100, and the default value is 60. What is the impact of this setting value?
Set vm.swappiness to 0 to use swap as little as possible, and 100 to swap inactive memory pages into swap or release cache as much as possible. Inactive memory means memory that is mapped by the program but not used for a "long time". We can use vmstat to see how much inactive memory is in the system.
# vmstat -a 1
It is recommended to set this value to 1. The setting method is as follows. Add a line to the /etc/sysctl.conf file.
vm.swappiness = 1
This is a frequently modified parameter, and high-concurrency programs will modify it.
ulimit -n 51200
2. Modify the configuration file, Effective permanently.
在/etc/security/limits.conf配置文件中增加
* hardnofile 51200 * softnofile 51200
面向session的进程文件描述符的修改稍有不同,在云上的修改也略有差异,可以参见一样的“open too many files”
对于文件系统,如无特殊要求,最好采用ext4.
文件系统挂载参数是在/etc/fstab文件中修改,重启时候生效。
noatime表示不记录访问时间,nodiratime不记录目录的访问时间。
barrier=0,表示关闭barrier功能.
barrier的主要目的是为了保证磁盘写数据的安全性,但是会降低性能。如果有BBU之类的电池备份电源保证控制卡不瞬间掉电,那么这个功能就可以放心大胆的关闭。
my.cnf中的配置参数调优取决于业务,负载或硬件,在慢内存和快磁盘、高并发和写密集型负载情况下,都需要特殊的调整。
query_cache_size
query cache是一个众所周知的瓶颈,甚至在并发并不多时也如此。 最 好是一开始就停用,设置query_cache_size = 0,并利用其他方法加速查询:优化索引、增加拷贝分散负载或者启用额外的缓存(比如memcache或redis)。如果已经启用了query cache并且还没有发现任何问题,query cache可能有用。如果想停用它,那就得小心了。
innodb_buffer_pool_size
缓冲池是数据和索引缓存的地方:这个值越大越好,这能保证你在大多数的读取操作时使用的是内存而不是硬盘。典型的值是5-6GB(8GB内存),20-25GB(32GB内存),100-120GB(128GB内存)。
innodb_log_file_size
redo日志被用于确保写操作快速而可靠并且在崩溃时恢复。从MySQL 5.5之后,崩溃恢复的性能的到了很大提升,可以同时拥有较高的写入性能和崩溃恢复性能。在MySQL 5.6里可以被提高到4GB以上。如果应用程序需要频繁的写入数据,可以一开始就把它这是成4G。
max_connections
max_connection值被设高了(例如1000或更高)之后一个主要缺陷是当服务器运行1000个或更高的活动事务时会变的没有响应。在应用程序里使用连接池或者在MySQL里使用进程池有助于解决这一问题。
back_log
要求 mysql 能有的连接数量。当主要mysql线程在一个很短时间内得到非常多的连接请求,这就起作用,然后主线程花些时间检查连接并且启动一个新线程。back_log指明在mysql暂时停止回答新请求之前的短时间内多少个请求可以被存在堆栈中。只有如果期望在一个短时间内有很多连接,需要增加它,换句话说,该值对到来的tcp/ip连接的侦听队列的大小。
innodb_file_per_table
This setting tells InnoDB whether it needs to store the data and indexes of all tables in a shared table space (innodb_file_per_table = OFF) or place the data of each table in a separate .ibd file (innodb_file_per_table = ON). One file per table allows you to reclaim disk space when dropping, truncating, or rebuilding tables. This is also necessary for some advanced features, such as data compression. But it doesn't bring any performance gain. In MySQL 5.6, the default value of this property is ON.
innodb_flush_log_at_trx_commit
The default value is 1, indicating that InnoDB fully supports the ACID feature. This value is most appropriate when the concern is data security, such as on a master node. But for systems with slow disk (read and write) speeds, it will bring a huge overhead, because each time flushing changes to the redo log requires additional fsyncs. A value of 0 is faster, but some data may be lost in the event of a system crash, so one pass only applies to backup nodes.
innodb_flush_method
This configuration determines how data and logs are written to the hard disk. Generally speaking, if you have a hardware RAID controller and its independent cache uses a write-back mechanism and has battery power failure protection, then it should be set to O_DIRECT; otherwise, in most cases it should be set to fdatasync (default value). sysbench is a great tool to help you decide this option.
innodb_log_buffer_size
This configuration determines the cache allocated for transactions that have not yet been executed. But if the transaction contains binary large objects or large text fields, look at the Innodb_log_waits status variable. If it is not 0, increase innodb_log_buffer_size.
log_bin
If the database server serves as the backup node of the master node, then enabling the binary log is necessary. Even if you only have one server, this is useful if you want to do point-in-time data recovery. Once created, the binary log is saved permanently. If you don't want to run out of disk space, you can use PURGE BINARY LOGS to purge old files, or set expire_logs_days to specify how many days after which the logs will be automatically purged. Binary logging is not without overhead, so it is recommended to turn this option off if you do not need it on a replica node that is not the primary node.
interactive_timeout
The number of seconds the server waits for action on an interactive connection before closing it. An interactive client is defined as one using the client_interactive option to mysql_real_connect(). The default value is 28800, and it is recommended to change it to 7200.
table_open_cache
Every time MySQL opens a table, it will read some data into the table_open_cache cache. When MySQL cannot find the corresponding information in this cache, it will read it from the disk. Assuming that the system has 200 concurrent connections, you need to set this parameter to 200*N (N is the number of file descriptors required for each connection); when table_open_cache is set to a large value, if the system cannot handle so many file descriptors symbol, then the client will fail and the connection cannot be made.
max_allowed_packet
The accepted packet size; it is safe to increase the value of this variable because additional memory will only be allocated when needed. For example, MySQLd allocates more memory only if you issue a long query or MySQLd must return large result rows. The small default value for this variable is a precautionary measure to capture error packets between the client and server and to ensure that memory overflow is not caused by accidental use of large packets
skip_name_resolve
When the client connects to the database server, and when DNS is very slow, establishing a connection will also be very slow. It is therefore recommended to turn off the skip_name_resolve option when starting the server without doing a DNS lookup.
At the application layer, through the cooperation of pt tool and slow query log, you can easily identify the statements of full table scan.
Avoid full table scan
Build an index
Try to avoid returning large amounts of data to the client. If the amount of data is too large, you should consider whether the corresponding requirements are reasonable
Try to avoid large transaction operations and improve system concurrency capabilities
Before using the cursor-based method or the temporary table method, you should first look for a set-based solution to solve the problem. The set-based method is usually more effective. Try to avoid using cursors because they are less efficient.
About the conditions after where
Should be avoided as much as possible in the where clause Use the != or <> operator, otherwise the engine will give up using the index and perform a full table scan.
You should try to avoid using or in the where clause to connect conditions. You can consider using union instead
Be careful with in and not in. Use, for continuous values, if you can use between, do not use in, exists instead of in
Try to avoid expression operations and function operations on fields in the where clause
About data types
Try to use numeric fields. If fields that only contain numerical information try not to design them as character types. This will reduce the efficiency of queries and connections. performance and increased storage overhead.
Use varchar/nvarchar instead of char/nchar as much as possible, because the storage space of variable-length fields is small. For queries, the search efficiency in a relatively small field is obviously higher. some.
It is best not to leave NULL in the database. Use NOT NULL to fill the database as much as possible. Notes, descriptions, comments, etc. can be set to NULL. For others, it is best not to use NULL. .
Do not use select * from t anywhere, replace "*" with a specific field list, and do not return any unused fields.
About temporary tables
Avoid frequently creating and deleting temporary tables to reduce the consumption of system table resources. For one-time events, it is best to use export tables.
When creating a temporary table, if the amount of data inserted at one time is large, you can use select into instead of create table to avoid causing a large number of logs and improve the speed; if the amount of data is not large , in order to alleviate the resources of the system table, you should first create the table and then insert.
If temporary tables are used, when all temporary tables are explicitly deleted at the end, first truncate table and then drop table. This can avoid long-term locking of system tables.
About indexes
You should first consider creating indexes on the columns involved in where and order by.
When using an index field as a condition, if the index is a composite index, then the first field in the index must be used as the condition to ensure that the system uses the index, otherwise The index will not be used, and field order should be consistent with index order whenever possible.
The more indexes, the better. Although the index can improve the efficiency of the corresponding select, it also reduces the efficiency of insert and update, because insert or update may be rebuilt. index, so it depends on the case. It is best not to have more than 7 indexes on a table. If there are too many, you should consider whether it is necessary to build indexes on some columns that are not commonly used.
From the bottom to the application layer, and finally to the architecture layer, however, talking about architecture without business logic is to be a rogue. The database architecture also depends on the business system, and serving the business system stably and flexibly is the key. The direction of architecture tuning is:
Partition and table
Business sub-library
Main Slave synchronization and read-write separation
Data cache
Master-slave hot standby and HA dual-active
…..
The above is the content of MySQL performance tuning. For more related content, please pay attention to the PHP Chinese website (m.sbmmt.com)!