The most comprehensive sharing of MySQL usage specifications in history-Mysql Tutorial-php.cn

The most comprehensive sharing of MySQL usage specifications in history

[Related learning recommendations:mysql tutorial】

Recently, there have been many database-related operations involved, and the company’s existing specifications are not too comprehensive. According to the relevant standards of various experts on the Internet, I have compiled some standard usage for my own use. I hope you can correct me.

Database environment

dev: Development environment

Development can read, write, and modify tables structure. Developers can modify the table structure and the data in it at will but need to ensure that it does not affect other development colleagues.

test: The test environment

is readable and writable by developers, and developers can modify the table structure through tools.

online: Online environment

Developers are not allowed to directly perform database operations in the online environment. If operations are required, they must find the DBA to perform the operations and make corresponding records. This is prohibited. pressure test.

The key issue is that the user permissions corresponding to the MySQL server in each environment must be clearly divided, identifiable, and able to specifically distinguish business scenarios, etc.

Naming convention

Basic naming rules

Use meaningful English vocabulary , words are separated by underscores. (Do not use Pinyin)
can only use English letters, numbers, and underscores, and start with an English letter.
Use all lowercase letters for libraries, tables, and fields, and do not use camel case naming.
Avoid using ORACLE and MySQL reserved words, such as desc, and keywords such as index.
The name cannot exceed 32 characters. The meaning of the name must be understood. It is recommended to use nouns instead of verbs
Databases and data tables must use prefixes
Temporary database and table names must start with tmp is the prefix and date is the suffix
The backup library and table must be bak as the prefix and date is the suffix

Why are all libraries, tables, and fields used? lower case?

In MySQL, databases and tables correspond to directories and files under those directories. Therefore, the sensitivity of the operating system determines the case sensitivity of database and table naming.

It is not case sensitive under Windows.
Case rules under Linux
Database names and table names are strictly case-sensitive;
Table aliases are strictly case-sensitive;
Column names and column aliases are case-insensitive in all cases;
Variable names are also strictly case-sensitive;
How to solve the problem if camel case naming has been set ? You need to add lower_case_table_names = 1 to the MySQL configuration file my.ini.

Table naming

Tables in the same module should use the same prefix as much as possible, and the table names should express the meaning as much as possible. All log tables start with log_

Field naming

An English word or abbreviation that expresses its actual meaning. Fields with Boolean meaning are prefixed by is_, followed by the past participle of the verb.
Fields with the same meaning between tables should have the same name. Fields with the same meaning between tables are named with the table name_field name minus the module prefix.
Foreign key fields use table name_field name to indicate their association.
The primary key of a table is generally agreed to be id, an auto-increment type, and the foreign keys of other tables are expressed in the form of xxx_id.

Index naming

Non-unique index must be named according to "idx_field name_field name[_field name]"
The unique index must be named according to "uniq_field name_field name[_field name]"

Constraint naming

Primary key constraint: pk_table name.
Unique constraint: uk_table name_field name. (The application needs to have uniqueness checking logic at the same time.)

Table design specifications

The table engine depends on the actual application scenario; logs and reports It is recommended to use myisam for class tables, and it is recommended to use the innodb engine for tables related to transactions, audits, and amounts. If there is no explanation, the innodb engine will be used when creating the table

The default use is utf8mb4 character set, and the database sorting rule uses utf8mb4_general_ci. (Since the database definition uses the default, the data table can no longer be defined, but for the sake of insurance, it is recommended Write

Why the character set is not selected utf8, and the sorting rule is not utf8_general_ci

MySQL using utf8 encoding cannot save the placeholder is 4 Bytes of Emoji expressions. In order to make the back-end project fully support the Emoji expressions input by the client, upgrading the encoding to utf8mb4 is the best solution. For the JDBC connection string, the characterEncoding is set to utf8 or the above configuration still cannot be inserted normally. In the case of emoji data, you need to specify the connection character set as utf8mb4 in the code.

All tables and fields should use the comment column attribute to describe the true meaning of the table and field. If it is an enumeration value, it is recommended Define all the content used in this field.

If there is no explanation, the first id field in the table must be the primary key and grow automatically. Data transfer as a context as a condition within a non-transaction is prohibited. It is prohibited to use varchar type as primary key statement design.

If there is no description, the table must contain the create_time and modify_time fields, that is, the table must contain fields that record the creation time and modification time.

If there is no description, the table must contain is_del, which is used to indicate whether the data has been deleted. In principle, physical deletion of database data is not allowed.

Use as little storage space as possible to store data in a field
If you can use int, don’t use char or varchar
If you can use tinyint, don’t use it int
Use UNSIGNED to store non-negative values.
It is not recommended to use ENUM and SET types, use TINYINT instead
Use short data types, for example, when the value range is 0-80, use TINYINT UNSIGNED
Storage accuracy Floating point numbers must use DECIMAL instead of FLOAT and DOUBLE
Time fields, except for special circumstances, use int to record unix_timestamp
Use the YEAR type to store the year.
Use the DATE type to store dates.
It is recommended to use the TIMESTAMP type to store time (accurate to seconds), because TIMESTAMP uses 4 bytes and DATETIME uses 8 bytes.
It is recommended to use INT UNSIGNED to store IPV4.
Do not use TEXT and BLOB types as much as possible
It is prohibited to use VARBINARY and BLOB to store pictures, files, etc. in the database. It is recommended to use other storage methods (TFS/SFS). MySQL only saves pointer information.
The size of a single record is prohibited from exceeding 8k (column length (Chinese)_3 (UTF8) column length (English)_1)

What is the difference between datetime and timestamp ?

Similar points:

The display format of the TIMESTAMP column is the same as that of the DATETIME column. The display width is fixed at 19 characters, and the format is YYYY-MM-DD HH:MM:SS.

Difference:

TIMESTAMP

4 bytes storage, time range: 1970-01-01 08:00:01 ~ 2038-01-19 11:14:07 The value is saved in UTC format, which involves time zone conversion. The current time zone is converted when storing, and then converted back to the current time zone when retrieving.
datetime is stored in 8 bytes, time range: 1000-01-01 00:00:00 ~ 9999-12-31 23:59:59
Actual format storage, independent of time zone

How to use TIMESTAMP's automatic assignment attribute?

Set the current time as the default value of ts: ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP. When the row is updated, update the value of ts: ts TIMESTAMP DEFAULT 0 ON UPDATE CURRENT_TIMESTAMP.

You can combine 1 and 2: ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP.

How to use INT UNSIGNED to store ip?

Use INT UNSIGNED instead of char(15) to store the ipv4 address, and convert it through the MySQL functions inet_ntoa and inet_aton. IPv6 addresses currently do not have a conversion function and need to be stored using DECIMAL or two bigINTs.

If there is no comment, all fields are set to NOT NULL and default values are set;
It is forbidden to store plain text passwords in the database
If there is no comment, all fields Boolean fields, such as is_hot and is_deleted, must set a default value and set it to 0;
If no remarks are made, the sorting field order_id will be arranged in descending order by default in the program;
Plastic definition No length is added, for example, INT is used instead of INT[4]

INT[M]. What does the M value mean?

Note that the number after the numerical type brackets only indicates the width and has nothing to do with the storage range. Many people think that the value ranges of INT(4) and INT(10) are (-9999 to 9999) and (-9999999999 to 9999999999) respectively. This understanding is wrong. In fact, when the M value in the integer is used in combination with the ZEROFILL attribute, the column values can be equal in width. No matter what the value of M in INT[M] is, its value range is still (-2147483648 to 2147483647 when signed), (0 to 4294967295 when unsigned).

Display width does not limit the range of values that can be saved within the column, nor does it limit the display of values that exceed the specified width of the column. When used in conjunction with the optional extended attribute ZEROFILL, the default supplementary spaces are replaced with zeros. For example: For a column declared as INT(5) ZEROFILL, the value 4 is retrieved as 00004. Please note that if you store a value in an integer column that exceeds the display width, MySQL will encounter problems when generating temporary tables for complex joins, because in these cases MySQL believes that the data will fit in the original column width. If you specify ZEROFILL for a numeric column, MySQL Automatically add the UNSIGNED attribute to the column.

Use VARBINARY to store case-sensitive variable-length strings

When to use CHAR and when to use VARCHAR?

CHAR and VARCHAR types are similar, but they are saved and retrieved differently. They also differ in terms of their maximum length and whether trailing spaces are preserved. The length declared for the CHAR and VARCHAR types represents the maximum number of characters you want to save. For example, CHAR(30) can occupy 30 characters.

The length of the CHAR column is fixed to the length declared when creating the table. The length can be any value from 0 to 255. When saving CHAR values, pad them to the right with spaces to the specified length. When a CHAR value is retrieved, trailing spaces are removed. No case conversion is performed during storage or retrieval.

The values in the VARCHAR column are variable-length strings. The length can be specified as a value between 0 and 65,535. (The maximum effective length of VARCHAR is determined by the maximum row size and the character set used. The overall maximum length is 65,532 bytes). Compared with CHAR, the VARCHAR value only saves the required number of characters, plus one byte to record the length (if the declared length of the column exceeds 255, two bytes are used). VARCHAR values are saved without padding. Trailing spaces are retained when the value is saved and retrieved, conforming to standard SQL.

char is suitable for storing the MD5 hash value of the user password, and its length is always the same. char is also better than varchar for values that change frequently, because fixed-length rows are less prone to fragmentation, and char is also more efficient than varchar for very short columns. The char(1) string will only occupy one byte for a single-byte character set, but the varchar(1) string will occupy 2 bytes, because 1 byte is used to store the length information.

Index design specifications

The query speed of MySQL relies on good index design, so indexes are crucial for high performance. Reasonable indexes will speed up queries (including UPDATE and DELETE speeds. MySQL will load the page containing the row into memory and then perform UPDATE or DELETE operations), while unreasonable indexes will slow down the speed. MySQL index search is similar to the pinyin and radical search in Xinhua Dictionary. When the pinyin and radical index does not exist, it can only be searched by turning page by page. When MySQL queries cannot use indexes, MySQL will perform a full table scan, which will consume a lot of IO. Purpose of index: deduplication, speed up positioning, avoid sorting, overwrite index.

What is a covering index

In the InnoDB storage engine, the secondary index (non-primary key index) does not directly store row addresses and stores primary key values. If the user needs to query a data column that is not included in the secondary index, he or she needs to first find the primary key value through the secondary index, and then query the other data columns through the primary key, so it needs to be queried twice. The concept of covering index is that the query can be completed in an index, and the efficiency of covering index will be relatively high. The primary key query is a natural covering index. Reasonable creation of indexes and reasonable use of query statements can improve performance when using covering indexes. For example, SELECT email,uid FROM user_email WHERE uid=xx, if uid is not the primary key, you can add the index as index(uid,email) when appropriate to improve performance.

Basic specifications for indexes

Index number control, the number of indexes in a single table shall not exceed 5, and the number of fields in a single index shall not exceed 5.
Comprehensive evaluation of data density and distribution
Consider the query and update ratio

Why can't there be too many indexes in a table?

InnoDB’s secondary index uses b tree for storage, so b tree needs to be adjusted during UPDATE, DELETE, and INSERT. Excessive indexes will slow down the update speed.

Use prefix index for strings. The length of prefix index should not exceed 8 characters. It is recommended to give priority to prefix index. If necessary, you can add pseudo columns and create indexes.

Do not index blob/text and other fields, do not index large fields, this will make the index take up too much storage space

What is a prefix index?

To put it bluntly, a prefix index indexes the first few characters of the text (specifically, the number of characters specified when creating the index), so that the index created is smaller, so the query is faster. Prefix index can effectively reduce the size of index files and improve indexing speed. But prefix indexes also have their disadvantages: MySQL cannot use prefix indexes in ORDER BY or GROUP BY, nor can they be used as covering indexes.

Syntax for establishing a prefix index: ALTER TABLE table_name ADD KEY(column_name(prefix_length));

Primary key guidelines

The table must have Primary key
Do not use frequently updated columns
Try not to select string columns
Do not use UUID MD5 HASH
Use non-empty unique keys by default
It is recommended to choose auto-increment or issuer

Important SQL must be indexed, core SQL gives priority to covering the index

UPDATE, WHERE condition columns of DELETE statement
ORDER BY, GROUP BY, DISTINCT fields
Multi-table JOIN fields

The greatest differentiation Put the fields at the front

Choose fields with better filtering properties and put them at the front, such as order number, userid, etc. It is generally not recommended to put the fields with filtering properties such as type and status at the front
The index is based on the left prefix principle. When a joint index (a,b,c) is established, the query condition can only be used when it contains (a) or (a,b) or (a,b,c). When using the index, (a, c) can only use the a column index when it is used as a condition, so at this time, it is necessary to make sure that the return columns of a must not be too many, otherwise the statement design will be unreasonable, and (b, c) cannot use the index.
Create a joint index reasonably (to avoid redundancy), (a,b,c) is equivalent to (a), (a,b), (a,b,c)

INDEX TABOO

Do not create indexes on low cardinality columns, such as "gender"
Do not perform mathematical operations and functional operations on indexed columns
Do not index commonly used small tables
Try not to use foreign keys
Foreign keys are used to protect referential integrity and can be implemented on the business end
The operations on the parent table and the child table will affect each other and reduce availability
INNODB's own limitations on online DDL

Limitations on indexes in MYSQL

The sum of the MYISAM storage engine index lengths cannot exceed 1000 bytes
BLOB and TEXT type columns can only create prefix indexes
MYSQL currently does not support function indexes
When using not equal to (!= or <>), MYSQL cannot use the index.
After filtering fields using function operations (such as abs (column)), MYSQL cannot use indexes.
MYSQL cannot use the index when the join condition field types in the join statement are inconsistent
When using the LIKE operation, if the condition starts with a wildcard (such as ‘�c…’), MYSQL cannot use the index.
When using non-equivalent queries, MYSQL cannot use Hash indexes.

Statement design specifications

Use precompiled statements

Only pass parameters, which is more efficient than passing SQL statements
Parse once, use multiple times
Reduce the probability of SQL injection

Avoid implicit conversion

Will cause index failure

Make full use of prefix index

Must be the leftmost prefix
It is impossible to use two range conditions at the same time
Do not use % leading queries, such as like “�”

Do not use negative queries, such as not in/like

Unable to use index, resulting in full table scan
Full table scan leads to reduced buffer pool utilization

Avoid using stored procedures and triggers , UDF, events, etc.

Let the database do what it does best
Reduce business coupling and leave room for sacle out and sharding
Avoid Open BUG

Avoid using JOIN of large tables

What MySQL is best at is the primary key/secondary index query of a single table
JOIN Consume more memory and generate temporary tables

Avoid mathematical operations in the database

MySQL is not good at mathematical operations and logical judgment
Unable to use index

Reduce the number of interactions with the database

INSERT … ON DUPLICATE KEY UPDATE
REPLACE INTO , INSERT IGNORE , INSERT INTO VALUES(),(),()
UPDATE … WHERE ID IN(10,20,50,…)

Reasonable use Paging

Limit the number of pages displayed by paging. You can only click on the previous page and next page using delayed association

How to use paging correctly?

If there is a paging statement similar to the following: SELECT * FROM table ORDER BY id LIMIT 10000, 10 Because the way LIMIT OFFSET is processed in MySQL is to take out all the data of OFFSET LIMIT, then remove OFFSET, and return to the bottom LIMIT. Therefore, when the OFFSET value is large, MySQL's query performance will be very low. It can be solved by using id > n:

The method of using id > n has limitations. For the problem of discontinuous ids, it can be solved by passing in the last id at the same time when turning the page. .

http://example.com/page.php?last=100 select * from table where id<100 order by id desc limit 10 //上一页 http://example.com/page.php?first=110 select * from table where id>110 order by id desc limit 10

Copy after login

The biggest disadvantage of this method is that if there is an insertion/deletion operation during browsing, the page turning will not be updated, and the total number of pages may still be calculated based on the new count(*), and finally Some records may be inaccessible. In order to fix this problem, you can continue to introduce the current page number and whether there are insertion/deletion operations that affect the total number of records since the last page turn and cache them

select * from table where id >= (select id from table order by id limit #offset#, 1)

Copy after login

Reject large SQL and split it into small ones SQL
Make full use of QUERY CACHE
Make full use of multi-core CPU
Use in instead of or, the value of in should not exceed 1000
It is forbidden to use order by rand ()
Use EXPLAIN diagnostics to avoid generating temporary tables

The EXPLAIN statement (executed in the MySQL client) can obtain information about how MySQL executes the SELECT statement. By executing EXPLAIN on the SELECT statement, you can know whether MySQL uses indexes, full table scans, temporary tables, sorting and other information when executing the SELECT statement. Try to avoid MySQL from performing full table scans, using temporary tables, sorting, etc. See the official documentation for details.

Use union all instead of union

What is the difference between union all and union?

The union and union all keywords both merge two result sets into one, but they are different in terms of usage and efficiency.

Union will filter out duplicate records after table linking, so after table linking, it will sort the result set generated, delete duplicate records, and then return the results. For example:

select * from test_union1 union select * from test_union2

Copy after login

This SQL first takes out the results of the two tables when running, then uses the sorting space to sort and delete duplicate records, and finally returns the result set. If the table data is large, it may cause the disk to be used for processing. Sort.

And union all simply combines the two results and returns them. In this way, if there is duplicate data in the two result sets returned, the returned result set will contain duplicate data.

从效率上说，union all要比union快很多，所以，如果可以确认合并的两个结果集中不包含重复的数据的话，那么就使用union all，如下：

select * from test_union1 union all select * from test_union2

Copy after login

程序应有捕获SQL异常的处理机制
禁止单条SQL语句同时更新多个表
不使用select * ，SELECT语句只获取需要的字段
消耗CPU和IO、消耗网络带宽
无法使用覆盖索引
减少表结构变更带来的影响
因为大，select/join 可能生成临时表
UPDATE、DELETE语句不使用LIMIT
INSERT语句必须显式的指明字段名称，不使用INSERT INTO table()
INSERT语句使用batch提交（INSERT INTO table VALUES(),(),()……），values的个数不超过500
统计表中记录数时使用COUNT(*)，而不是COUNT(primary_key)和COUNT(1) 备注：仅针对Myisam
数据更新建议使用二级索引先查询出主键，再根据主键进行数据更新
禁止使用跨库查询
禁止使用子查询，建议将子查询转换成关联查询
针对varchar类型字段的程序处理，请验证用户输入，不要超出其预设的长度；

分表规范

单表一到两年内数据量超过500w或数据容量超过10G考虑分表，需提前考虑历史数据迁移或应用自行删除历史数据，采用等量均衡分表或根据业务规则分表均可。要分表的数据表必须与DBA商量分表策略

用HASH进行散表，表名后缀使用十进制数，下标从0开始
按日期时间分表需符合YYYY[MM][dd][HH]格式
采用合适的分库分表策略。例如千库十表、十库百表等
禁止使用分区表，分区表对分区键有严格要，分区表在表变大后执行DDL、SHARDING、单表恢复等都变得更加困难。
拆分大字段和访问频率低的字段，分离冷热数据

行为规范

批量导入、导出数据必须提前通知DBA协助观察
禁止在线上从库执行后台管理和统计类查询
禁止有super权限的应用程序账号存在
产品出现非数据库导致的故障时及时通知DBA协助排查
推广活动或上线新功能必须提前通知DBA进行流量评估
数据库数据丢失，及时联系DBA进行恢复
对单表的多次alter操作必须合并为一次操作
不在MySQL数据库中存放业务逻辑
重大项目的数据库方案选型和设计必须提前通知DBA参与
对特别重要的库表，提前与DBA沟通确定维护和备份优先级
不在业务高峰期批量更新、查询数据库其他规范
提交线上建表改表需求，必须详细注明所有相关SQL语句

其他规范

日志类数据不建议存储在MySQL上，优先考虑Hbase或OceanBase，如需要存储请找DBA评估使用压缩表存储。

相关图文教程：mysql数据库图文教程

The above is the detailed content of The most comprehensive sharing of MySQL usage specifications in history. For more information, please follow other related articles on the PHP Chinese website!