
How to optimize SQL queries? (detailed explanation)

青灯夜游
2019-11-30 17:54:00

Why should we optimize

The throughput bottleneck of a system often lies in database access speed. As the application runs, more and more data accumulates in the database, and processing time grows accordingly. Moreover, the data is stored on disk, whose read and write speed cannot compare with that of memory.

How to optimize

1. At database design time: the design of tables and fields, and the choice of storage engine

2. Make good use of the features MySQL itself provides, such as indexes, and optimize how statements are written

3. MySQL clustering, splitting databases and tables (sharding), and read-write separation

There is already plenty of material online about optimizing SQL statements, so this article sets that aside and focuses on optimization at the DAO layer and in database design, illustrated with two simple examples.

Example 1: ERP query optimization

Current situation analysis:

1. Indexes on the join columns are missing
2. MySQL itself has limited performance when joining many tables. The current performance problems are concentrated in list queries, which join many tables

Countermeasures:

1. Add the necessary indexes: inspect the execution plan with EXPLAIN and add indexes accordingly;
2. First query the primary keys of the main business table to obtain a smaller result set, then use that result set for the join query:
1) First query the primary keys of the business data using only the main table and its conditions
2) Use those primary keys as the query condition, then join the other related tables to fetch the required business fields
3) When querying the main table, if a query condition involves another table, join that table only when the condition is actually set
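
Countermeasure 1 can be sketched roughly as follows. This is a minimal illustration, not the original project's code: it uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere, the columns XX and VV follow the example queries in this article, and the index name idx_tt_a_xx_vv is hypothetical. In MySQL the statement would be prefixed with EXPLAIN rather than SQLite's EXPLAIN QUERY PLAN, and the output format differs.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tt_a (id INTEGER PRIMARY KEY, xx TEXT, vv TEXT)")
conn.executemany(
    "INSERT INTO tt_a (xx, vv) VALUES (?, ?)",
    [(f"x{i % 50}", f"v{i % 7}") for i in range(1000)],
)


def plan(sql, params=()):
    """Return the plan description of each step the planner reports."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]  # the 4th column is the description


query = "SELECT id FROM tt_a WHERE xx = ? AND vv = ?"

# Without an index the planner has to scan the whole table.
plan_before = plan(query, ("x1", "v1"))

# Hypothetical composite index on the two filter columns.
conn.execute("CREATE INDEX idx_tt_a_xx_vv ON tt_a (xx, vv)")

# With the index in place the planner can search it directly.
plan_after = plan(query, ("x1", "v1"))

print(plan_before)
print(plan_after)
```

Comparing the two plans before and after each index change is the point of the exercise: an index is only worth keeping if the plan actually starts using it.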

For example, suppose we have the following tables: TT_A, TT_B, TT_C, TT_D

Suppose the SQL before optimization looks like this:

SELECT
    A.ID,
    ....
    B.NAME,
    .....
    C.AGE,
    ....
    D.SEX
    .....

FROM  TT_A A
LEFT JOIN TT_B B ON A.ID  = B.ITEM_ID
LEFT JOIN TT_C C ON B.ID = C.ITEM_ID
LEFT JOIN TT_D D ON C.ID = D.ITEM_ID
WHERE 1=1 AND A.XX = ? AND A.VV = ? .....

Then the optimized SQL is:

Step 1

SELECT
    A.ID

FROM  TT_A A
WHERE 1=1 AND A.XX = ? AND A.VV = ?

Step 2

SELECT
    A.ID,
    ....
    B.NAME,
    .....
    C.AGE,
    ....
    D.SEX
    .....
FROM  ( SELECT A.ID,..... FROM  TT_A A WHERE A.ID IN (1,2,3..)  ) A
LEFT JOIN TT_B B ON A.ID  = B.ITEM_ID
LEFT JOIN TT_C C ON B.ID = C.ITEM_ID
LEFT JOIN TT_D D ON C.ID = D.ITEM_ID
WHERE 1=1 AND A.XX = ? AND A.VV = ?
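
At the DAO layer, the two steps might look something like the sketch below. It is an illustration under assumptions rather than the original implementation: SQLite (via Python's sqlite3 module) stands in for MySQL, TT_D is left out for brevity, the sample rows and the function names find_ids and load_rows are hypothetical, and the optional b_name filter shows how a related table can be joined in step 1 only when a condition on it is actually set.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; table layout follows the article.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tt_a (id INTEGER PRIMARY KEY, xx TEXT, vv TEXT);
CREATE TABLE tt_b (id INTEGER PRIMARY KEY, item_id INTEGER, name TEXT);
CREATE TABLE tt_c (id INTEGER PRIMARY KEY, item_id INTEGER, age INTEGER);
CREATE INDEX idx_tt_b_item ON tt_b (item_id);
CREATE INDEX idx_tt_c_item ON tt_c (item_id);
INSERT INTO tt_a VALUES (1, 'x', 'v'), (2, 'x', 'w'), (3, 'y', 'v');
INSERT INTO tt_b VALUES (10, 1, 'alice'), (20, 2, 'bob'), (30, 3, 'carol');
INSERT INTO tt_c VALUES (100, 10, 31), (200, 20, 42);
""")


def find_ids(xx, b_name=None):
    """Step 1: fetch only primary keys from the main table.

    TT_B is joined only when a filter on it was actually supplied.
    """
    sql = "SELECT DISTINCT a.id FROM tt_a a"
    where, params = ["a.xx = ?"], [xx]
    if b_name is not None:
        sql += " JOIN tt_b b ON a.id = b.item_id"
        where.append("b.name = ?")
        params.append(b_name)
    sql += " WHERE " + " AND ".join(where) + " ORDER BY a.id"
    return [row[0] for row in conn.execute(sql, params)]


def load_rows(ids):
    """Step 2: join the related tables for the small ID set only."""
    if not ids:
        return []
    placeholders = ", ".join("?" for _ in ids)
    sql = f"""
        SELECT a.id, b.name, c.age
        FROM tt_a a
        LEFT JOIN tt_b b ON a.id = b.item_id
        LEFT JOIN tt_c c ON b.id = c.item_id
        WHERE a.id IN ({placeholders})
        ORDER BY a.id
    """
    return conn.execute(sql, ids).fetchall()


ids = find_ids("x")     # primary keys matching the main-table condition
rows = load_rows(ids)   # business fields for those keys only
print(ids, rows)
```

Only the placeholders are interpolated into the SQL text; the ID values themselves are still bound as parameters.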

Summary:

This kind of optimization suits list queries, because the conditions of a list query generally apply to the main table. Index the key fields used in those conditions, and let the query conditions sharply reduce the number of main-table rows. Joining the other tables afterwards is then much faster.

Example 2: Article search optimization

Suppose you want to build an article search feature for Tieba. The simplest and most direct storage structure is a relational database table, TT_ARTICLES, for storing articles.

Then, if the search keyword is "target", we can use string matching against the CONTENT column:

select * from ARTICLES where CONTENT like '%target%';

This easily implements the search function. However, the approach has an obvious problem: a pattern with a leading % cannot use an ordinary index on the column, so the query has to traverse the entire table (a full table scan). With a few dozen articles this is not a problem, but with hundreds of thousands or millions of articles it becomes impractical. A single table may also struggle to hold that much data, and even if it can, every search scans it again from start to finish, at a very high time cost.
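
The full table scan can be observed in a query plan. The following is a small sketch under assumptions: SQLite (through Python's sqlite3 module) stands in for MySQL, and the title column, the index name idx_articles_content, and the sample rows are made up for the illustration. Even though CONTENT is indexed, the leading-wildcard query is planned as a scan, while an exact match on the same column uses the index.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the sample data is invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, content TEXT)"
)
conn.execute("CREATE INDEX idx_articles_content ON articles (content)")
conn.executemany(
    "INSERT INTO articles (title, content) VALUES (?, ?)",
    [
        (f"post {i}", f"body {i} target" if i % 10 == 0 else f"body {i}")
        for i in range(200)
    ],
)


def plan(sql, params=()):
    """Return the plan description of each step the planner reports."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]


# Leading wildcard: the index on content cannot narrow the search.
wildcard_plan = plan(
    "SELECT * FROM articles WHERE content LIKE ?", ("%target%",)
)

# Exact match on the same column: the index is usable.
exact_plan = plan(
    "SELECT * FROM articles WHERE content = ?", ("body 10 target",)
)

hits = conn.execute(
    "SELECT COUNT(*) FROM articles WHERE content LIKE ?", ("%target%",)
).fetchone()[0]

print(wildcard_plan)
print(exact_plan)
print(hits)
```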

So we have to introduce "inverted index" technology. In the scenario described above, it can be explained in two parts. The ARTICLES table above still exists, but a keyword table KEYWORDS is now added, and its KEYWORD column is indexed, so the record for a given keyword can be found quickly.

Of course, we also need a relationship table linking the KEYWORDS table and the ARTICLES table, with KEYWORD_ID and ARTICLE_ID as a composite primary key.

As you can see, this is actually a many-to-many relationship: the same keyword can appear in multiple articles, and one article can contain multiple different keywords. We can therefore first find the corresponding KEYWORD_ID in the KEYWORDS table using the indexed keyword, then find the ARTICLE_ID values through the relationship table, and finally use them to find the corresponding articles in the ARTICLES table.
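
The three-table layout and the lookup chain could be sketched as below. Several things here are assumptions made for illustration: SQLite (via Python's sqlite3 module) stands in for MySQL, the relationship table is named keyword_article, the helper functions add_article and search are hypothetical, and keywords are extracted with a naive whitespace split, which would need to be replaced by a proper word segmenter for real text.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; names are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE articles (id INTEGER PRIMARY KEY, content TEXT);
-- UNIQUE gives the keyword column its index
CREATE TABLE keywords (id INTEGER PRIMARY KEY, keyword TEXT UNIQUE);
CREATE TABLE keyword_article (
    keyword_id INTEGER REFERENCES keywords (id),
    article_id INTEGER REFERENCES articles (id),
    PRIMARY KEY (keyword_id, article_id)  -- composite primary key
);
""")


def add_article(content):
    """Store an article and register each of its words as a keyword."""
    article_id = conn.execute(
        "INSERT INTO articles (content) VALUES (?)", (content,)
    ).lastrowid
    # Naive tokenizer (assumption): lowercase and split on whitespace.
    for word in set(content.lower().split()):
        conn.execute(
            "INSERT OR IGNORE INTO keywords (keyword) VALUES (?)", (word,)
        )
        keyword_id = conn.execute(
            "SELECT id FROM keywords WHERE keyword = ?", (word,)
        ).fetchone()[0]
        conn.execute(
            "INSERT INTO keyword_article VALUES (?, ?)",
            (keyword_id, article_id),
        )
    return article_id


def search(keyword):
    """Three indexed lookups: keyword, then article IDs, then articles."""
    row = conn.execute(
        "SELECT id FROM keywords WHERE keyword = ?", (keyword.lower(),)
    ).fetchone()
    if row is None:
        return []
    article_ids = [
        r[0]
        for r in conn.execute(
            "SELECT article_id FROM keyword_article"
            " WHERE keyword_id = ? ORDER BY article_id",
            (row[0],),
        )
    ]
    placeholders = ", ".join("?" for _ in article_ids)
    return conn.execute(
        f"SELECT id, content FROM articles"
        f" WHERE id IN ({placeholders}) ORDER BY id",
        article_ids,
    ).fetchall()


add_article("set a clear target first")
add_article("the target moved again")
add_article("nothing relevant here")

results = search("target")
print(results)
```

The cost of this design is paid at write time: every new article has to be split into keywords and registered in the relationship table before it becomes searchable.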

Summary:

This looks like three lookups, but because each one uses an index, the full table scan is eliminated, so it is not slow, and in SQL the whole process can be expressed as a single statement. When the amount of data is small, the above method is good enough. It solves the performance problems caused by full table scans and % string matching.
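
Folding the three lookups into one statement might look like the following sketch. It assumes the KEYWORDS and ARTICLES tables described earlier plus a relationship table that is called keyword_article here (a hypothetical name), and it again uses SQLite through Python's sqlite3 module in place of MySQL, with invented sample rows.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the rows are sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE articles (id INTEGER PRIMARY KEY, content TEXT);
CREATE TABLE keywords (id INTEGER PRIMARY KEY, keyword TEXT UNIQUE);
CREATE TABLE keyword_article (
    keyword_id INTEGER,
    article_id INTEGER,
    PRIMARY KEY (keyword_id, article_id)
);
INSERT INTO articles VALUES
    (1, 'set a clear target first'), (2, 'nothing relevant here');
INSERT INTO keywords VALUES (1, 'target'), (2, 'nothing');
INSERT INTO keyword_article VALUES (1, 1), (2, 2);
""")

# One statement: keyword -> relationship rows -> articles, via two joins.
SEARCH_SQL = """
    SELECT a.id, a.content
    FROM keywords k
    JOIN keyword_article ka ON ka.keyword_id = k.id
    JOIN articles a ON a.id = ka.article_id
    WHERE k.keyword = ?
    ORDER BY a.id
"""


def search(keyword):
    """Return (id, content) for every article tagged with the keyword."""
    return conn.execute(SEARCH_SQL, (keyword,)).fetchall()


matches = search("target")
print(matches)
```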

Summary:

In a technical interview, if you can give practical examples, or talk directly about the problems you met and the lessons you learned during development, your interview score will go up considerably. Your answers should also be well organized. Don't jump from one point to another, which can easily confuse you. For example, when asked how to optimize SQL, you should not simply answer "add indexes". You can answer like this:

Hello interviewer. First of all, the data volume in our project's database hit a bottleneck, which made list queries very slow and gave users a poor experience. There are many ways to solve this problem, such as basic database table design, basic SQL optimization, MySQL clustering, read-write separation, splitting databases and tables, adding a cache layer to the architecture, and so on. Their advantages and disadvantages are... We weighed these against the characteristics of our project and finally decided which to choose in our technology selection.

If you answer in such an orderly and well-reasoned way, and also cover many knowledge points beyond the question itself, the interviewer will see that you are not just someone who can write code, but someone with clear logic and your own understanding of, and thinking about, technology selection.

Statement:
This article is reproduced from cnblogs.com. If there is any infringement, please contact admin@php.cn to have it removed.