Table of Contents
Usage of distinct
For example:
For basic deduplication, the use of
Syntax:
The syntax difference between the two is that
In most examples,
For implicit sorting, we can refer to MySQL’s official explanation:
Home Database Mysql Tutorial How to use distinct and group by in MySQL

How to use distinct and group by in MySQL

May 26, 2023 am 10:34 AM
mysql distinct group by

    Let’s talk about the general conclusion first:

    • When the semantics are the same and there is an index: group by## Both # and distinct can use indexes with the same efficiency.

    • With the same semantics and no index:

      distinct is more efficient than group by. The reason is that both distinct and group by will perform grouping operations, but group by may perform sorting and trigger filesort, resulting in low SQL execution efficiency.

    Based on this conclusion, you may ask:

    • Why

      group by# when the semantics are the same and there is an index? ## and distinct have the same efficiency?

    • Under what circumstances will
    • group by

      perform a sorting operation?

    • Find answers to these two questions. Next, let’s take a look at the basic use of
    distinct

    and group by. Usage of distinct

    Usage of distinct

    SELECT DISTINCT columns FROM table_name WHERE where_conditions;

    For example:

    mysql> select distinct age from student;
    +------+
    | age  |
    +------+
    |   10 |
    |   12 |
    |   11 |
    | NULL |
    +------+
    4 rows in set (0.01 sec)

    DISTINCT

    The keyword is used to return uniquely different values. It is used before the first field in the query statement and applies to all columns in the main clause. If a column has a NULL value and you use the

    DISTINCT

    clause on the column, MySQL will retain one NULL value and delete the other NULL values ​​because the DISTINCT clause statement treats all NULL values ​​as the same value. distinct Multi-column deduplication

    distinct

    Multi-column deduplication is performed based on the specified deduplication column information, that is, only all specified column information If they are all the same, it will be considered as duplicate information. <pre class='brush:php;toolbar:false;'>SELECT DISTINCT column1,column2 FROM table_name WHERE where_conditions; mysql&gt; select distinct sex,age from student; +--------+------+ | sex | age | +--------+------+ | male | 10 | | female | 12 | | male | 11 | | male | NULL | | female | 11 | +--------+------+ 5 rows in set (0.02 sec)</pre>Usage of group by

    For basic deduplication, the use of

    group by

    is similar to distinct. Single column deduplication

    Syntax:

    SELECT columns FROM table_name WHERE where_conditions GROUP BY columns;

    Execution:

    mysql> select age from student group by age;
    +------+
    | age  |
    +------+
    |   10 |
    |   12 |
    |   11 |
    | NULL |
    +------+
    4 rows in set (0.02 sec)

    Multiple column deduplication

    Syntax:

    SELECT columns FROM table_name WHERE where_conditions GROUP BY columns;

    Execution:

    mysql> select sex,age from student group by sex,age;
    +--------+------+
    | sex    | age  |
    +--------+------+
    | male   |   10 |
    | female |   12 |
    | male   |   11 |
    | male   | NULL |
    | female |   11 |
    +--------+------+
    5 rows in set (0.03 sec)

    Difference example

    The syntax difference between the two is that

    group by

    can perform single-column deduplication, and the principle of group by The results are grouped and sorted first, and then the first piece of data in each group is returned. And deduplication is performed based on the fields following group by. For example:

    mysql> select sex,age from student group by sex;
    +--------+-----+
    | sex    | age |
    +--------+-----+
    | male   |  10 |
    | female |  12 |
    +--------+-----+
    2 rows in set (0.03 sec)

    distinct and group by principle

    In most examples,

    DISTINCT

    can be regarded as a special GROUP BY, their implementation is based on grouping operations, and they can all be implemented through loose index scan and compact index scan (the content of index scan will be introduced in detail in other articles, so I will not introduce it in detail here).

    DISTINCT

    and GROUP BY can both be scanned and searched using indexes. For example, the following two SQLs (just look at the content of the extra at the end of the table), we analyze these two SQLs, we can see that in the extra, these two SQLs use compact index scanningUsing index for group -by. So, in general, for

    DISTINCT

    and GROUP BY statements with the same semantics, we can use the same index optimization methods to optimize them. <pre class='brush:php;toolbar:false;'>mysql&gt; explain select int1_index from test_distinct_groupby group by int1_index; +----+-------------+-----------------------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-----------------------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+ | 1 | SIMPLE | test_distinct_groupby | NULL | range | index_1 | index_1 | 5 | NULL | 955 | 100.00 | Using index for group-by | +----+-------------+-----------------------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+ 1 row in set (0.05 sec) mysql&gt; explain select distinct int1_index from test_distinct_groupby; +----+-------------+-----------------------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-----------------------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+ | 1 | SIMPLE | test_distinct_groupby | NULL | range | index_1 | index_1 | 5 | NULL | 955 | 100.00 | Using index for group-by | +----+-------------+-----------------------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+ 1 row in set (0.05 sec)</pre>But for

    GROUP BY

    , before MYSQL8.0, GROUP Y will be implicitly sorted by fields by default. As you can see, the following sql statement uses a temporary table and also performs filesort.

    mysql> explain select int6_bigger_random from test_distinct_groupby GROUP BY int6_bigger_random;
    +----+-------------+-----------------------+------------+------+---------------+------+---------+------+-------+----------+---------------------------------+
    | id | select_type | table                 | partitions | type | possible_keys | key  | key_len | ref  | rows  | filtered | Extra                           |
    +----+-------------+-----------------------+------------+------+---------------+------+---------+------+-------+----------+---------------------------------+
    |  1 | SIMPLE      | test_distinct_groupby | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 97402 |   100.00 | Using temporary; Using filesort |
    +----+-------------+-----------------------+------------+------+---------------+------+---------+------+-------+----------+---------------------------------+
    1 row in set (0.04 sec)

    Implicit sorting

    For implicit sorting, we can refer to MySQL’s official explanation:

    https://dev.mysql.com/doc/refman/5.7 /en/order-by-optimization.html

    GROUP BY implicitly sorts by default (that is, in the absence of ASC or DESC designators for GROUP BY columns). However, relying on implicit GROUP BY sorting (that is, sorting in the absence of ASC or DESC designators) or explicit sorting for GROUP BY (that is, by using explicit ASC or DESC designators for GROUP BY columns) is deprecated. To produce a given sort order, provide an ORDER BY clause.

    Broad explanation:

    GROUP BY defaults to implicit sorting (meaning that sorting will also be performed even if the GROUP BY column does not have an ASC or DESC indicator). However, GROUP BY for explicit or implicit sorting is deprecated. To generate a given sort order, provide an ORDER BY clause.

    So, before MySQL8.0,

    GROUP BY

    will sort the results according to the effect field (the subsequent field of GROUP BY) by default. When the index can be used, GROUP BY does not require additional sorting operations; but when the index cannot be used for sorting, the MySQL optimizer has to choose to use a temporary table and then sort itGROUP BY. And when the size of the result set exceeds the temporary table size set by the system, MySQL will copy the temporary table data to the disk before operating, and the execution efficiency of the statement will become extremely low. This is why MySQL has chosen to deprecate this operation (implicit sorting).

    Based on the above reasons, Mysql has optimized and updated this in 8.0:

    https://dev.mysql.com/doc/refman/8.0/en/order-by-optimization.html

    Previously (MySQL 5.7 and lower), GROUP BY sorted implicitly under certain conditions. In MySQL 8.0, that no longer occurs, so specifying ORDER BY NULL at the end to suppress implicit sorting (as was done previously) is no longer necessary. However, query results may differ from previous MySQL versions. To produce a given sort order, provide an ORDER BY clause.

    A rough explanation:

    In the past (before MySQL5.7 version), Group by would perform implicit sorting based on certain conditions. In MySQL 8.0, this functionality has been removed, so it is no longer necessary to disable implicit ordering by adding order by null, however, query results may differ from previous MySQL versions. To produce results in a given order, specify the fields to be sorted by ORDER BY.

    Therefore, our conclusion also comes out:

    • In the case of the same semantics and index: group by and distinct Both can use indexes and have the same efficiency. Because group by and distinct are nearly equivalent, distinct can be regarded as a special group by.

    • In the case of the same semantics and no index: distinct is more efficient than group by. The reason is that both distinct and group by will perform grouping operations, but group by will perform implicit sorting before MySQL8.0, causing filesort to be triggered and sql execution efficiency low. However, starting from MySQL8.0, MySQL has deleted the implicit sorting. Therefore, at this time, with the same semantics and no index, the execution efficiency of group by and distinct is almost the same. equivalent.

    Compared with distinct, group by has clear semantics. And since the distinct keyword will take effect on all fields, group by is more flexible when performing composite business processing. group by can update the data according to the grouping situation. For complex processing, such as filtering data through having, or operating on data through aggregate functions.

    The above is the detailed content of How to use distinct and group by in MySQL. For more information, please follow other related articles on the PHP Chinese website!

    Statement of this Website
    The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

    Hot AI Tools

    Undress AI Tool

    Undress AI Tool

    Undress images for free

    Undresser.AI Undress

    Undresser.AI Undress

    AI-powered app for creating realistic nude photos

    AI Clothes Remover

    AI Clothes Remover

    Online AI tool for removing clothes from photos.

    Clothoff.io

    Clothoff.io

    AI clothes remover

    Video Face Swap

    Video Face Swap

    Swap faces in any video effortlessly with our completely free AI face swap tool!

    Hot Tools

    Notepad++7.3.1

    Notepad++7.3.1

    Easy-to-use and free code editor

    SublimeText3 Chinese version

    SublimeText3 Chinese version

    Chinese version, very easy to use

    Zend Studio 13.0.1

    Zend Studio 13.0.1

    Powerful PHP integrated development environment

    Dreamweaver CS6

    Dreamweaver CS6

    Visual web development tools

    SublimeText3 Mac version

    SublimeText3 Mac version

    God-level code editing software (SublimeText3)

    Hot Topics

    PHP Tutorial
    1488
    72
    How to use PHP to develop a Q&A community platform Detailed explanation of PHP interactive community monetization model How to use PHP to develop a Q&A community platform Detailed explanation of PHP interactive community monetization model Jul 23, 2025 pm 07:21 PM

    1. The first choice for the Laravel MySQL Vue/React combination in the PHP development question and answer community is the first choice for Laravel MySQL Vue/React combination, due to its maturity in the ecosystem and high development efficiency; 2. High performance requires dependence on cache (Redis), database optimization, CDN and asynchronous queues; 3. Security must be done with input filtering, CSRF protection, HTTPS, password encryption and permission control; 4. Money optional advertising, member subscription, rewards, commissions, knowledge payment and other models, the core is to match community tone and user needs.

    How to set environment variables in PHP environment Description of adding PHP running environment variables How to set environment variables in PHP environment Description of adding PHP running environment variables Jul 25, 2025 pm 08:33 PM

    There are three main ways to set environment variables in PHP: 1. Global configuration through php.ini; 2. Passed through a web server (such as SetEnv of Apache or fastcgi_param of Nginx); 3. Use putenv() function in PHP scripts. Among them, php.ini is suitable for global and infrequently changing configurations, web server configuration is suitable for scenarios that need to be isolated, and putenv() is suitable for temporary variables. Persistence policies include configuration files (such as php.ini or web server configuration), .env files are loaded with dotenv library, and dynamic injection of variables in CI/CD processes. Security management sensitive information should be avoided hard-coded, and it is recommended to use.en

    Automating MySQL Deployments with Infrastructure as Code Automating MySQL Deployments with Infrastructure as Code Jul 20, 2025 am 01:49 AM

    To achieve MySQL deployment automation, the key is to use Terraform to define resources, Ansible management configuration, Git for version control, and strengthen security and permission management. 1. Use Terraform to define MySQL instances, such as the version, type, access control and other resource attributes of AWSRDS; 2. Use AnsiblePlaybook to realize detailed configurations such as database user creation, permission settings, etc.; 3. All configuration files are included in Git management, support change tracking and collaborative development; 4. Avoid hard-coded sensitive information, use Vault or AnsibleVault to manage passwords, and set access control and minimum permission principles.

    How to use PHP to develop product recommendation module PHP recommendation algorithm and user behavior analysis How to use PHP to develop product recommendation module PHP recommendation algorithm and user behavior analysis Jul 23, 2025 pm 07:00 PM

    To collect user behavior data, you need to record browsing, search, purchase and other information into the database through PHP, and clean and analyze it to explore interest preferences; 2. The selection of recommendation algorithms should be determined based on data characteristics: based on content, collaborative filtering, rules or mixed recommendations; 3. Collaborative filtering can be implemented in PHP to calculate user cosine similarity, select K nearest neighbors, weighted prediction scores and recommend high-scoring products; 4. Performance evaluation uses accuracy, recall, F1 value and CTR, conversion rate and verify the effect through A/B tests; 5. Cold start problems can be alleviated through product attributes, user registration information, popular recommendations and expert evaluations; 6. Performance optimization methods include cached recommendation results, asynchronous processing, distributed computing and SQL query optimization, thereby improving recommendation efficiency and user experience.

    mysql revoke privileges from user mysql revoke privileges from user Jul 16, 2025 am 03:56 AM

    To recycle MySQL user permissions using REVOKE, you need to specify the permission type, database, and user by format. 1. Use REVOKEALLPRIVILEGES, GRANTOPTIONFROM'username'@'hostname'; 2. Use REVOKEALLPRIVILEGESONmydb.FROM'username'@'hostname'; 3. Use REVOKEALLPRIVILEGESONmydb.FROM'username'@'hostname'; 3. Use REVOKE permission type ON.*FROM'username'@'hostname'; Note that after execution, it is recommended to refresh the permissions. The scope of the permissions must be consistent with the authorization time, and non-existent permissions cannot be recycled.

    How to build an online customer service robot with PHP. PHP intelligent customer service implementation technology How to build an online customer service robot with PHP. PHP intelligent customer service implementation technology Jul 25, 2025 pm 06:57 PM

    PHP plays the role of connector and brain center in intelligent customer service, responsible for connecting front-end input, database storage and external AI services; 2. When implementing it, it is necessary to build a multi-layer architecture: the front-end receives user messages, the PHP back-end preprocesses and routes requests, first matches the local knowledge base, and misses, call external AI services such as OpenAI or Dialogflow to obtain intelligent reply; 3. Session management is written to MySQL and other databases by PHP to ensure context continuity; 4. Integrated AI services need to use Guzzle to send HTTP requests, safely store APIKeys, and do a good job of error handling and response analysis; 5. Database design must include sessions, messages, knowledge bases, and user tables, reasonably build indexes, ensure security and performance, and support robot memory

    Securing MySQL Connections with SSL/TLS Encryption Securing MySQL Connections with SSL/TLS Encryption Jul 21, 2025 am 02:08 AM

    Why do I need SSL/TLS encryption MySQL connection? Because unencrypted connections may cause sensitive data to be intercepted, enabling SSL/TLS can prevent man-in-the-middle attacks and meet compliance requirements; 2. How to configure SSL/TLS for MySQL? You need to generate a certificate and a private key, modify the configuration file to specify the ssl-ca, ssl-cert and ssl-key paths and restart the service; 3. How to force SSL when the client connects? Implemented by specifying REQUIRESSL or REQUIREX509 when creating a user; 4. Details that are easily overlooked in SSL configuration include certificate path permissions, certificate expiration issues, and client configuration requirements.

    How to develop AI intelligent form system with PHP PHP intelligent form design and analysis How to develop AI intelligent form system with PHP PHP intelligent form design and analysis Jul 25, 2025 pm 05:54 PM

    When choosing a suitable PHP framework, you need to consider comprehensively according to project needs: Laravel is suitable for rapid development and provides EloquentORM and Blade template engines, which are convenient for database operation and dynamic form rendering; Symfony is more flexible and suitable for complex systems; CodeIgniter is lightweight and suitable for simple applications with high performance requirements. 2. To ensure the accuracy of AI models, we need to start with high-quality data training, reasonable selection of evaluation indicators (such as accuracy, recall, F1 value), regular performance evaluation and model tuning, and ensure code quality through unit testing and integration testing, while continuously monitoring the input data to prevent data drift. 3. Many measures are required to protect user privacy: encrypt and store sensitive data (such as AES

    See all articles