Table of Contents
(1) Definition of windowing function
(2) Window function Practical application scenarios
Home Database Mysql Tutorial How to use windowing functions in MySQL

How to use windowing functions in MySQL

May 30, 2023 pm 03:10 PM
mysql

(1) Definition of windowing function

Windowing function is also called OLAP function (Online Analytical Processing, online analytical processing), which is mainly used to analyze and process data in real time. Before MySQL version 8.0, windowing functions were not supported, but support for windowing functions has been provided since this version.

# 开窗函数语法 
func_name(<parameter>) 
OVER([PARTITION BY <part_by_condition>] 
[ORDER BY <order_by_list> ASC|DESC])

Window function statement analysis:
The function is divided into two parts, one part is the function name, the number of window functions is relatively small, there are only 11 window function aggregate functions in total (all aggregate functions are Can be used as a windowing function). Depending on the nature of the function, some need to write parameters and some do not.

The other part is the over statement. over() must be written. The parameters inside are all optional and can be used selectively according to needs:

  • Chapter One parameter is the partition by field, which means dividing the data set into multiple parts based on this field

  • The second parameter is the order by field, the data of each window is based on this field Arrange in ascending or descending order

How to use windowing functions in MySQL

The windowing function is similar to the grouping aggregation function. They both divide the data into multiple parts by specifying fields. The difference is:

  • The SQL standard allows all aggregate functions to be used as window functions, using the OVER keyword to distinguish window functions from aggregate functions.

  • Aggregation functions only return one value per group, while window functions can return multiple values ​​per group.

Among these 11 windowing functions, the three sorting functions ROW_NUMBER(), RANK(), and DENSE_RANK() are most commonly used in actual work. Let's learn these three windowing functions through a simple data set.

# 首先创建虚拟的业务员销售数据 
CREATE TABLE Sales
( 
idate date, 
iname char(2), 
sales int
); 
# 向表中插入数据 
INSERT INTO Sales VALUES 
(&#39;2021/1/1&#39;, &#39;丁一&#39;, 200), 
(&#39;2021/2/1&#39;, &#39;丁一&#39;, 180), 
(&#39;2021/2/1&#39;, &#39;李四&#39;, 100), 
(&#39;2021/3/1&#39;, &#39;李四&#39;, 150), 
(&#39;2021/2/1&#39;, &#39;刘猛&#39;, 180), 
(&#39;2021/3/1&#39;, &#39;刘猛&#39;, 150), 
(&#39;2021/1/1&#39;, &#39;王二&#39;, 200), 
(&#39;2021/2/1&#39;, &#39;王二&#39;, 180), 
(&#39;2021/3/1&#39;, &#39;王二&#39;, 300), 
(&#39;2021/1/1&#39;, &#39;张三&#39;, 300), 
(&#39;2021/2/1&#39;, &#39;张三&#39;, 280), 
(&#39;2021/3/1&#39;, &#39;张三&#39;, 280); 
# 数据查询 
SELECT * FROM Sales; 
# 查询各月中销售业绩最差的业务员
SELECT month(idate),iname,sales, 
	ROW_NUMBER() 
	OVER(PARTITION BY month(idate) 
			 ORDER BY sales) as sales_order 
FROM Sales;

SELECT * FROM 
(SELECT month(idate),iname,sales, 
	 ROW_NUMBER() 
	 OVER(PARTITION BY month(idate) 
   ORDER BY sales) as sales_order FROM Sales) as t
WHERE sales_order=1;

How to use windowing functions in MySQL

# ROW_NUMBER()、RANK()、DENSE_RANK()的区别 
SELECT * FROM 
(SELECT month(idate) as imonth,iname,sales, 
ROW_NUMBER() 
OVER(PARTITION BY month(idate) ORDER BY sales) as row_order,
RANK() 
OVER(PARTITION BY month(idate) ORDER BY sales) as rank_order, 
DENSE_RANK() 
OVER(PARTITION BY month(idate) ORDER BY sales) as dense_order 
FROM Sales) as t;

How to use windowing functions in MySQL

ROW_NUMBER(): Sequential sorting——1, 2, 3
RANK(): Parallel sorting , skip repeated serial numbers - 1, 1, 3
DENSE_RANK(): sort in parallel, do not skip repeated serial numbers - 1, 1, 2

(2) Window function Practical application scenarios

In work or interviews, you may encounter situations where you need to ask users for the number of consecutive login days or check-in days. The following provides an idea for using windowing functions to solve such problems.

# 首先创建虚拟的用户登录表,并插入数据 
create table user_login
( 
user_id varchar(100), 
login_time datetime
); 

insert into user_login values 
(1,&#39;2020-11-25 13:21:12&#39;), 
(1,&#39;2020-11-24 13:15:22&#39;), 
(1,&#39;2020-11-24 10:30:15&#39;), 
(1,&#39;2020-11-24 09:18:27&#39;), 
(1,&#39;2020-11-23 07:43:54&#39;), 
(1,&#39;2020-11-10 09:48:36&#39;), 
(1,&#39;2020-11-09 03:30:22&#39;), 
(1,&#39;2020-11-01 15:28:29&#39;), 
(1,&#39;2020-10-31 09:37:45&#39;), 
(2,&#39;2020-11-25 13:54:40&#39;), 
(2,&#39;2020-11-24 13:22:32&#39;), 
(2,&#39;2020-11-23 10:55:52&#39;), 
(2,&#39;2020-11-22 06:30:09&#39;), 
(2,&#39;2020-11-21 08:33:15&#39;), 
(2,&#39;2020-11-20 05:38:18&#39;), 
(2,&#39;2020-11-19 09:21:42&#39;), 
(2,&#39;2020-11-02 00:19:38&#39;), 
(2,&#39;2020-11-01 09:03:11&#39;), 
(2,&#39;2020-10-31 07:44:55&#39;), 
(2,&#39;2020-10-30 08:56:33&#39;), 
(2,&#39;2020-10-29 09:30:28&#39;); 
# 查看数据 
SELECT * FROM user_login;

There are usually three situations in calculating the number of consecutive login days:

  • View the continuous login status of each user

  • View the maximum number of consecutive login days for each user

  • View users who have logged in for more than N days in a certain period

For The first situation: Check the continuous login situation of each user
Based on actual experience, we know that within a period of time, users may have multiple consecutive logins. We have to output this information, so the fields of the final result output can be User ID, first login date, end login date, and number of consecutive login days.

# 数据预处理:由于统计的窗口期是天数,所以可以对登录时间字段进行格式转换,将其变成日期格式然后再去重(去掉用户同一天内多次登录的情况) 
# 为方便后续代码查看,将处理结果放置新表中,一步一步操作 
create table user_login_date(
select distinct user_id, date(login_time) login_date from user_login);
# 处理后的数据如下: 
select * from user_login_date;

# 第一种情况:查看每位用户连续登陆的情况 
# 对用户登录数据进行排序 
create table user_login_date_1( 
select *,
rank() over(partition by user_id order by login_date) irank 
from user_login_date); 
#查看结果 
select * from user_login_date_1;
 
# 增加辅助列,帮助判断用户是否连续登录 
create table user_login_date_2( 
select *,
date_sub(login_date, interval irank DAY) idate  #data_sub从指定的日期减去指定的时间间隔
from user_login_date_1); 
# 查看结果 
select * from user_login_date_2; 

# 计算每位用户连续登录天数 
select user_id, 
min(login_date) as start_date, 
max(login_date) as end_date, 
count(login_date) as days 
from user_login_date_2 
group by user_id,idate;

# ===============【整合代码,解决用户连续登录问题】=================== 
select user_id, 
       min(login_date) start_date, 
       max(login_date) end_date, 
       count(login_date) days 
from (select *,date_sub(login_date, interval irank day) idate 
from (select *,rank() over(partition by user_id order by login_date) irank 
from (select distinct user_id, date(login_time) login_date from user_login) as a) as b) as c 
group by user_id,idate;

For the second case: Check the maximum number of consecutive login days for each user

# 计算每个用户最大连续登录天数 
select user_id,max(days) from 
(select user_id, 
			 min(login_date) start_date, 
			 max(login_date) end_date, 
			 count(login_date) days 
from (select *,date_sub(login_date, interval irank day) idate 
from (select *,rank() over(partition by user_id order by login_date) irank 
from (select distinct user_id, date(login_time) login_date from user_login) as a) as b) as c 
group by user_id,idate) as d 
group by user_id;

For the third case: Check for users who have logged in for more than N days in a certain period of time

If we need to view users who logged in for 5 or more consecutive days between October 29th and November 25th, how to achieve this? . This requirement can also be filtered using the results of the query in the first case.

# 查看在这段时间内连续登录天数≥5天的用户 
select distinct user_id from 
(select user_id, 
		min(login_date) start_date, 
		max(login_date) end_date, 
		count(login_date) days 
from (select *,date_sub(login_date, interval irank day) idate 
from (select *,rank() over(partition by user_id order by login_date) irank 
from (select distinct user_id, date(login_time) login_date from user_login) as a) as b) as c 
group by user_id,idate 
having days>=5
) as d;

This way of writing can get results, but it is a bit troublesome for this problem. Here is a simple method: refer to a new static window function lead()

select *, 
lead(login_date,4) over(partition by user_id order by login_date) as idate5 
from user_login_date;

The lead function has three parameters. The first parameter is the specified column (the login date is used here), and the second parameter is the value of several rows after the current row. Here, 4 is used, which is the date of the fifth login. The third parameter is if the returned null value can be replaced with the specified value, the third parameter is not used here. In the over clause, windows are grouped by user_id, and the data in each window is arranged in ascending order by login date.

Use the fifth login date - login_date 1. If it is equal to 5, it means that you have logged in for five consecutive days. If you get a null value or greater than 5, it means that you have not logged in for five consecutive days. The code and results are as follows:

# 计算第5次登录日期与当天的差值 
select *,datediff(idate5,login_date)+1 days 
from (select *,lead(login_date,4) over(partition by user_id order by login_date) idate5
from user_login_date) as a; 
# 找出相差天数为5的记录 
select distinct user_id 
from (select *,datediff(idate5,login_date)+1 as days 
from (select *,lead(login_date,4) over(partition by user_id order by login_date) idate5 
from user_logrin_date) as a)as b 
where days = 5;

[Exercise] Meituan takeout platform data analysis interview questions--SQL
The existing transaction data table user_goods_table is as follows:

How to use windowing functions in MySQL

Now the boss wants to Know the preference distribution of takeaway categories purchased by each user, and find out which takeaway category each user purchases the most.

# 分析题目:要求输出字段为用户名user_name,该用户购买最多的外卖品类goods_kind 
# 解题思路:这是一个分组排序的问题,可以考虑窗口函数 
# 第一步:使用窗口函数row_number(),对每个用户购买的外卖品类进行分组统计与排名
select user_name,goods_kind,count(goods_kind),
rank() over (partition by user_name order by count(goods_kind) desc) as irank
from user_goods_table
group by user_name,goods_kind;

# 第二步:筛选出每个用户排名第一的外卖品类
select user_id,goods_kind from 
(select user_name,goods_kind,count(goods_kind),
rank() over (partition by user_name order by count(goods_kind) desc) as irank
from user_goods_table
group by user_name,goods_kind) as a 
where irank=1

The above is the detailed content of How to use windowing functions in MySQL. For more information, please follow other related articles on the PHP Chinese website!

Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undress AI Tool

Undress AI Tool

Undress images for free

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Clothoff.io

Clothoff.io

AI clothes remover

Video Face Swap

Video Face Swap

Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Tools

Notepad++7.3.1

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

SublimeText3 Mac version

God-level code editing software (SublimeText3)

Hot Topics

PHP Tutorial
1598
276
How to show all databases in MySQL How to show all databases in MySQL Aug 08, 2025 am 09:50 AM

To display all databases in MySQL, you need to use the SHOWDATABASES command; 1. After logging into the MySQL server, you can execute the SHOWDATABASES; command to list all databases that the current user has permission to access; 2. System databases such as information_schema, mysql, performance_schema and sys exist by default, but users with insufficient permissions may not be able to see it; 3. You can also query and filter the database through SELECTSCHEMA_NAMEFROMinformation_schema.SCHEMATA; for example, excluding the system database to only display the database created by users; make sure to use

How to add a primary key to an existing table in MySQL? How to add a primary key to an existing table in MySQL? Aug 12, 2025 am 04:11 AM

To add a primary key to an existing table, use the ALTERTABLE statement with the ADDPRIMARYKEY clause. 1. Ensure that the target column has no NULL value, no duplication and is defined as NOTNULL; 2. The single-column primary key syntax is ALTERTABLE table name ADDPRIMARYKEY (column name); 3. The multi-column combination primary key syntax is ALTERTABLE table name ADDPRIMARYKEY (column 1, column 2); 4. If the column allows NULL, you must first execute MODIFY to set NOTNULL; 5. Each table can only have one primary key, and the old primary key must be deleted before adding; 6. If you need to increase it yourself, you can use MODIFY to set AUTO_INCREMENT. Ensure data before operation

How to Troubleshoot Common MySQL Connection Errors? How to Troubleshoot Common MySQL Connection Errors? Aug 08, 2025 am 06:44 AM

Check whether the MySQL service is running, use sudosystemctlstatusmysql to confirm and start; 2. Make sure that bind-address is set to 0.0.0.0 to allow remote connections and restart the service; 3. Verify whether the 3306 port is open, check and configure the firewall rules to allow the port; 4. For the "Accessdenied" error, you need to check the user name, password and host name, and then log in to MySQL and query the mysql.user table to confirm permissions. If necessary, create or update the user and authorize it, such as using 'your_user'@'%'; 5. If authentication is lost due to caching_sha2_password

How to back up a database in MySQL How to back up a database in MySQL Aug 11, 2025 am 10:40 AM

Using mysqldump is the most common and effective way to back up MySQL databases. It can generate SQL scripts containing table structure and data. 1. The basic syntax is: mysqldump-u[user name]-p[database name]>backup_file.sql. After execution, enter the password to generate a backup file. 2. Back up multiple databases with --databases option: mysqldump-uroot-p--databasesdb1db2>multiple_dbs_backup.sql. 3. Back up all databases with --all-databases: mysqldump-uroot-p

Explain database indexing strategies (e.g., B-Tree, Full-text) for a MySQL-backed PHP application. Explain database indexing strategies (e.g., B-Tree, Full-text) for a MySQL-backed PHP application. Aug 13, 2025 pm 02:57 PM

B-TreeindexesarebestformostPHPapplications,astheysupportequalityandrangequeries,sorting,andareidealforcolumnsusedinWHERE,JOIN,orORDERBYclauses;2.Full-Textindexesshouldbeusedfornaturallanguageorbooleansearchesontextfieldslikearticlesorproductdescripti

What is the difference between UNION and UNION ALL in MySQL? What is the difference between UNION and UNION ALL in MySQL? Aug 14, 2025 pm 05:25 PM

UNIONremovesduplicateswhileUNIONALLkeepsallrowsincludingduplicates;1.UNIONperformsdeduplicationbysortingandcomparingrows,returningonlyuniqueresults,whichmakesitsloweronlargedatasets;2.UNIONALLincludeseveryrowfromeachquerywithoutcheckingforduplicates,

How to change the GROUP_CONCAT separator in MySQL How to change the GROUP_CONCAT separator in MySQL Aug 22, 2025 am 10:58 AM

You can customize the separator by using the SEPARATOR keyword in the GROUP_CONCAT() function; 1. Use SEPARATOR to specify a custom separator, such as SEPARATOR'; 'The separator can be changed to a semicolon and plus space; 2. Common examples include using the pipe character '|', space'', line break character '\n' or custom string '->' as the separator; 3. Note that the separator must be a string literal or expression, and the result length is limited by the group_concat_max_len variable, which can be adjusted by SETSESSIONgroup_concat_max_len=10000; 4. SEPARATOR is optional

How to lock tables in MySQL How to lock tables in MySQL Aug 15, 2025 am 04:04 AM

The table can be locked manually using LOCKTABLES. The READ lock allows multiple sessions to read but cannot be written. The WRITE lock provides exclusive read and write permissions for the current session and other sessions cannot read and write. 2. The lock is only for the current connection. Execution of STARTTRANSACTION and other commands will implicitly release the lock. After locking, it can only access the locked table; 3. Only use it in specific scenarios such as MyISAM table maintenance and data backup. InnoDB should give priority to using transaction and row-level locks such as SELECT...FORUPDATE to avoid performance problems; 4. After the operation is completed, UNLOCKTABLES must be explicitly released, otherwise resource blockage may occur.

See all articles