Home  >  Article  >  Database  >  Detailed explanation of SQL window functions: the use of ranking window functions

Detailed explanation of SQL window functions: the use of ranking window functions

WBOY
WBOYforward
2022-09-08 17:44:472115browse

This article brings you relevant knowledge about SQL server, which mainly introduces SQL Server primary key constraints (PRIMARY KEY). The primary key is a column or a group that uniquely identifies each row in the table. Column, the article expands on the topic in detail. Let’s take a look at it together. I hope it will be helpful to everyone.

Detailed explanation of SQL window functions: the use of ranking window functions

Recommended study: "SQL Tutorial"

For the basics of window functions, please see the article SQL Window Functions

The value window function can be used to return the data row at the specified position within the window. Common value window functions are as follows:

LAG function can return the Nth row of data before the current row in the window. The LEAD function can return the Nth row of data after the current row in the window. The FIRST_VALUE function can return the first row of data in the window. The LAST_VALUE function can return the last row of data in the window. The NTH_VALUE function can return the Nth row of data in the window.

Among them, the LAG function and LEAD function do not support dynamic window size. They use the entire partition as the analysis window.

Case analysis

Example table used in the case

The following query will use a table. The sales_monthly table stores product sales information, and product represents the product name. ym represents the year and month, and amount represents the sales amount (yuan).

The following is some data in the table:

The initialization script of this table can be obtained at the bottom of the article.

1. Period-on-month analysis

Period-on-month growth refers to the growth of the current period's data compared with the previous period's data. For example, the sales of a product in June 2019 and the sales in May 2019 Compared to the increased portion.

The following statement counts the month-on-month growth rate of various products:

SELECT s.product AS "产品", s.ym AS "年月", s.amount AS "销售额",
 ( 
    (s.amount - LAG(s.amount,1) OVER (PARTITION BY product ORDER BY s.ym))/
    LAG(s.amount,1) OVER (PARTITION BY product ORDER BY s.ym)
 ) * 100 AS "环比增长率(%)"
FROM sales_monthly s
ORDER BY s.product,s.ym

Among them, LAG (amount, 1) means to obtain the sales volume of the previous period, and the PARTITION BY option means according to For product partitioning, the ORDER BY option indicates sorting by month.

The sales amount of the current month minus the sales amount of the previous period, and then divided by the sales amount of the previous period, is the month-on-month growth rate.

The results returned by this query are as follows:

#January 2018 is the first period, so its month-on-month growth rate is empty.

The month-on-month growth rate of "Orange" in February 2018 was approximately 0.2856% ((10183-10154)/10154×100), and so on.

2. Year-on-year analysis

Year-over-year growth refers to the growth of the current period's data compared with the previous year or the same period in history. For example, the sales of a product in June 2019 were compared with those in June 2018. The increase compared to monthly sales.

The following statement counts the year-on-year growth rate of various products every month:

SELECT s.product AS "产品", s.ym AS "年月", s.amount AS "销售额",
 ( 
    (s.amount - LAG(s.amount,12) OVER (PARTITION BY product ORDER BY s.ym))/
    LAG(s.amount,12) OVER (PARTITION BY product ORDER BY s.ym)
 ) * 100 AS "同比增长率(%)"
FROM sales_monthly s
ORDER BY s.product,s.ym

Among them, LAG (amount, 12) represents the sales volume of the 12th period before the current month, that is, last year sales in the same month.

PARTITION BY option indicates partitioning by product, and ORDER BY option indicates sorting by month.

The sales amount of the current month minus the sales of the same period last year, divided by the sales of the same period last year, is the year-on-year growth rate.

The results returned by this query are as follows:

#There is no corresponding year-on-year growth rate for the 12 periods of data in 2018. "Orange" in January 2019 The year-on-year growth rate is approximately 9.3067% ((11099-10154)/10154×100), and so on.

Tip: The LEAD function is similar to the LAG function, but its return result is the Nth row of data after the current row.

3. Compound growth rate

The compound growth rate is the result of dividing the data of the Nth period by the benchmark data of the first period, then raising it to the power of N-1 and subtracting 1.

Suppose the product sales in 2018 are 10,000, the product sales in 2019 are 12,500, and the product sales in 2020 are 15,000. Then the compound growth rate of these two years is calculated as follows:

The compound growth rate calculated on an annual basis is called average annual compound growth rate, the compound growth rate calculated on a monthly basis is called average monthly compound growth rate.

The following query counts the compound growth rate of average monthly sales of different products since January 2018:

WITH s (product,ym,amount,first_amount,num) AS (
  SELECT m.product, m.ym, m.amount,
  FIRST_VALUE(m.amount) OVER (PARTITION BY m.product ORDER BY m.ym),
  ROW_NUMBER() OVER (PARTITION BY m.product ORDER BY m.ym)
  FROM sales_monthly m
)
 
SELECT product AS "产品", ym AS "年月",amount AS "销售额",
       (POWER( amount/first_amount, 1.0/NULLIF(num-1,0)) -1)*100 AS "月均复合增长率(%)"
FROM s
ORDER BY product, ym

First defines a general table expression, in which FIRST_VALUE (amount) returns For the sales volume of the first period (201801), the ROW_NUMBER function returns the number of each period.

The POWER function in the main query is used to perform square root operations, the NULLIF function is used to handle division-by-zero errors in the first period of data, and the constant 1.0 is used to avoid precision loss caused by integer division.

The results returned by this query are as follows:

January 2018 is the first period, so the compound growth rate of its product monthly average sales is empty.

“桔子”2018年2月的月均销售额复合增长率等于它的环比增长率,2018年3月的月均销售额复合增长率等于0.4471%,依此类推。 

4.不同产品最高和最低销售额

以下语句统计了不同产品最低销售额、最高销售额以及第三高销售额所在的月份:

  SELECT product AS "产品", ym AS "年月",amount AS "销售额",
  
         FIRST_VALUE(m.ym) OVER (
           PARTITION BY m.product ORDER BY m.amount DESC
           ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
         ) AS "最高销售额月份",
         
         LAST_VALUE(m.ym) OVER (
           PARTITION BY m.product ORDER BY m.amount DESC
           ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
         ) AS "最低销售额月份",
         
         NTH_VALUE(m.ym,3) OVER (
           PARTITION BY m.product ORDER BY m.amount DESC
           ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
         ) AS "第三高销售额月份"
 
  FROM sales_monthly m
  ORDER BY product, ym;

三个窗口函数的OVER子句相同,PARTITION BY选项表示按照产品进行分区,ORDER BY选项表示按照销售额从高到低排序。

以上三个函数的默认窗口都是从分区的第一行到当前行,因此我们将窗口扩展到了整个分区。

该查询返回的结果如下: 

“桔子”的最高销售额出现在2019年6月,最低销售额出现在2018年1月,第三高销售额出现在2019年4月。

示例表和脚本

-- 创建销量表sales_monthly
-- product表示产品名称,ym表示年月,amount表示销售金额(元)
CREATE TABLE sales_monthly(product VARCHAR(20), ym VARCHAR(10), amount NUMERIC(10, 2));
 
-- 生成测试数据
INSERT INTO sales_monthly (product,ym,amount) VALUES ('苹果','201801',10159.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('苹果','201802',10211.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('苹果','201803',10247.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('苹果','201804',10376.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('苹果','201805',10400.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('苹果','201806',10565.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('苹果','201807',10613.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('苹果','201808',10696.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('苹果','201809',10751.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('苹果','201810',10842.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('苹果','201811',10900.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('苹果','201812',10972.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('苹果','201901',11155.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('苹果','201902',11202.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('苹果','201903',11260.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('苹果','201904',11341.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('苹果','201905',11459.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('苹果','201906',11560.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('香蕉','201801',10138.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('香蕉','201802',10194.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('香蕉','201803',10328.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('香蕉','201804',10322.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('香蕉','201805',10481.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('香蕉','201806',10502.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('香蕉','201807',10589.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('香蕉','201808',10681.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('香蕉','201809',10798.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('香蕉','201810',10829.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('香蕉','201811',10913.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('香蕉','201812',11056.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('香蕉','201901',11161.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('香蕉','201902',11173.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('香蕉','201903',11288.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('香蕉','201904',11408.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('香蕉','201905',11469.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('香蕉','201906',11528.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('桔子','201801',10154.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('桔子','201802',10183.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('桔子','201803',10245.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('桔子','201804',10325.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('桔子','201805',10465.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('桔子','201806',10505.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('桔子','201807',10578.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('桔子','201808',10680.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('桔子','201809',10788.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('桔子','201810',10838.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('桔子','201811',10942.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('桔子','201812',10988.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('桔子','201901',11099.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('桔子','201902',11181.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('桔子','201903',11302.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('桔子','201904',11327.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('桔子','201905',11423.00);
INSERT INTO sales_monthly (product,ym,amount) VALUES ('桔子','201906',11524.00);

推荐学习:《SQL教程

The above is the detailed content of Detailed explanation of SQL window functions: the use of ranking window functions. For more information, please follow other related articles on the PHP Chinese website!

Statement:
This article is reproduced at:jb51.net. If there is any infringement, please contact admin@php.cn delete