The number of daily active drivers has continued to increase since its inception
P粉434996845
P粉434996845 2024-03-30 22:52:01
0
2
318

I have a list of drivers, orders and dates for the period January 1, 2022 to January 15, 2022 (15 days) in a table named all_data as shown below:

driver_id order_id Order Date
1 one 2022-01-01
1 b 2022-01-02
2 c 2022-01-01
2 d 2022-01-03

Within these 15 days, how do I find the number of continuously active drivers who have completed at least one order per day as of that date? The output should be a table like this:

Order Date active_drivers
2022-01-01 30
2022-01-02 27
2022-01-03 25

For example, on January 1, 2022, 30 different drivers completed at least one order that day. On January 2, 2022, we have to find the number of unique drivers who fulfilled at least one order on January 1, 2022, and January 2, 2022. On January 3, 2022, we must count drivers who completed at least one order on January 1, 2022, January 2, 2022, and January 3, 2022.

What I tried

I found a similar solution in MySQL (below), but it is not allowed in bigquery because of the error "Unsupported subquery with table in join predicate".

MySQL
SELECT order_date, 
 (SELECT COUNT(distinct s1.driver_id) as num_hackers 
   FROM all_data s2 
   join all_data s1 
   on s2. order_date = s1. order_date and     
     (SELECT COUNT(distinct s3. order_date) 
      FROM all_data s3 
      WHERE s3.driver_id = s2.driver_id 
        AND s3. order_date < s1. order_date) 
     = datediff(s1. order_date, date('2022-01-01'), day)
))
from all_data

I've also read this Google BigQuery: Rolling Count Distinct question, but that's for a fixed 45 days, whereas the number of days here is a date-based variable. How to write a query in BigQuerySQL to find the rolling number of drivers that are continuously active per day?

P粉434996845
P粉434996845

reply all(2)
P粉757556355

First find all combinations of date and driver, then get the count of all drivers for each date. Try this:

select order_date, count(*)
from(
    select order_date, driver_id, count(*)
    from all_data ad
    group by order_date, driver_id)
group by order_date
P粉439804514

Consider the following

select order_date, count(distinct if(flag, driver_id, null)) active_drivers
from (
  select order_date, driver_id, 
    row_number() over(partition by driver_id order by order_date) - 
    date_diff(order_date, min(order_date) over(), day) = 1 as flag
  from (select distinct order_date, driver_id from all_data)
)
group by order_date
Latest Downloads
More>
Web Effects
Website Source Code
Website Materials
Front End Template
About us Disclaimer Sitemap
php.cn:Public welfare online PHP training,Help PHP learners grow quickly!