
Efficiency problems with RAND() for random record queries in MySQL, and solutions

WBOY
2016-07-25 09:03:43
    # Create a table holding the specified range of values
    # author: Xiaoqiang
    # date: 2008-03-31
    create table randnumber
    select -1 as number
    union
    select -2
    union
    select -3
    union
    select -4
    union
    select -5
    union
    select 0
    union
    select 1
    union
    select 2
    union
    select 3
    union
    select 4
    union
    select 5;

    # Get a random number
    # author: Xiaoqiang (divineer)
    # date: 2008-03-31
    select number
    from randnumber order by rand() limit 1;

Advantages: the candidate values can be any chosen set and do not need to be contiguous. Disadvantages: when the range of values is very wide, building the table becomes tedious.

2. Use MySQL's ROUND() together with RAND()

    # One SQL statement does it
    SELECT ROUND((0.5 - RAND()) * 2 * 5);
    # Notes
    # 0.5 - RAND() gives a random number between -0.5 and +0.5
    # (0.5 - RAND()) * 2 gives a random number between -1 and +1
    # (0.5 - RAND()) * 2 * 5 gives a random number between -5 and +5
    # ROUND((0.5 - RAND()) * 2 * 5) gives a random integer between -5 and +5
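One caveat with the ROUND() form: RAND() is uniform on [0, 1), so the scaled value lies in (-5, 5], and rounding leaves the endpoints -5 and +5 with only half the interval width of the other integers. The Python sketch below simulates this, with `random.random()` standing in for RAND(). The function names are illustrative and not from the original article. For comparison it also simulates FLOOR(RAND() * 11) - 5, where all eleven integers cover equal-width intervals.

```python
import math
import random
from collections import Counter

def round_form(rng):
    # Mirrors ROUND((0.5 - RAND()) * 2 * 5); exact .5 ties are vanishingly rare.
    return round((0.5 - rng.random()) * 2 * 5)

def floor_form(rng):
    # Mirrors FLOOR(RAND() * 11) - 5: eleven equal-width buckets.
    return math.floor(rng.random() * 11) - 5

def tally(draw, n=200_000, seed=1):
    # Count how often each integer comes up in n simulated draws.
    rng = random.Random(seed)
    return Counter(draw(rng) for _ in range(n))

round_counts = tally(round_form)
floor_counts = tally(floor_form)
for value in range(-5, 6):
    print(value, round_counts[value], floor_counts[value])
```

In the printed table the ROUND() column should show roughly half as many hits for -5 and +5 as for the other values, while the FLOOR() column stays roughly flat.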

However, I then checked the official MySQL manual. Its note on RAND() says, roughly, that RAND() should not be used in an ORDER BY clause, because the column would be evaluated multiple times; in MySQL 3.23, though, random ordering can still be achieved with ORDER BY RAND(). In a real test I found this very inefficient: on a table with more than 150,000 rows, fetching 5 rows took more than 8 seconds. That matches the manual's remark that RAND() in an ORDER BY clause is executed many times.
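Conceptually, ORDER BY RAND() LIMIT 5 has to produce a random value for every row and order the whole table by it before the limit can apply, so the cost grows with the table size. The sketch below is a rough Python model of that work, not a description of MySQL internals. The list `rows` is a hypothetical stand-in for the table.

```python
import random

def order_by_rand_limit(rows, k, rng):
    """Model of ORDER BY RAND() LIMIT k: one random key per row, then a full sort."""
    keyed = [(rng.random(), row) for row in rows]  # touches every row
    keyed.sort(key=lambda pair: pair[0])           # orders every row
    return [row for _, row in keyed[:k]]

rows = list(range(1, 150_001))  # hypothetical table of 150,000 ids
sample = order_by_rand_limit(rows, 5, random.Random(42))
print(sample)
```

The five rows are well spread, but all 150,000 rows are processed to return them.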

Most solutions found online pick random rows with MAX(id) * RAND().

    SELECT * FROM `table` AS t1 JOIN (SELECT ROUND(RAND() * (SELECT MAX(id) FROM `table`)) AS id) AS t2 WHERE t1.id >= t2.id ORDER BY t1.id ASC LIMIT 5;

But this returns 5 consecutive records. The only fix is to fetch one row per query and run the query 5 times. Even so, it is worth it, because querying a table with 150,000 rows takes less than 0.01 seconds. The statement above uses JOIN; someone on the MySQL forum uses the following instead.

    SELECT * FROM `table` WHERE id >= (SELECT FLOOR(MAX(id) * RAND()) FROM `table`) ORDER BY id LIMIT 1;

I tested it: it takes 0.5 seconds. That is decent, but still far slower than the statement above, and something did not seem right. So I rewrote the statement.

    SELECT * FROM `table`
    WHERE id >= (SELECT FLOOR(RAND() * (SELECT MAX(id) FROM `table`)))
    ORDER BY id LIMIT 1;

Now the efficiency has improved again: the query takes only 0.01 seconds. Finally, refine the statement by taking MIN(id) into account. In my first tests I had left MIN(id) out, and as a result about half of the queries returned the first few rows of the table. The complete query statements:

    SELECT * FROM `table`
    WHERE id >= (SELECT FLOOR(RAND() * ((SELECT MAX(id) FROM `table`) - (SELECT MIN(id) FROM `table`)) + (SELECT MIN(id) FROM `table`)))
    ORDER BY id LIMIT 1;

    SELECT *
    FROM `table` AS t1 JOIN (SELECT ROUND(RAND() * ((SELECT MAX(id) FROM `table`) - (SELECT MIN(id) FROM `table`)) + (SELECT MIN(id) FROM `table`)) AS id) AS t2
    WHERE t1.id >= t2.id
    ORDER BY t1.id LIMIT 1;
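The effect of the MIN(id) term can be checked with a small simulation. The Python sketch below models `WHERE id >= threshold ORDER BY id LIMIT 1` as a binary search on a sorted list. The table contents (ids 1000 to 1999) and the function names are assumptions made for illustration only.

```python
import bisect
import math
import random
from collections import Counter

IDS = list(range(1000, 2000))  # hypothetical table whose ids start at 1000

def pick(ids, rng, use_min):
    """First id >= a random threshold, like WHERE id >= ... ORDER BY id LIMIT 1."""
    lo = ids[0] if use_min else 0   # MIN(id), or 0 when the MIN term is left out
    hi = ids[-1]                    # MAX(id)
    threshold = math.floor(rng.random() * (hi - lo) + lo)
    return ids[bisect.bisect_left(ids, threshold)]

def first_row_share(use_min, n=20_000, seed=7):
    # Fraction of n simulated queries that return the first row of the table.
    rng = random.Random(seed)
    counts = Counter(pick(IDS, rng, use_min) for _ in range(n))
    return counts[IDS[0]] / n

print("without MIN(id):", first_row_share(False))
print("with MIN(id):   ", first_row_share(True))
```

Without the MIN term, about half of the thresholds fall below the smallest id and all of them land on the first row. With it, the first row is no more likely than its neighbours. One more detail of the FLOOR() form: the threshold never reaches MAX(id), so in a gap-free table the last row is never returned, whereas the ROUND() used in the JOIN version can reach it.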

Finally, I ran each of these two statements 10 times from PHP. The former took 0.147433 seconds; the latter took 0.015130 seconds. After many tests, the conclusion is that the JOIN syntax is much faster than calling the functions directly in WHERE. Readers who have a better approach are welcome to share it.
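Since each of these statements returns a single row, fetching five random rows means running the pick five times, and repeated picks can return the same row. The Python sketch below models one way to handle that: repeat the single-row pick until enough distinct ids are collected. The retry loop, the sample ids and the function names are assumptions, not part of the original statements.

```python
import bisect
import random

IDS = [1, 2, 3, 7, 8, 20, 21, 22, 40, 41]  # hypothetical ids, with gaps

def pick_one(ids, rng):
    """First id >= a random threshold between MIN(id) and MAX(id) (ROUND variant)."""
    lo, hi = ids[0], ids[-1]
    threshold = round(rng.random() * (hi - lo) + lo)
    return ids[bisect.bisect_left(ids, threshold)]

def pick_distinct(ids, k, rng):
    """Repeat the single-row pick until k different ids have been collected."""
    if k > len(ids):
        raise ValueError("k exceeds the number of rows")
    chosen = set()
    while len(chosen) < k:
        chosen.add(pick_one(ids, rng))
    return sorted(chosen)

result = pick_distinct(IDS, 5, random.Random(3))
print(result)
```

Note that ids sitting right after a gap (7, 20 and 40 here) absorb every threshold that falls inside the gap, so they are returned more often than their neighbours. The method is fast, but only uniform when the ids are contiguous.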


