Traditional LIMIT m, n pagination takes these steps:
1. Run SELECT COUNT(*) FROM table WHERE condition ORDER BY ... to get the total row count and work out how many pages there are;
2. Run SELECT columns FROM table WHERE condition ORDER BY ... LIMIT 0, 100 to show the first page (assuming 100 rows per page); with the SQL_CALC_FOUND_ROWS modifier this can be merged with the previous statement into a single query;
3. When the user clicks "next page", run SELECT columns FROM table WHERE condition ORDER BY ... LIMIT 100, 100;
...
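To make the flow concrete, steps 1 and 2 merged via SQL_CALC_FOUND_ROWS would look roughly like the sketch below (the table and column names are invented for illustration and are not from the question):

-- first page, also asking MySQL to count the full matching set
SELECT SQL_CALC_FOUND_ROWS id, title, created_at
FROM article
WHERE status = 1
ORDER BY created_at DESC
LIMIT 0, 100;

-- total number of matching rows, used to compute the page count
SELECT FOUND_ROWS();

(Note that SQL_CALC_FOUND_ROWS is deprecated as of MySQL 8.0.17; a separate COUNT(*) query is recommended there instead.)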
This tends to be expensive, because WHERE condition may amount to a full table scan. If MySQL's query cache is not enabled, every page turn can be very slow.
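A quick way to check whether WHERE condition really turns into a full table scan is to look at the execution plan; a minimal sketch, again with invented names:

-- "type: ALL" in the output means a full table scan;
-- an index covering the filter/sort columns is what avoids it
EXPLAIN
SELECT id FROM article
WHERE title LIKE '%chess%'
ORDER BY created_at DESC;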
So I switched to a different approach:
1. Run SELECT id FROM table WHERE condition ORDER BY ... to get all the matching IDs; if the data set is too large (say the table has 1,000,000 rows), cap the number of rows (for example at 10,000, using LIMIT 10000). These IDs are then delivered to the front end through the dynamic page or via Ajax (as JS code or JSON);
2. The front-end JS takes the first 100 IDs as page one and sends a request carrying them; the back end really just executes SELECT columns FROM table WHERE id IN (id_0, id_1, ..., id_99);
3. When the user clicks "next page", the query becomes SELECT columns FROM table WHERE id IN (id_100, id_101, ..., id_199);
...
This way the conditional query (slow) runs only once; all the list queries are primary-key lookups (fast).
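Under the same invented schema as in the sketch above, the two kinds of queries in this approach would look roughly like this:

-- run once: the (slow) conditional query, capped at 10,000 IDs
SELECT id
FROM article
WHERE status = 1
ORDER BY created_at DESC
LIMIT 10000;

-- run per page: primary-key lookups for the 100 IDs the front end sends back
SELECT id, title, created_at
FROM article
WHERE id IN (9731, 9650, 9512 /* ..., the 100 IDs of the requested page */)
ORDER BY FIELD(id, 9731, 9650, 9512 /* ..., same list */);

The ORDER BY FIELD(...) is needed because IN (...) alone does not guarantee that rows come back in the order of the ID list.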
I used this approach in a hobby project, see: (http://) www.chess-wizard.com/base/ (the first page of data is rendered into the JSP page, which helps SEO).
I asked my teammates to handle pagination this way, but they don't buy it :-(
Is LIMIT m, n really the standard or the only way to do paged queries?
A
Use id > $offset ... LIMIT $limit instead; pass the parameters $offset and $limit = 100.
First page: $offset = 0
select id, name from table order by id limit $limit;
Second page: $offset is the last id returned by the first page
select id, name from table where id > $offset order by id limit $limit;
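If the original WHERE condition also has to be kept, a sketch of this keyset-style paging (assuming the list is ordered by id ascending, and reusing the placeholders from the question) would be:

-- first page
select id, name from table where condition order by id limit $limit;
-- next page: $offset is the largest id returned by the previous page
select id, name from table where condition and id > $offset order by id limit $limit;

The trade-off is that this only supports "previous/next" navigation; it cannot jump straight to an arbitrary page number.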
Paging is slow mainly because the first step is slow: finding the records that match the condition, sorting them, and picking out the rows of the current page. Your method does not improve that step.
There is another case: the first step can be answered from a single small table, but producing the detailed row data requires joining several other tables. If the database then picks a bad execution plan, it will be very slow. In that situation you can use @abul's method: first pull the matching record IDs out of the small table, then join them to the other tables, roughly as sketched below.
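A minimal sketch of that pattern, with invented names (article as the small filtered table, users as one of the detail tables):

-- take only the matching IDs from the small table, then join outward for the detail columns
SELECT a.id, a.title, u.name AS author
FROM (
    SELECT id
    FROM article
    WHERE status = 1
    ORDER BY created_at DESC
    LIMIT 0, 100
) AS page
JOIN article a ON a.id = page.id
JOIN users   u ON u.id = a.user_id
ORDER BY a.created_at DESC;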
Note that all of this happens inside the database; there is no need to ship the data to the front end, for several reasons:
1. If the matching result set is large, pulling every ID out of the database and sending it across the network is expensive, and the 10,000-row cap you mentioned may not fit every scenario.
2. Users often read only the first few pages, so fetching all the IDs up front is mostly wasted work.
3. If the data changes between the first query and the second, the result set the user gets will be inaccurate; whether that is acceptable depends on how accurate the results need to be.
Also, if the queried data is widely shared across users, consider putting it in a cache such as Redis to lower the overall load on the system; caching it only on the front end gives too low a reuse rate.
If you insist on an absolutely correct principle, it is really just a truism: judge and choose based on the needs of the business scenario and the pros and cons of each solution.
1. Why is MySQL's cache not enabled?
2. Does the front end fetch page content synchronously or asynchronously?
If synchronously, your method cannot serve the front end at all.
If asynchronously, your first query fetches all the eligible IDs (say the first 10,000). How does the front end get hold of those 10,000 IDs? If they are all embedded in the page, the front-end JS can use that array directly to fire asynchronous requests; but what if the user jumps to a page number beyond that range? The front end then has to request the IDs for that page, and your WHERE query has to run again.
So the benefit of this scheme is not that obvious.
I'm not sure my analysis is right; take it only as a reference.
Since your idea is to cache the IDs on the front end, why not cache the other fields on the front end as well?
For example, to fetch the first 10 pages at 50 rows per page, use LIMIT 500 in the SQL to grab the first 500 rows; turning pages within those 10 pages then needs no request at all. Only when the user moves past them do you run LIMIT 500, 500 to fetch the next 500 rows.
Could you try a combination of the two? In the LIMIT query take out only the id, then join back to get the actual columns (a sketch follows).
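A minimal sketch of that combination (invented names again), sometimes called a deferred join: the inner query pages over the id alone, ideally using only an index, and the outer join fetches the wide columns just for that one page.

SELECT a.id, a.title, a.content, a.created_at
FROM (
    SELECT id
    FROM article
    WHERE status = 1
    ORDER BY created_at DESC
    LIMIT 100000, 100  -- with an index on (status, created_at) the deep offset walks index entries only
) AS page
JOIN article a ON a.id = page.id
ORDER BY a.created_at DESC;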
This really depends on the project:
1. If the data volume is large and the only unique index is the ID, then the method you proposed will indeed be the fastest (as long as the number of IDs is not too large).
2. If other fields are indexed and those fields are always used as conditions, then an ordinary ORDER BY ... LIMIT m, n paged query will of course be fast.
3. Whether to use a cache also depends on the application scenario: if the query does not involve too many WHERE conditions and the data is not updated in real time, a cache works well.