java - mysql single table pipeline 500W data multi-dimensional statistics solution-PHP Chinese Network Q&A

Now there is a flow table structure in the project as follows
id sdkVersion jarVersion countryCode imei createTime

The previous requirement was to find the total number by grouping sdkVersion, jarVersion, countryCode and the total number after imei sorting. The approximate sql is as follows:

select sdkVersion,jarVersion,countryCode,count(*),count(distinct imei) from xxx where createTime = 'xxxx-xx-xx' group by sdkVersion,jarVersion,countryCode

Find out all the data of the previous day and summarize it into a table. The structure is roughly as follows

id sdkVersion jarVersion countryCode count(*) count(distinct imei) createTime

The current requirement is to query the combination of any latitude, that is,
group by sdkVersion
group by jarVersion
group by countryCode
group by sdkVersion, countryCode
and so on. Combination, if we follow the previous daily summary plan, we will have to create many tables for different latitude combinations. Is there any good solution to solve this problem? Or can it be solved using a specialized statistical framework?

reply all (4)

仅有的幸福2017-05-19 10:09:03 4 floor

You can check out the PipelineDB streaming database

Like+0

Add Reply

黄舟2017-05-19 10:09:03 3 floor

apache kylin, sub-second olap

曾经蜡笔没有小新2017-05-19 10:09:03 2 floor

For daily summary, real-time requirements are not high, and 500W records are still within the processing range. View + scheduled plan can meet the requirements, and there is no need to build multiple tables.
It is best for the questioner to explain what bottlenecks or pain points there are. After all, mysql is a mature product, and there are certain risks in switching to cutting-edge technology.