Caching reporting data in the same transaction database vs using a data warehouse
P粉511749537
P粉511749537 2024-02-26 19:20:23
0
1
403

We have a SaaS solution where each tenant has its own MySQL database. Now I'm designing a dashboard for this SaaS system and it requires some analytical charts. To obtain the data required for the chart, we can query each tenant's transaction data in real time from its database. and get updated graphs without bad performance since the data volume is not that big so far. However, since the data volume will continue to grow, we decided to separate the analytical data and transaction data for each company. We will fetch the analytical data of the charts in the background, save/cache them and update them regularly. my question is:

  • What are some good questions or factors we should consider before deciding whether we need to include data warehousing and data modeling from the start, or just cache the analytical data for API-generated charts in a JSON column in a new table? Chart in the tenant's MYSQL database.

P粉511749537
P粉511749537

reply all(1)
P粉759451255

Instead of going into millions of rows in a "fact" table, build and maintain a summary table and then get the data from it. It may run 10 times faster.

This does require a code change due to the extra tables, but it may be worth it.

Summary Table

In other words, if the data set becomes larger than X, a summary table is the best solution. Caching won't help. Hardware is not enough. JSON just gets in the way.

Constructing a year's worth of graphs based on a year's worth of data points (one per second) is slow and wasteful. It makes much more sense to construct a year's chart based on daily subtotals.

Latest Downloads
More>
Web Effects
Website Source Code
Website Materials
Front End Template