现在正在做 REST API 的设计,在设计过程中遇到了一些困惑,问题是这样的:
比如我有一个订单表
order_id | products | status |
---|---|---|
1 | 笔记本电脑 | canceled |
2 | 华为手机 | finished |
3 | 小米手环 | delivering |
获取所有订单的接口设计如下,其中有一个可选参数是 status
可选值为 canceled
、finished
、delivering
。
GET /order?status=canceled //已取消的订单
GET /order?status=finished//已结束的订单
GET /order?status=delivering //配送中的订单
这样设计API,可读性还是很好的。但对这里的 status 字段有一些疑问,该字段在数据库中,是直接存储为字符串?还是存储为数字?
由于需要保证API可读性好,所以如果用 数字来存储 status ,那么就需要管理字符串与数字之间的对应关系。
比如说:canceled => 0
、finished => 1
, delivering => 2
但是这样子,就需要在程序上,对status做一些转换,会无形增加一些程序的复杂性。
我的问题是:能不能直接把 status
字段直接存储为字符型?,就像上面的设计一样。如果数据量大会不会造成性能问题?或者其他一些问题? 谢谢!
It usually looks like this statusIt is usually a number arranged in order
For example
0 - Order generated
1 - Out of stock
2 - Shipped
3 - Delivering
4 - Signed for receipt
5 - Evaluated
6 - Canceled
This way you can check the unfinished order through status<4
The above is just an example. You should also If there are other statuses such as returns, payments, refunds, etc.
this is better
I have designed some databases, similar to this situation. The earliest ones used int to store, because the performance is high compared to varchar int, but the problem that comes with it is, so the business problems will There are often queries, but when writing SQL, I often forget the status corresponding to the corresponding number. It is very annoying to find it every time. In addition, I learned from a corner of the Internet that the performance of int and varchar in postgreSQL is actually There is actually no difference in the problem (it has not been verified, at least I have not verified it). After thinking about it, even if the maximum number of characters stored in the varchar state is limited to ten characters, there will not be a big problem, and it is similar to the state Such frequently queried fields will also create an index, which is basically enough.
I heard a senior say before that he has a library with a lot of data, and the query speed of int is faster than that of varchar.
Currently, the status field has been changed to int type. The disadvantage of varchar being easy to identify is that when writing SQL statements, more status conditions need to be added (where stat='open'). That's right for me. Lazy.. People are lazy, and it may bring about performance improvements. Why not? As mentioned before, if you can’t remember the state, you will lose something if you gain something.
And if a state like this is stored Int type, if converted, you can define an enumeration class as the corresponding status value in Java, so that the program judgment will be relatively simple.
Regarding the problem of large data, if your table data does not reach hundreds The difference between int varchar and tens of millions of data may not be very big. When the data reaches or exceeds this amount, the database may not be able to operate efficiently, and more efficient methods will be considered, such as nosql solr etc.
Conclusion:
In order to facilitate your understanding, use varchar
In order to be efficient (or professional design), use int
If you are sure to use digital storage, you can consider using tinyint, which is relatively more efficient. The numerical range of int is smaller than int, but it will not work if it is postgresql. The database does not support the tinyint type
Although int is not that easy to understand, it is much more efficient, and it is also helpful for comparison with what is said above. Readability can be solved using enumeration-like methods. If the string parameter passed by the front end can be converted into an int value using a hash table, the database query can be performed.
The correspondence between values and status must be standardized in documents. Just check the documents when writing code to facilitate collaborative development between different developers. There is no standardized document in this regard, and even if you use character type, not everyone can understand it.
One thing to mention, considering that the status may increase in the future, it is recommended not to take continuous values (i.e. "1 2 3 4 5"), something like "10 20 30 40 50" will allow you to insert when adding status in the future.
In fact, this can be handled with ORM
The query generally looks like this
In this case, the most appropriate method is to use an enumeration type
Since we use mysql, why no one mentions the enum type?
The enum appears as a string but the underlying storage uses numbers, and it also has a natural input check, which is the best of all three solutions
MySQL supports enum type, and the actual storage is tinyint. Both readability and efficiency.
Let me give you an example in your situation:
First look at the settings of mysql
The terminal only needs to access by string. ORM also has good support for enumerations, taking java+jpa as an example.
I also use the method on the first floor, but I will define the status of the last canceled or invalidated order as a negative number. It will be more convenient to query the valid orders. As for coding, you can use enumerations to make the program easier to read.
I think it is better to define an enumeration and store the string type directly in the database
By doing this, your restful API can be designed like this:
It is recommended to store strings directly, which is no different from numbers. For applications, they are all enumerations, but they are more readable and do not need to maintain any mapping relationships.