I don't have much experience in table design. My goal is to create one or more product tables that meet the following requirements:
Supports multiple kinds of products (TV, mobile phone, PC...). Each kind of product has a different set of parameters, for example:
A phone will have color, size, weight, operating system...
A PC will have CPU, HDD, RAM...
The parameter set must be dynamic. You can add or edit any parameters you like.
How can I meet these requirements without a separate table for each kind of product?
@StoneHeart
I would go with EAV and MVC all the way here.
@Bill Kavan
All these things you mentioned here:
In my opinion, they don't belong in a database at all, because no database can handle those interactions and requirements at the level an application's programming language can.
In my opinion, using a database this way is like hammering a nail with a rock. You can do it with a rock, but shouldn't you use a hammer, which is more precise and designed specifically for the job?
This problem can be solved by running a few queries on part of the data and letting the application process the results into a table layout. Even if you have 600GB of product data, you can process it in batches if you need data from every row in the table.
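As a rough illustration of what "processing it into a table layout in the application" might look like, here is a minimal Python sketch using the standard sqlite3 module. The table product_attribute, its columns, and the helper pivot_in_batches are hypothetical names made up for this example; the post itself does not specify a schema.

```python
import sqlite3
from itertools import groupby

def pivot_in_batches(conn, batch_size=1000):
    """Yield one dict per product, reading attribute rows batch by batch.

    Assumes a hypothetical table product_attribute(product_id, name, value).
    """
    last_id = 0
    while True:
        # Pick the next batch of product ids (keyset pagination).
        ids = [r[0] for r in conn.execute(
            "SELECT DISTINCT product_id FROM product_attribute "
            "WHERE product_id > ? ORDER BY product_id LIMIT ?",
            (last_id, batch_size))]
        if not ids:
            return
        rows = conn.execute(
            "SELECT product_id, name, value FROM product_attribute "
            "WHERE product_id BETWEEN ? AND ? ORDER BY product_id",
            (ids[0], ids[-1]))
        # Rows arrive sorted by product_id, so groupby sees each product once.
        for product_id, group in groupby(rows, key=lambda r: r[0]):
            record = {"product_id": product_id}
            record.update({name: value for _, name, value in group})
            yield record
        last_id = ids[-1]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product_attribute (product_id INTEGER, name TEXT, value TEXT)")
conn.executemany("INSERT INTO product_attribute VALUES (?, ?, ?)", [
    (1, "color", "black"), (1, "os", "Android"),
    (2, "cpu", "i7"), (2, "ram", "16GB"),
])
products = list(pivot_in_batches(conn, batch_size=1))
```

Each yielded dict is one row of the table layout, and only one batch of attribute rows is being read at any time.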
Further, if you want to improve query performance, you can pick certain operations, such as reporting or global text search, and prepare an index table for each one that stores the required data and is regenerated periodically, say every 30 minutes.
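One possible shape for such an index table, sketched in Python with sqlite3. The tables product, product_attribute and product_search, and the function rebuild_search_index, are assumptions for illustration only; how the rebuild is scheduled is left out.

```python
import sqlite3

def rebuild_search_index(conn):
    """Regenerate a denormalized table used only for text search.

    product, product_attribute and product_search are hypothetical names.
    """
    with conn:  # one transaction, so the delete and refill commit together
        conn.execute("DELETE FROM product_search")
        conn.execute(
            "INSERT INTO product_search (product_id, search_text) "
            "SELECT p.id, "
            "       p.name || ' ' || COALESCE(GROUP_CONCAT(a.value, ' '), '') "
            "FROM product p "
            "LEFT JOIN product_attribute a ON a.product_id = p.id "
            "GROUP BY p.id, p.name")

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE product_attribute (product_id INTEGER, name TEXT, value TEXT);
    CREATE TABLE product_search (product_id INTEGER PRIMARY KEY, search_text TEXT);
    INSERT INTO product VALUES (1, 'Phone X'), (2, 'Desktop Y');
    INSERT INTO product_attribute VALUES (1, 'color', 'black'), (2, 'cpu', 'i7');
""")

# In a real system this call would be triggered by a scheduler, e.g. every 30 minutes.
rebuild_search_index(conn)

hits = conn.execute(
    "SELECT product_id FROM product_search WHERE search_text LIKE ?",
    ("%i7%",)).fetchall()
```

The search query touches only product_search, so its cost no longer depends on how the attributes are modeled. The trade-off is that results can be up to one rebuild interval out of date.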
You don't even need to worry about the cost of the additional storage, because it gets cheaper every day.
If you are still concerned about the performance of the operations performed by your application, you can always use a language such as Erlang, C, or Go to preprocess the data and then process the optimized data further in the main application.
You have at least the following five options for modeling the type hierarchy you describe:
Single table inheritance: One table for all product types, with enough columns to store all attributes of all types. This means many columns, most of which are NULL on any given row.
Class table inheritance: A product table that stores attribute types common to all products. Then one table per product type, storing attributes specific to that product type.
Concrete table inheritance: No table for common product attributes. Instead, one table per product type stores both the common product attributes and the product-specific attributes.
Serialized LOB: A product table that stores attributes common to all product types. An additional column stores a BLOB of semi-structured data in XML, YAML, JSON, or some other format. This BLOB allows you to store attributes specific to each product type. You can use fancy design patterns to describe this, such as Facade and Memento. But whatever you call it, you have a blob of attributes that can't easily be queried in SQL; you have to fetch the whole blob back to the application and sort it out there.
Entity-Attribute-Value: One table for products, and one table that pivots attributes to rows instead of columns. EAV is not a valid design with respect to the relational paradigm, but many people use it anyway. This is the "property pattern" mentioned in another answer. See other questions on StackOverflow tagged with eav for some of the pitfalls.
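To make the Serialized LOB trade-off concrete, here is a small Python sketch with sqlite3 and the json module. The column name extra, the sample attributes, and the helper products_with_min_ram are invented for this example. It deliberately filters in the application, which is the limitation described above.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (
        id    INTEGER PRIMARY KEY,
        name  TEXT NOT NULL,      -- common attributes stay ordinary columns
        price REAL NOT NULL,
        extra TEXT NOT NULL       -- type-specific attributes, serialized as JSON
    );
""")
conn.executemany("INSERT INTO product VALUES (?, ?, ?, ?)", [
    (1, "Phone X", 499.0, json.dumps({"color": "black", "os": "Android"})),
    (2, "Desktop Y", 899.0, json.dumps({"cpu": "i7", "ram_gb": 16})),
    (3, "Desktop Z", 1299.0, json.dumps({"cpu": "i9", "ram_gb": 32})),
])

def products_with_min_ram(conn, min_gb):
    """Filter on a type-specific attribute; the work happens in Python, not SQL."""
    matches = []
    for name, extra in conn.execute("SELECT name, extra FROM product ORDER BY id"):
        attrs = json.loads(extra)          # every blob is fetched and decoded
        if attrs.get("ram_gb", 0) >= min_gb:
            matches.append(name)
    return matches

big_memory = products_with_min_ram(conn, 32)
```

Adding a new attribute needs no schema change, but the database cannot index, type-check, or constrain anything stored inside extra in this form.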
I wrote more about this in the presentation Scalable Data Modeling.
Other thoughts on EAV: While a lot of people seem to like EAV, I don't. It seems like the most flexible solution, and therefore the best one. But keep in mind the motto TANSTAAFL. Here are some of the disadvantages of EAV:
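No way to make an attribute mandatory (the equivalent of NOT NULL).
Fetching results in a conventional tabular layout is complex and expensive, because you need a JOIN on each attribute.
EAV's degree of flexibility requires you to make sacrifices in other areas, which may make your code as complex as (or worse than) solving the original problem in a more traditional way.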
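A minimal EAV sketch in Python with sqlite3 shows the JOIN cost and the lack of mandatory attributes. The table and attribute names are hypothetical and chosen only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE product_attribute (
        product_id INTEGER NOT NULL REFERENCES product(id),
        name  TEXT NOT NULL,
        value TEXT,               -- every value is text; no per-attribute type
        PRIMARY KEY (product_id, name)
    );
    INSERT INTO product VALUES (1, 'Desktop Y'), (2, 'Desktop Z');
    INSERT INTO product_attribute VALUES
        (1, 'cpu', 'i7'), (1, 'ram', '16GB'),
        (2, 'cpu', 'i9');         -- nothing forces product 2 to have 'ram'
""")

# One LEFT JOIN per attribute just to get a conventional row per product.
rows = conn.execute("""
    SELECT p.name, cpu.value, ram.value
    FROM product p
    LEFT JOIN product_attribute cpu
           ON cpu.product_id = p.id AND cpu.name = 'cpu'
    LEFT JOIN product_attribute ram
           ON ram.product_id = p.id AND ram.name = 'ram'
    ORDER BY p.id
""").fetchall()
```

With two attributes this is still readable; with twenty, the query needs twenty joins, and the missing ram value for the second product only shows up as a NULL at read time.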
And in most cases, this level of flexibility is not necessary. In the OP's question about product types, it would be much simpler to create one table per product type for the product-specific attributes, so that you at least enforce some consistent structure for entries of the same product type.
I would use EAV only if every row must be allowed to have a potentially different set of attributes. EAV is overkill when you have a finite set of product types. Class table inheritance would be my first choice.
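For comparison, a class table inheritance layout for the phone and PC examples from the question could be sketched like this in Python with sqlite3. The column choices are assumptions; the point is only that each product type gets real, typed, constrained columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE product (            -- attributes common to all products
        id    INTEGER PRIMARY KEY,
        name  TEXT NOT NULL,
        price REAL NOT NULL
    );
    CREATE TABLE phone (              -- attributes specific to phones
        product_id INTEGER PRIMARY KEY REFERENCES product(id),
        color TEXT NOT NULL,
        os    TEXT NOT NULL
    );
    CREATE TABLE pc (                 -- attributes specific to PCs
        product_id INTEGER PRIMARY KEY REFERENCES product(id),
        cpu    TEXT NOT NULL,
        ram_gb INTEGER NOT NULL
    );
    INSERT INTO product VALUES (1, 'Phone X', 499.0), (2, 'Desktop Y', 899.0);
    INSERT INTO phone VALUES (1, 'black', 'Android');
    INSERT INTO pc VALUES (2, 'i7', 16);
""")

# A single join returns a PC with all of its attributes as ordinary columns.
pcs = conn.execute(
    "SELECT p.name, pc.cpu, pc.ram_gb FROM product p "
    "JOIN pc ON pc.product_id = p.id").fetchall()

# Unlike EAV, a missing mandatory attribute is rejected by the database itself.
try:
    conn.execute("INSERT INTO product VALUES (3, 'Desktop Z', 1299.0)")
    conn.execute("INSERT INTO pc (product_id, cpu) VALUES (3, 'i9')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

The cost is that adding a new product type means adding a new table, which is acceptable when the set of product types is limited.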
2019 Update: The more I see people using JSON as a solution to the "many custom properties" problem, the less I like that solution. It makes queries too complex, even when using special JSON functions to support them. Storing JSON documents also takes a lot more storage space than storing the same data in normal rows and columns.
Basically, none of these solutions are simple or efficient in relational databases. The whole idea of having "variable attributes" is fundamentally at odds with relational theory.
At the end of the day, you have to choose the solution that has the least impact on your application. Therefore, before choosing a database design, you need to know how you are going to query the data. There is no way to choose one "best" solution, because any of these solutions may be the best fit for a given application.