I'm thinking about how to represent complex structures in a SQL Server database.
Consider an application that needs to store details about a series of objects that share some properties but have many other uncommon properties. For example, a business insurance package might include liability, motor, property, and indemnity coverage in the same policy record.
Achieving this in, for example, C# is simple, as you can create policies that contain collections of parts, with parts inherited as needed for various types of overrides. However, relational databases don't seem to allow this.
I can see there are two main options:
Create a strategy table and then a partial table with all the fields required for all possible variations, most of which are empty.
Create a policy table and multiple partial tables, each table corresponding to a type of insurance.
Both alternatives seem unsatisfactory, especially since the query would need to be written across all parts, which would involve either a lot of joins or a lot of null checks.
What are the best practices for this scenario?
The third option is to create a "Policy" table and then a "SectionsMain" table to store all the fields that are common across different types of sections. Then create additional tables for each type of section, containing only the uncommon fields.
Deciding which one is best depends mainly on how many fields you have and how you want to write your SQL. They will all work. If you only have a few fields, then I'd probably go with #1. For "a lot" of areas, I'd lean towards #2 or #3.
@Bill KarwinIn hisSQL Antipatternsbook, he proposed the SQLEntity Attribute Valueanti-pattern. Here's a brief overview:
Single table inheritance (aka per-hierarchy table inheritance):
Using a single table like the first option is probably the simplest design. As you mentioned, many subtype-specific properties must be given NULL values on rows where these properties do not apply. Using this model, you would have a policy table that looks like this:
Keeping the design simple is an advantage, but the main problems with this approach are as follows:
As you add new subtypes, you must change the table to accommodate the properties that describe these new objects. This can quickly become a problem when you have many subtypes or when you plan to add subtypes regularly.
The database will not be able to enforce which properties apply and which do not, because there is no metadata to define which properties belong to which subtypes.
You also cannot enforce
NOT NULL
on a subtype attribute that should be enforced. You have to handle this in the application, which is usually not ideal.Specific table inheritance:
Another way to solve the inheritance problem is to create a new table for each subtype, repeating all the common properties in each table. For example:
This design will basically solve the problems identified by the single-table approach:
Mandatory attributes can now be enforced via
NOT NULL
.Adding new subtypes requires adding a new table, not adding columns to an existing table.
There is also no risk of setting inappropriate attributes for specific subtypes, such as the
vehicle_reg_no
field of an attribute policy.No need for the
type
attribute like in the single table method. The type is now defined by metadata: table name.But this model also has some shortcomings:
Public properties are mixed with subtype-specific properties, and there is no easy way to identify them. The database doesn't know either.
When defining a table, you must repeat the common properties for each subtype table. This is definitely not干.
Searching for all strategies regardless of subtype becomes difficult and requires a bunch of
UNION
.Regardless of type, you must query all policies via:
Please note that adding new subtypes will require modifying the above query with additional
UNION ALL
for each subtype. If you forget to do this, you can easily cause errors in your application.Class table inheritance (aka table inheritance per type):
This is the solution mentioned by @David in another answer . You create a table for the base class that includes all public properties. You would then create specific tables for each subtype, whose primary keys also serve as base tables. Example:
This solution solves the problems found in the other two designs:
Mandatory attributes can be enforced by
NOT NULL
.Adding new subtypes requires adding a new table, not adding columns to an existing table.
There is no risk of setting inappropriate properties for specific subtypes.
No
type
attribute is required.Now public properties are no longer mixed with subtype specific properties.
We can finally stay dry. Table creation does not require duplication of common properties for each subtype table.
Managing auto-increment policies for
id
becomes easier as this can be handled by the base table rather than each subtype table generating them independently.Searching all strategies (regardless of subtype) is now very easy: no
UNION
required - justSELECT * FROM strategy
.I think the class table method is the most appropriate in most cases.
The names of these three models come fromMartin Fowlera bookEnterprise Application Architecture Patterns.