Storing multivalued issues is a common problem when designing a database, and seeing a lot of developers eat a deficit on it, I think it's necessary to come up with it.
Business Scenario: A business document with multiple contacts. One equipment maintenance work, there are several maintenance teams. Let's give an example.
Create Table BILL
(
bill_id Number Primary Key ,
Bill_name varchar2 (),
bill_content varchar2 (+),
contact_id Number -- to the user_id of the user watch
);
1. In the initial design, there was only one contact, and then there was a change in demand, and there were multiple contacts. There are several scenarios:
Scenario One: In addition to several fields, contact_id1,contact_id2,contact_id3 ....
Scenario Two: contact_id the number type is changed to VARCHAR2, multiple values are stored together, and values are separated from each other by a delimiter (such as a comma).
Scenario Three: Add a table Bill_contact
Create Table bill_contact
(
bill_id Number ,
contact_id Number
);
Alter Table bill_contact
addconstraint pk_bill_contactprimarykey (bill_id, CONTACT_ID);
2. Compare several scenarios
Scenario one is obviously inappropriate, do not know how to build a few fields appropriate, even if you know a few contacts, the query is also very troublesome. Records with contacts 100 and 101 are included in the enquiry form.
Select * from bill_contact
where (contact_id = contact_id1 =101)
or (contact_id =101 and contact_id1 =100 );
The query contains a record of contact 100,
Select * from bill_contact
where (contact_id =or contact_id1 =101 or ... .. . );
The advantage of scenario two is that it is convenient for developers to change a small amount of code, which is commonly adopted by developers. A. However, the analysis and statistical functions are very difficult to do, such as the need to list all the documents for a certain contact in a certain period of time, and the number of contact persons for each document.
B. Queries can also become ineffective, with type inconsistencies leading to implicit conversions and index invalidation.
C. The changes are complex and require additional logic processing in the code.
D. Some system primary key with 32-bit UUID, if the contact and 10 bits, then this field length is 500, a bit scary.
Select * from bill_contact
where contact_idlike' 100,% '
or contact_id like'%,100 '
or contact_id like'%,100,% ';
I wrote about this problem earlier. Optimized scheme, multi- value processing of single field in database design
Scenario three happens to make up for a lot of the plan two, developers always worry about the performance of the table association is too poor, is actually redundant, because at this time can go to the index. Another benefit is the ability to expand the contact's information, such as the first contact, or the second contact, which is not possible in scenario two. The transformation is larger for the developer than the scenario two.
Select * from bill_contact A, bill_contact b
where a.bill_id = b.bill_id
and b.contact_id inch (101);
3. How to choose a multi-valued problem?
The plan must not be chosen.
Scenario two is suitable for multi-valued columns without analysis statistics, no queries.
Plan three is the ideal plan in my heart, although it may cause some workload.
The problem of storing multi-value in database design