Recently, I encountered a problem caused by a large amount of data. Currently, the data volume is about 8 Mb. In the future, nearly of data will be added every day. Therefore, the partition table feature of MSSQLServer is considered.
The original table design is equivalent
Table Name: userevent
ID (bigint), userid (nvarchar (250), eventid (nvarchar (250), column1, column2, column3.
This table is equivalent to an index table for all the services of the system. It records all the business behaviors of a user in this system.
Application scenarios:
Each time a user has a business in the system, a data record is recorded in the table.
Query page: You can use userid or eventid, but you must query one of the two data.
MSSQLServer only supports the Partition Table policy for one column. However, in the current application scenario, you must have two columns of information (userid and eventid,
Partition columns must also contain the information of these two columns. Therefore, consider adding one column:
ID (bigint), sequenceid (INT), userid (nvarchar (250), eventid (nvarchar (250), column1, column2, column3
However, the problem arises. Users may only consider entering userid or eventid. In this case, the amount of information to locate the partition table is insufficient and the characteristics of the partition table cannot be effectively used.
There is always no answer to this question. Consider encoding sequenceid.
The tentative sequenceid is 0 ~ 99 digits, 0 to 10 digits ~ 9 indicates an encoding of eventid, with 0 ~ 9 indicates an encoding of userid. This encoding can be obtained through hashcode,
And then obtain the bit. (For this encoding, it depends on the Application Scenario, otherwise it will not achieve good hash effect)
In this case, if you only enter userid (the resulting code is 6), we can add the conditions (sequenceid in (6, 16, 26, 36, 46, 56, 66, 86, 96 ))
Do you have any suggestions.