Two extreme implementations of multi-tenancy
Multi-tenant data layer design pattern
- From the earlier analysis, we know that a traditional application serves a single tenant and its database is deployed inside the enterprise intranet. For the data owner, the data is "private" and meets all of the owner's own security standards. In the cloud era, the application itself moves to the cloud and the data layer is often exposed as well, yet tenants' requirements for data security have not relaxed. At the same time, as the number of tenants grows, a multi-tenant application faces more performance pressure than a single-tenant one. So how should a multi-tenant framework trade off sharing, security, and performance in the data layer?
Three common modes
- Standalone database
- Shared database, standalone schema
- Shared database, shared schema, shared data tables
A standalone database gives each tenant its own database instance. It provides the strongest isolation: tenants' data is physically invisible to one another, and backup and recovery are flexible. The shared database, standalone schema mode associates each tenant with a different schema in the same database, so tenants' data is logically invisible to one another. The upper-level application is as simple to implement as with a standalone database, but backup and recovery are slightly more complex. In the last mode, tenant data is shared at the data table level. This gives the lowest cost but introduces extra programming complexity (data access needs a tenantId to tell tenants apart), and backup and recovery are also more complex. The features of the three modes can be summed up in a single figure:
Figure 3. Similarities and differences of three deployment models
The figure gives only a general conclusion; a real scenario needs comprehensive consideration before deciding which mode is appropriate. For example, in terms of cost of ownership, the standalone database is rated high and the shared modes low. However, once the potential data growth of large tenants is taken into account, the cost conclusion may be reversed.
- Multi-tenancy is chosen mainly for cost reasons: in most scenarios, the higher the degree of sharing, the more efficiently hardware and software resources are used and the lower the cost. At the same time, the security, performance, and scalability problems of sharing and isolating tenant resources must be solved. After all, some customers are not comfortable keeping their data in resources shared with other tenants.
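To make the three modes described above more concrete, here is a minimal Java sketch of how data access might differ in each. It is an illustration under assumptions, not any vendor's implementation: the host names, the `guest` table, the `tenant_id` column, and the class names `IsolationMode` and `TenantAccessSketch` are all hypothetical.

```java
/** Hypothetical names for the three isolation modes discussed in the text. */
enum IsolationMode { SEPARATE_DATABASE, SEPARATE_SCHEMA, SHARED_TABLE }

final class TenantAccessSketch {

    // Assumed naming scheme: one database host per tenant in the standalone mode.
    static String jdbcUrl(IsolationMode mode, String tenantId) {
        if (mode == IsolationMode.SEPARATE_DATABASE) {
            return "jdbc:mysql://db-" + tenantId + ".example.com:3306/app";
        }
        // Both shared modes connect to the same database server.
        return "jdbc:mysql://db-shared.example.com:3306/app";
    }

    // tenantId is assumed to come from a trusted, validated source,
    // because the schema mode concatenates it into the SQL text.
    static String selectGuests(IsolationMode mode, String tenantId) {
        switch (mode) {
            case SEPARATE_DATABASE:
                // Isolation comes from the connection; the SQL is tenant-agnostic.
                return "SELECT id, name FROM guest";
            case SEPARATE_SCHEMA:
                // Isolation comes from the schema qualifier.
                return "SELECT id, name FROM " + tenantId + ".guest";
            case SHARED_TABLE:
                // Isolation comes from a discriminator column bound as a parameter.
                return "SELECT id, name FROM guest WHERE tenant_id = ?";
            default:
                throw new IllegalArgumentException("Unknown mode: " + mode);
        }
    }
}
```

Only the shared-table mode pushes the tenant into every query, which is the extra programming complexity mentioned above.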
At present, database vendors' multi-tenancy support generally follows the modes above, or mixes several of these strategies. For a detailed introduction, see the following blog post:
Discussion on multi-tenancy of data layer
That article gives a comprehensive introduction to how Hibernate and EclipseLink implement multi-tenancy. Source code is available for download at the end of the article and runs normally after some simple configuration. It is a great article for learning about multi-tenancy.
Learning experience
- The article needs little further explanation, but I still have a few thoughts to share.
Hibernate
- First, Hibernate. The article does not give an implementation of the standalone database mode; it only says that this mode can be implemented by implementing the MultiTenantConnectionProvider interface or by extending the AbstractMultiTenantConnectionProvider class, among other ways. A reader who analyzes the code carefully will find that the shared database, standalone schema mode also requires implementing the MultiTenantConnectionProvider interface. The code is as follows:
import java.sql.Connection;
import java.sql.SQLException;
import org.hibernate.HibernateException;
// The remaining Hibernate imports are omitted; their packages vary by Hibernate version.

public class SchemaBasedMultiTenantConnectionProvider
        implements MultiTenantConnectionProvider, Stoppable, Configurable, ServiceRegistryAwareService {

    private final DriverManagerConnectionProviderImpl connectionProvider =
            new DriverManagerConnectionProviderImpl();

    // Get a database connection
    @Override
    public Connection getAnyConnection() throws SQLException {
        return connectionProvider.getConnection();
    }

    // Close the database connection
    @Override
    public void releaseAnyConnection(Connection connection) throws SQLException {
        connectionProvider.closeConnection(connection);
    }

    // Switch the connection to the schema that belongs to the given tenant
    @Override
    public Connection getConnection(String tenantIdentifier) throws SQLException {
        final Connection connection = getAnyConnection();
        try {
            connection.createStatement().execute("USE " + tenantIdentifier); // focus 1
        } catch (SQLException e) {
            throw new HibernateException(
                    "Could not alter the JDBC connection to specified schema [" + tenantIdentifier + "]", e);
        }
        return connection; // focus 2
    }

    // Switch back to the default schema before closing the connection
    @Override
    public void releaseConnection(String tenantIdentifier, Connection connection) throws SQLException {
        try {
            connection.createStatement().execute("USE main");
        } catch (SQLException e) {
            throw new HibernateException(
                    "Could not alter the JDBC connection to specified schema [" + tenantIdentifier + "]", e);
        }
        connectionProvider.closeConnection(connection);
    }
    ......
}
- At focus 1, the current connection executes "USE <schema name>", so the operations we later perform on this connection run against the new schema.
- Now look at focus 2. The return value is a connection. For the shared database, standalone schema mode, only one connection is needed. To implement standalone database isolation, you only need to change the implementation of the getConnection method so that it produces a different connection depending on the value of tenantIdentifier.
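The standalone database variant described above could be sketched as follows. This is a minimal illustration using only the JDK, not code from the referenced article: the class `TenantDatabaseRouter`, its method names, and the JDBC URLs are hypothetical. A getConnection(tenantIdentifier) implementation might delegate to such a helper instead of issuing "USE".

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper: maps each tenant to its own database. */
final class TenantDatabaseRouter {

    // Tenant identifier -> JDBC URL of that tenant's database (illustrative values only).
    private final Map<String, String> urlsByTenant = new ConcurrentHashMap<>();

    void register(String tenantIdentifier, String jdbcUrl) {
        urlsByTenant.put(Objects.requireNonNull(tenantIdentifier), Objects.requireNonNull(jdbcUrl));
    }

    // Fails fast for tenants that were never registered.
    String urlFor(String tenantIdentifier) {
        String url = tenantIdentifier == null ? null : urlsByTenant.get(tenantIdentifier);
        if (url == null) {
            throw new IllegalArgumentException("Unknown tenant: " + tenantIdentifier);
        }
        return url;
    }

    // Opens a connection to the tenant's own database rather than switching schemas.
    Connection getConnection(String tenantIdentifier, String user, String password)
            throws SQLException {
        return DriverManager.getConnection(urlFor(tenantIdentifier), user, password);
    }
}
```

In practice one would more likely keep a pooled DataSource per tenant than open connections through DriverManager each time; the lookup by tenantIdentifier stays the same.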
Hibernate cache under multi-tenancy
- We should also note the article's description of the Hibernate cache under multi-tenancy. To reinforce the point, I have pasted it here as well.
- A multi-tenant implementation based on the standalone schema mode needs no extra tenant_id column in its data tables. Because the required JDBC connection is obtained through the ConnectionProvider, the first-level cache (session-level cache) is safe: it caches data at the transaction level, and once the transaction ends the cache is invalidated. The second-level cache, however, is unsafe in this mode, because primary keys in different schemas of the same database may have the same value, so Hibernate cannot use the second-level cache to hold objects correctly. For example, the guest table of Hotel_1 has a row with ID 1, and the guest table of Hotel_2 also has a row with ID 1. The hashCode() method of a class is usually overridden based on the ID, so with a second-level cache the guest of Hotel_1 cannot be distinguished from the guest of Hotel_2.
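The collision described above can be reproduced in plain Java. The sketch below is only an illustration with an ordinary HashMap, not Hibernate's actual cache key implementation; `TenantCacheKey` and `CacheCollisionDemo` are hypothetical names. It shows that a cache keyed by ID alone keeps one entry for two tenants, while a key that also carries the tenant keeps both.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical composite key: the same primary key in two tenants gives two distinct keys. */
final class TenantCacheKey {
    private final String tenantId;
    private final long entityId;

    TenantCacheKey(String tenantId, long entityId) {
        this.tenantId = Objects.requireNonNull(tenantId);
        this.entityId = entityId;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof TenantCacheKey)) return false;
        TenantCacheKey other = (TenantCacheKey) o;
        return entityId == other.entityId && tenantId.equals(other.tenantId);
    }

    @Override
    public int hashCode() {
        return Objects.hash(tenantId, entityId);
    }
}

final class CacheCollisionDemo {

    // Keyed by ID only: the Hotel_2 guest overwrites the Hotel_1 guest.
    static int sizeWithIdOnlyKeys() {
        Map<Long, String> cache = new HashMap<>();
        cache.put(1L, "guest of Hotel_1");
        cache.put(1L, "guest of Hotel_2");
        return cache.size();
    }

    // Keyed by tenant and ID: both guests are kept apart.
    static int sizeWithTenantAwareKeys() {
        Map<TenantCacheKey, String> cache = new HashMap<>();
        cache.put(new TenantCacheKey("Hotel_1", 1L), "guest of Hotel_1");
        cache.put(new TenantCacheKey("Hotel_2", 1L), "guest of Hotel_2");
        return cache.size();
    }
}
```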
- In the shared data table mode, both Hibernate's first-level cache and its second-level cache can be used, because the primary key is unique within the shared table, each record belongs to its corresponding tenant, and so the objects in the second-level cache are unique. Hibernate provides built-in CacheProvider implementations for cache plug-ins such as EhCache, OSCache, SwarmCache, and JBossCache. Readers can choose a suitable cache for their needs, then modify the Hibernate configuration file to set it up and enable it, improving the performance of multi-tenant applications.
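As a rough idea of the settings involved in enabling the second-level cache, the sketch below collects them in a java.util.Properties object. It assumes Hibernate 4.x or 5.x with the hibernate-ehcache module on the classpath, where the cache is selected through a region factory; older releases used a cache provider setting instead, so the exact keys depend on the version in use. The class name `SecondLevelCacheSettings` is hypothetical.

```java
import java.util.Properties;

/** Hypothetical holder for second-level cache settings (assumes hibernate-ehcache, Hibernate 4.x/5.x). */
final class SecondLevelCacheSettings {

    static Properties ehcacheSettings() {
        Properties settings = new Properties();
        // Turn the second-level cache on.
        settings.setProperty("hibernate.cache.use_second_level_cache", "true");
        // Select EhCache as the cache implementation (version-dependent assumption).
        settings.setProperty("hibernate.cache.region.factory_class",
                "org.hibernate.cache.ehcache.EhCacheRegionFactory");
        return settings;
    }
}
```

The same key/value pairs can equally be written into the Hibernate configuration file mentioned above.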
EclipseLink
- EclipseLink fully implements the JPA standard, which also gives assurance for our enterprise-class applications. The next article will explain it in detail.
Copyright notice: This is the blogger's original article and may not be reproduced without the blogger's permission.
Follow me to the "cloud" side (ii) data isolation for multi-tenancy