Keywords: distributed; data; database
With the rapid development of database, computer network, and digital communication technology, distributed database systems, which are characterized by distributed data storage and distributed processing, have attracted increasing attention. However, such systems are complicated to build, and this has limited their adoption to some extent. This paper therefore proposes developing distributed database systems in the .NET environment using the C# language combined with the ADO.NET data access model, which greatly simplifies the development process.
1 Distributed Database System
In essence, the data in a distributed database system is logically unified but physically distributed. Compared with a centralized database, it has the following main advantages:
· Fit to the organization. It suits organizations whose units are dispersed but whose data must be interconnected.
· Load balancing. The load is shared among the processors, so critical bottlenecks can be avoided.
· High reliability. The data is distributed across different sites with multiple copies, so the failure of an individual site does not paralyze the entire system.
· Good extensibility. When new, relatively autonomous organizational units need to be added, the system can be extended with minimal impact on the existing organization.
Although distributed database systems have many advantages, they also bring new problems, such as maintaining data consistency, implementing remote data transmission, and reducing communication cost. These problems make developing a distributed database system more complicated. Fortunately, Microsoft's .NET development environment provides the C# language and the ADO.NET data access model, and combining the two greatly simplifies the development work.
2 .NET Remoting Framework and ADO.NET
Two important problems must be solved when developing a distributed database system: data communication between sites, and operation and management of the database. C# combined with ADO.NET can solve both efficiently and reliably. Specifically, the .NET Remoting framework handles the remote delivery of data and commands, and C# operates on the database through ADO.NET, which makes database operations in a distributed database system more efficient and reliable and makes the data consistency problem easier to solve.
2.1 .NET Remoting Framework
There are three ways to implement remote delivery of data and commands. The first is message passing: the data to be transferred is converted to a stream format and sent to the remote host as messages through socket programming. This method is cumbersome and hard to implement. The second is to use Web services, with each remote host exposing a database query service as a Web service. This method can query only a single site and cannot perform a federated query across multiple sites. The third is the .NET Remoting framework, which hides the technical details of remote invocation. With simple configuration, the service program can turn a local object into a remote object that provides a remote service, and the client can access the remote object as transparently as a local object. All message and packet handling is left to .NET Remoting, which greatly simplifies development. The general remoting process is shown in Figure 1:
Figure 1 The remoting process
First, an instance of the server class is created on the server side, and the remoting system creates a proxy object that represents the class and returns a reference to the proxy to the client object. When the client invokes a method, the remoting infrastructure handles the call, checks the type information, and sends the call over the channel to the server process. A listening channel picks up the request and forwards it to the server remoting system, which locates the requested object (or creates it if necessary) and invokes it. The process is then reversed: the server remoting system bundles the response into a message, which the server channel sends to the client channel. Finally, the client remoting system returns the result of the call to the client object through the proxy.
2.2 ADO.NET
ADO.NET, with XML at its core, is the data access solution for .NET database applications. It uses a disconnected data architecture: data from the data source is cached in a DataSet object, so the user does not have to hold locks on the data source, and the data is saved in XML format.
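The disconnected, XML-based nature of the DataSet can be illustrated with a minimal sketch. The table and column names (`product`, `name`, `price`) and the class name `DataSetXmlSketch` are assumptions made for illustration only; no database is involved.

```csharp
using System;
using System.Data;
using System.IO;

public static class DataSetXmlSketch
{
    // Builds a small in-memory DataSet; no connection to a data source is held.
    public static DataSet BuildSample()
    {
        var ds = new DataSet("store");
        DataTable dt = ds.Tables.Add("product");   // hypothetical table name
        dt.Columns.Add("name", typeof(string));
        dt.Columns.Add("price", typeof(decimal));
        dt.Rows.Add("air conditioner", 2999m);
        ds.AcceptChanges();
        return ds;
    }

    // Writes the DataSet (with its schema) to XML and reads it back into a new DataSet.
    public static DataSet RoundTrip(DataSet source)
    {
        var writer = new StringWriter();
        source.WriteXml(writer, XmlWriteMode.WriteSchema);
        var copy = new DataSet();
        copy.ReadXml(new StringReader(writer.ToString()), XmlReadMode.ReadSchema);
        return copy;
    }

    public static void Main()
    {
        DataSet ds = BuildSample();
        Console.WriteLine(ds.GetXml());            // XML view of the cached data
        DataSet copy = RoundTrip(ds);
        Console.WriteLine(copy.Tables["product"].Rows.Count);
    }
}
```

Because the cached data can be written to and restored from XML, a DataSet can be handed from one site to another without either side keeping a database connection open.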
2.2.1 Managing Data Consistency with ADO.NET
In a distributed database system, multiple users are very likely to access and modify the same data at the same time, so controlling data consistency is essential. ADO.NET controls consistency with an optimistic concurrency scheme (in fact, the DataSet object is designed to support optimistic concurrency control), in which data rows are locked only while they are actually being updated in the database. In a pessimistic concurrency scheme, by contrast, data rows are locked from the time they are fetched until the update is written to the database. ADO.NET can therefore serve a large number of users with less waiting.
In addition, a distributed database system often encounters the case where a user modifies a row that someone else has changed since it was fetched, which violates consistency. ADO.NET handles this case by having the DataSet object keep two versions of each modified record: the original version and the current version. Before an updated record is written back, the original version held in the DataSet is compared with the version currently in the database. If the two match, the update is applied to the database; otherwise, a concurrency violation error is raised.
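The version comparison described above can be sketched as follows. This is a simplified, in-memory illustration rather than the actual update path: the column names and the `CanUpdate` helper are hypothetical, and the value "now in the database" is passed in as a plain parameter. In real ADO.NET code the comparison is normally expressed in the WHERE clause of the update command, and the data adapter reports a `DBConcurrencyException` when no row is affected.

```csharp
using System;
using System.Data;

public static class OptimisticCheckSketch
{
    // Returns a one-row table standing in for data fetched into a DataSet.
    public static DataTable Fetch(decimal price)
    {
        var dt = new DataTable("product");
        dt.Columns.Add("id", typeof(int));
        dt.Columns.Add("price", typeof(decimal));
        dt.Rows.Add(1, price);
        dt.AcceptChanges();   // the fetched values become the Original version
        return dt;
    }

    // True if the value now in the database still equals the row's Original version,
    // meaning nobody changed it since it was fetched and the update may proceed.
    public static bool CanUpdate(DataRow editedRow, decimal valueNowInDatabase)
    {
        var original = (decimal)editedRow["price", DataRowVersion.Original];
        return original == valueNowInDatabase;
    }

    public static void Main()
    {
        DataTable dt = Fetch(100m);
        DataRow row = dt.Rows[0];
        row["price"] = 120m;  // local edit: Current is 120, Original stays 100

        Console.WriteLine(CanUpdate(row, 100m)); // True: database unchanged since fetch
        Console.WriteLine(CanUpdate(row, 110m)); // False: another user changed the row
    }
}
```

No lock is held between the fetch and the check, which is what allows many users to work on cached copies at once.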
3 Development Example
A home appliance chain has a headquarters and many branches. The headquarters and the branches, and the branches among themselves, often need to query various kinds of information (such as commodity price lists, store sales, and inventory). By setting up a distributed database query system, the organization can share information between headquarters and stores and simplify unified management.
3.1 System Design
3.1.1 System structure diagram
The system structure is shown in Figure 2:
Figure 2 System Structure diagram
The headquarters and each branch are equipped with a server that has a fixed IP address. The other computers at each site connect to the server through a hub, and the servers at the headquarters and branches are connected to one another through the communication network.
3.1.2 System Implementation Steps
The system implementation is divided into three main steps. First, the databases are designed for the headquarters and branches. Because of the large amount of data, SQL Server is used to create a sales and inventory database for each branch, and, at the headquarters, an employee database, an inventory database for the entire chain, a credit card customer database, and a supplier information database. Second, a dynamic-link library (DLL) that provides the database service (DBServer) is created. Services (such as publishing and obtaining remote objects) and functions used in queries (such as querying local and remote data tables, remotely creating and deleting data tables, and joining and merging tables) are placed in the DLL, which every branch needs in order to call these services and functions at query time. Finally, the client query interface is developed according to actual needs.
3.2 Key Technologies of System Implementation
3.2.1 Publishing and Obtaining Remote Objects
The first thing the system does after it starts is publish its local remote object and obtain the remote objects published by other stores. To publish a remote object, first set a network port number, then create and register a channel, and finally publish the server-activated object. Servers at other sites can then obtain the published remote object from the IP address and network port number. The key code for publishing and obtaining remote objects is as follows:
Publishing a remote object:
// Requires: using System.Runtime.Remoting; using System.Runtime.Remoting.Channels; using System.Runtime.Remoting.Channels.Tcp;
// Create a channel instance; port is the specified network port number
TcpChannel mychannel = new TcpChannel(Int32.Parse(port));
// Register the channel
ChannelServices.RegisterChannel(mychannel);
// Publish the server-activated object
RemotingConfiguration.RegisterWellKnownServiceType(typeof(DBServer), "STORE", WellKnownObjectMode.Singleton);
Obtaining a remote object:
// Get the remote object based on the IP address and port number
try
{
    mydbserver = (DBServer)Activator.GetObject(typeof(DBServer), "tcp://" + ip + ":" + p + "/STORE");
}
// Handle exceptions
catch (NullReferenceException nullexp)
{
    MessageBox.Show("The specified URL is unreachable: " + nullexp.Message);
}
catch (RemotingException remexp)
{
    MessageBox.Show("The definition of the obtained object is incorrect: " + remexp.Message);
}
3.2.2 Database Access
With ADO.NET, it is easy to connect to the database, import data from the data source into a DataSet object, and perform various operations on the data tables in the DataSet object. The DataSet object itself can also be passed remotely, which is a great convenience when developing a distributed database system. The key code for database access is as follows:
// Requires: using System.Data; using System.Data.SqlClient;
// Establish a connection to the database
string sqlconn = "Initial Catalog=store;Data Source=localhost;User ID=sa;Password=;";
SqlConnection conn = new SqlConnection(sqlconn);
conn.Open(); // Open the database connection
// Import data from the data source into a DataSet object
try
{
    DataSet ds = new DataSet();
    DataTable dt = new DataTable("result");
    SqlDataAdapter adapter = new SqlDataAdapter();
    // cmdstring is the command to be executed
    SqlCommand mysqldatasetcmd = new SqlCommand(cmdstring, conn);
    adapter.SelectCommand = mysqldatasetcmd;
    adapter.Fill(dt);
    ds.Tables.Add(dt);
}
finally
{
    conn.Close(); // Close the database connection
}
3.2.3 Query
Queries in a distributed database system generally fall into three categories: local queries, remote queries, and federated queries. A local query is no different from a query against a centralized database. A remote query is easy to implement once the remote object has been obtained. The most complicated is the federated query, which involves querying data across multiple sites and remotely creating, transmitting, joining, and merging tables. The following example describes the implementation of a federated query.
Suppose the second chain store wants to query the air conditioner inventory supplied by all Beijing suppliers at the nearby third and fourth chain stores. This can be done in the following steps. First, obtain the remote objects published by the headquarters and by the third and fourth chain stores. Next, use the headquarters' remote object to create a temporary data table T1 and store in it the information about all suppliers located in Beijing (the stores have only the supplier names and do not know their locations; only the headquarters has the suppliers' details), then send T1 to the third and fourth chain stores. Then join T1 with the inventory table of each of the two stores to find all air conditioner inventory supplied by Beijing suppliers (such as air conditioner name, model, quantity, and price), and return the join results, tables T2 and T3, to the second chain store. Finally, merge T2 and T3 and display the result in a DataGrid control. This implementation involves copying, transmitting, and joining data tables between different sites. The functions it uses (such as remotely creating data tables and remotely joining and merging tables) are placed in the DLL and can be called easily.
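The join and merge steps of this federated query can be sketched with in-memory DataTable objects. This is only an illustration under stated assumptions: the remote calls are replaced by local method calls, and the column names, sample rows, and helper names (`NewInventory`, `JoinAtBranch`, `MergeResults`) are hypothetical and are not the functions in the paper's DLL.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class FederatedQuerySketch
{
    // T1: names of Beijing suppliers, as the headquarters would produce them.
    public static DataTable BeijingSuppliers(params string[] names)
    {
        var t1 = new DataTable("T1");
        t1.Columns.Add("supplier", typeof(string));
        foreach (string name in names) t1.Rows.Add(name);
        return t1;
    }

    // An empty branch inventory table with hypothetical columns.
    public static DataTable NewInventory()
    {
        var inv = new DataTable("inventory");
        inv.Columns.Add("product", typeof(string));
        inv.Columns.Add("supplier", typeof(string));
        inv.Columns.Add("quantity", typeof(int));
        return inv;
    }

    // Joins T1 with one branch's inventory on supplier name (performed at that branch).
    public static DataTable JoinAtBranch(DataTable t1, DataTable inventory, string branch)
    {
        var suppliers = new HashSet<string>();
        foreach (DataRow r in t1.Rows) suppliers.Add((string)r["supplier"]);

        DataTable result = inventory.Clone();      // same columns, no rows
        result.TableName = "result";
        result.Columns.Add("branch", typeof(string));
        foreach (DataRow row in inventory.Rows)
            if (suppliers.Contains((string)row["supplier"]))
                result.Rows.Add(row["product"], row["supplier"], row["quantity"], branch);
        return result;
    }

    // Merges the per-branch results (T2 and T3) at the querying store.
    public static DataTable MergeResults(DataTable t2, DataTable t3)
    {
        DataTable merged = t2.Copy();
        merged.Merge(t3);                          // no primary key, so rows are appended
        return merged;
    }

    public static void Main()
    {
        DataTable t1 = BeijingSuppliers("Beijing Cool");
        DataTable store3 = NewInventory();
        store3.Rows.Add("AC-100", "Beijing Cool", 5);
        store3.Rows.Add("AC-200", "Shanghai Air", 7);
        DataTable store4 = NewInventory();
        store4.Rows.Add("AC-100", "Beijing Cool", 2);

        DataTable merged = MergeResults(JoinAtBranch(t1, store3, "store3"),
                                        JoinAtBranch(t1, store4, "store4"));
        foreach (DataRow r in merged.Rows)
            Console.WriteLine(r["branch"] + ": " + r["product"] + " x" + r["quantity"]);
    }
}
```

Sending the small T1 table to each branch and returning only the matching rows keeps the amount of data moved between sites low, which is one plausible reason for ordering the steps this way.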
4 Concluding Remarks
.NET Remoting, used from C#, readily solves the problem of data communication between different sites. In addition, accessing the database from C# through ADO.NET makes database operation and management more efficient and reliable. Together, these two technologies effectively address the main difficulties of developing a distributed database system, greatly reduce the development workload, and improve the reliability and security of the system.