Author: Chu Xianzhi
Contents
Preface
Introduction and uses of data replication
How to build a replication environment
Functions of the replication agents
Replication types and when to use them
Designing a secure replication environment (secure replication)
Conclusion
Preface
Colleagues often ask how to distribute data from the company's internal SQL Server database to other SQL Server databases. The motivation may be that no spare copy of the data exists and they want one, or that they want to "publish" the data to another machine to share the load.
There are many situations in an enterprise where data may need to be distributed. This article shows you how to replicate data in SQL Server.
This article will discuss the following topics:
Introduction and uses of data replication
How to build a replication environment
Functions of the replication agents
Designing a secure replication environment (secure replication)
Introduction and uses of data replication
Setting aside for a moment the technical mechanisms SQL Server provides, let us define replication simply by its name: replication means copying the contents of data source A to data source B, so that afterwards both A and B hold a copy of the same data.
With that definition in mind, let us look at practice. There are many ways to achieve the above, for example:
(A) Periodically use DTS to transfer the data to the second machine.
(B) Periodically back up the data on A and restore it on B.
(C) Use the replication feature provided by SQL Server.
In other words, the replication feature built into SQL Server is not the only way to accomplish this; many methods can deliver similar functionality.
Why, then, use the replication feature provided by SQL Server at all? The main reasons are data autonomy and latency.
The replication mechanism provided by SQL Server is designed around an accepted time difference; that is, "at the same moment, the two SQL Servers may not see exactly the same data." However, the data is guaranteed to eventually converge to a consistent state.
From this perspective, the following business scenarios can make use of replication:
(A) Multiple sites or multiple users need to access the same data:
Replication lets users at each site obtain the data from a convenient nearby location, avoiding access over a congested or slow network.
Users of mobile devices (such as Pocket PCs) can work with a local copy of the data while disconnected, and when the network is available again, synchronize their changes back to the primary data source.
(B) Spreading the load to improve performance:
Put the data that is heavily queried on one machine and the transactional data on another, so that users can run queries and analysis without burdening the machine that handles the original transaction processing.
Different business units need different data, and only the relevant subset has to be distributed to each. For example, the accounting department only needs accounting-related data; there is no need to distribute the data used by the personnel department to it.
(C) As a standby server:
Small enterprises generally cannot afford to build a cluster, but they still want a backup machine that holds the data; replication can serve this purpose.
How to build a replication environment
Before creating replication in SQL Server 2000, you must first understand a few basic concepts. SQL Server models replication on the publishing industry; please look at the figure below first.
Figure 1: the relationship among the three replication roles.
Before creating replication, you must understand three roles: the Publisher, the Distributor, and the Subscriber.
Publisher:
Is the source of the data.
Determines which of its data is made available for replication.
Detects which of the replicated data has changed.
Maintains information about all publications at this site.
Sends the data to be replicated to the Distributor (which can be on the same server or on a different server).
Distributor:
Contains the distribution database.
Stores replication history and metadata, and/or queues the transactions moving between Publisher and Subscriber.
Can serve multiple Publishers.
Subscriber:
Receives the replicated data from the Distributor.
In the publisher-subscriber model, the data itself is organized into publications and articles.
To make these roles concrete, think of a magazine publisher: the publisher decides what content to offer, but the content must be packaged and sold as complete issues (publications); readers can subscribe to the issues they want, or the publisher can have a distributor deliver each new issue to subscribers on a regular schedule.
Once the roles are defined, the next decision is what content to publish, namely articles and publications:
Article:
An article can be an entire table, specific columns or rows of a table, a stored procedure, a view, or a user-defined function.
Publication:
A publication is a collection of one or more articles. It is also the basic unit of subscription. For example, you can create a Products publication that contains a table, stored procedures, and views related to products.
Subscriptions must be made to a publication as a whole; you cannot subscribe to an individual article inside one. Also, to publish an object that references another object, you must publish the referenced object as well. For example, if you publish a view, the table it references must be published as part of the same publication.
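The wizard described later in this article performs these definitions for you, but a publication and its articles can also be created with the replication system stored procedures. The sketch below is illustrative only: it assumes a server already configured for transactional publishing, and the publication name is made up.

```sql
-- Sketch: define a transactional publication and add one article to it.
-- Assumes this server is already set up as a Publisher with a Distributor,
-- and that Northwind is enabled for transactional publishing.
USE Northwind;
GO
EXEC sp_addpublication
    @publication = N'NorthwindEmployees',  -- hypothetical publication name
    @repl_freq   = N'continuous',          -- continuous = transactional
    @status      = N'active';
GO
-- Publish the Employees table as an article. Subscriptions are always
-- made to the whole publication, never to this single article by itself.
EXEC sp_addarticle
    @publication   = N'NorthwindEmployees',
    @article       = N'Employees',
    @source_owner  = N'dbo',
    @source_object = N'Employees';
GO
```

The same procedures accept many more parameters (filters, schema options, and so on); the wizard simply fills them in with sensible defaults.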
Subscription:
A subscription is a request for a copy of a publication. It can be initiated by either the Publisher or the Subscriber. If it is initiated by the Publisher, it is a push subscription; if it is initiated by the Subscriber, it is a pull subscription. The relationship between the two is shown in Figure 2:
Figure 2: push subscriptions and pull subscriptions.
Continuing the magazine analogy: a push subscription means the distributor regularly delivers the magazine to the subscriber's home, while a pull subscription means the reader goes to the bookstore to pick it up. Guidelines for choosing between them are given in Table 1:
| | Push subscription | Pull subscription |
|---|---|---|
| Who initiates the subscription? | The Publisher | The Subscriber |
| Security | High (subscribers are registered at the Publisher) | Lower (anonymous subscribers can be allowed) |
| Latency requirement | Low | High |
| Data volume | Low | High |

Table 1: comparison of push and pull subscriptions.
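The two subscription styles in Table 1 map to different system stored procedures. A rough sketch, with hypothetical server and database names, of how each would be created in Transact-SQL:

```sql
-- Push subscription: created at the Publisher. The Distribution Agent
-- runs at the Distributor and pushes changes out to the Subscriber.
EXEC sp_addsubscription
    @publication       = N'NorthwindEmployees',  -- hypothetical publication
    @subscriber        = N'SERVERB',             -- hypothetical subscriber
    @destination_db    = N'test',
    @subscription_type = N'push';
GO
-- Pull subscription: created at the Subscriber instead, so the agent runs
-- there. The Publisher must still register it with sp_addsubscription
-- (@subscription_type = N'pull'). Run at SERVERB:
-- EXEC sp_addpullsubscription
--     @publisher    = N'SERVERA',
--     @publisher_db = N'Northwind',
--     @publication  = N'NorthwindEmployees';
```

In the walkthrough later in this article, the wizard issues the push form of these calls behind the scenes.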
Functions of the replication agents
Under the hood, SQL Server carries out replication through agent programs. In other words, to use replication, the SQL Server Agent service must be started.
SQL Server provides three types of replication, each served by the agents listed in Table 2:
| Snapshot replication | Transactional replication | Merge replication |
|---|---|---|
| Snapshot Agent | Snapshot Agent | Snapshot Agent |
| Distribution Agent | Distribution Agent | Merge Agent |
| | Log Reader Agent | |
| | Queue Reader Agent | |

Table 2: the agents used by each type of replication.
Snapshot Agent:
This agent copies the schema and the initial data of the published objects to the Subscriber's machine, and records synchronization status in the distribution database.
Distribution Agent:
This agent periodically moves the data held at the Distributor out to the Subscriber's machine.
Log Reader Agent:
This agent watches the transaction log of the published database for newly added, modified, or deleted records.
Queue Reader Agent:
Transactions must be applied in order. For subscriptions that allow queued updates, the Queue Reader Agent takes the changes queued at Subscribers and applies them back to the Publisher, ensuring that no transaction record is lost.
Merge Agent:
This agent applies the initial snapshot and then merges incremental changes between the Publisher and the Subscriber, handling conflict resolution.
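Each of these agents runs as a SQL Server Agent job. Once replication is configured, a query like the following sketch, using the standard msdb job tables, lists the agent jobs by their replication category:

```sql
-- List replication agent jobs and their categories from msdb.
-- Categories are named REPL-Snapshot, REPL-LogReader, REPL-Distribution,
-- REPL-Merge, and so on, matching the agents described above.
SELECT j.name AS job_name,
       c.name AS category
FROM msdb.dbo.sysjobs AS j
JOIN msdb.dbo.syscategories AS c
  ON j.category_id = c.category_id
WHERE c.name LIKE N'REPL-%'
ORDER BY c.name, j.name;
```

This is also a quick way to confirm, after running the wizard, that the expected agents were actually created.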
Replication types and when to use them
As described above, SQL Server uses agent programs to carry out replication on a schedule. Replication itself comes in three types:
Snapshot replication
Snapshot replication, as the name suggests, works like taking a photograph: it captures all the data in the published database at a point in time and copies it to the other machine; changes made at the source afterwards are not carried over until the next snapshot. Snapshot replication is suitable when the content of the source database is not updated frequently and it is acceptable for the Subscriber's data to lag behind for a while. Note also that snapshot replication does not require the published tables to have a primary key.
Transactional replication
Transactional replication means that whenever any transaction (an insert, update, or delete) occurs against the Publisher's data, the change is sent to the Subscriber. When transactional replication is used, an initial snapshot of the data is sent to the Subscriber first, and the subsequent transaction records are then delivered in order.
Choose transactional replication when the latency between the two databases must be kept as small as possible, that is, when the Subscriber's copy should track the source closely. Note, however, that tables published for transactional replication must have a primary key.
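Since transactional replication requires every published table to have a primary key, it is worth checking candidate tables before you start. A minimal sketch using the standard INFORMATION_SCHEMA views:

```sql
-- Find user tables in the current database that have no primary key
-- and therefore cannot be published for transactional replication.
SELECT t.TABLE_SCHEMA, t.TABLE_NAME
FROM INFORMATION_SCHEMA.TABLES AS t
WHERE t.TABLE_TYPE = 'BASE TABLE'
  AND NOT EXISTS (
        SELECT 1
        FROM INFORMATION_SCHEMA.TABLE_CONSTRAINTS AS tc
        WHERE tc.TABLE_SCHEMA   = t.TABLE_SCHEMA
          AND tc.TABLE_NAME     = t.TABLE_NAME
          AND tc.CONSTRAINT_TYPE = 'PRIMARY KEY');
```

Any table this query returns must be given a primary key (or published via snapshot or merge replication instead).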
Merge replication
With merge replication, both the Subscriber and the Publisher can modify the data. It works much like synchronizing a handheld device with Outlook on a desktop computer: when you change data on the device, it can be synchronized to Outlook on the desktop, and changes made in Outlook can likewise be synchronized back to the device.
Merge replication suits situations where changes must flow in both directions and subscribers are often in poor network environments. For example, a salesperson carrying customer data may modify some of it while offline, and later synchronize the changes back to the Publisher.
Designing a secure replication environment (secure replication)
Before building a SQL Server replication environment, first confirm the security configuration. Most replication environments that fail to come up do so because the security settings are incorrect, not because replication itself cannot be configured.
Before setting things up, simply think through how replication operates, and it becomes clear why security matters so much.
If your replication environment cannot be set up, try simplifying it first. For example, let one SQL Server play the Publisher, Distributor, and Subscriber roles all by itself; once that succeeds, gradually increase the complexity until separate machines each play their own role.
You can use the following steps to confirm that your permissions are correct:
Server A needs to send data to server B through the agent programs, so the account that the SQL Server Agent service logs on with must not have too few permissions. If neither server A nor server B is joined to a domain, the logon account you set must have full control over the C:\Program Files\Microsoft SQL Server\MSSQL\repldata folder, as shown in Figures 3 and 4.
Figure 3: checking the SQL Server Agent logon account.
Figure 4: setting permissions on the repldata folder so that the logon account has full control.
If servers A and B are joined to a domain, server A's SQL Server Agent should log on with a domain account, and you must likewise set the security on the repldata folder for that account.
Register both the Publisher and the Subscriber you want to configure in Enterprise Manager.
Next, you can follow the steps below to set up replication.
The replication configuration steps are:
Create a publication.
Set up a push or pull subscription.
Because these two actions involve many detailed steps, and what is done at each step matters, we will walk through them one step at a time, in Figures 5 through 20.
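The wizard screens that follow can also be scripted. As a rough sketch of what the first wizard pages do behind the scenes (the database names are the defaults, and the script assumes you are running it on the machine that will be the Distributor):

```sql
-- Sketch: configure the local server as its own Distributor,
-- the scripted equivalent of the initial wizard pages.
EXEC sp_adddistributor
    @distributor = @@SERVERNAME;   -- use this machine as the Distributor
GO
-- Create the distribution database that will hold replication
-- metadata and the queued transactions.
EXEC sp_adddistributiondb
    @database = N'distribution';
GO
-- Enable a database (here Northwind, as in the example below)
-- for transactional publishing.
EXEC sp_replicationdboption
    @dbname  = N'Northwind',
    @optname = N'publish',
    @value   = N'true';
```

Most readers will prefer the wizard, but the scripted form is useful for reproducing the same setup on several servers.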
Figure 5: use the wizard to create and manage publications.
Figure 6: select the database you want to publish from, then click Create Publication.
Note that if the prerequisite work has not been completed, some of the buttons cannot be clicked.
Figure 7: select the Distributor machine. If you already have another SQL Server installed, you can choose it as the Distributor; if not, you can use this machine as the Distributor, and SQL Server will create the distribution database on it for you.
Figure 8: replication delivers the database's data to Subscribers by way of snapshot files, so you must first confirm that the SQL Server Agent account has the necessary access permissions; this is the repldata folder permission discussed earlier.
Figure 9: select the data to publish. Here we use the Northwind database as an example.
Figure 10: select the type of Subscriber. It can be SQL Server 2000, SQL Server 7.0, or a non-SQL Server data source; for example, you can choose an Oracle or Access data source. (Note that if you choose heterogeneous data sources, you can only deliver data to them through push subscriptions; they cannot use pull subscriptions.)
Figure 11: select the tables or stored procedures you expect to publish. For transactional replication, the tables must have a primary key.
Figure 12: set the name of the publication, because one database may carry several publications.
Figure 13: the wizard is creating the publication.
Figure 14: after the publication is created, you can set up subscriptions to it. Simply select the publication name, then click Push New Subscription to begin the settings.
Figure 15: select the Subscriber from the list. If the Subscriber you want is not in the list, you must first register it in Enterprise Manager.
Figure 16: select the database on the Subscriber's machine that will receive the data.
Figure 17: select the update frequency of the subscription. For transactional replication you can select Continuously; otherwise, use a schedule to send the snapshot to the Subscriber at regular intervals.
Figure 18: if the Subscriber does not yet have the schema and data, you must select the option to initialize the schema and data, and start the Snapshot Agent immediately to perform the initialization, so that the Subscriber gets the initial data.
Figure 19: the Subscriber's data is being initialized.
Figure 20: in Replication Monitor you can check whether the snapshot has been generated and the data delivered to the Subscriber's database.
After that, you can see in Replication Monitor that the Snapshot Agent has completed the snapshot; you can then switch to the subscription database to check the content, as in Figure 21:
Figure 21: the Employees data in the test database has been received through the snapshot.
How long the snapshot in Figure 21 takes depends on how much data you have and how fast the network between Publisher and Subscriber is; in practice, the Snapshot Agent is the most time-consuming part of replication.
No matter which replication type you use, there is always an initial snapshot; how subsequent data changes are handled then depends on the replication type chosen.
In this example we use transactional replication, so after the Publisher's Employees table is modified, we can check Replication Monitor again; Figure 22 shows the result:
Figure 22: the changes delivered by transactional replication.
As Figure 22 shows, as soon as the Publisher makes any change, it is detected by the Log Reader Agent, and the transaction is then delivered to the Subscriber's database, completing the replication.
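Besides watching Replication Monitor, you can inspect the transactions waiting at the Distributor directly. A sketch against the distribution database (these tables exist once transactional replication is configured):

```sql
-- Count the replicated commands stored in the distribution database,
-- grouped by transaction, to see what the Log Reader Agent has captured
-- for the Distribution Agent to deliver.
SELECT t.entry_time,
       t.xact_seqno,
       COUNT(*) AS command_count
FROM distribution.dbo.MSrepl_commands AS c
JOIN distribution.dbo.MSrepl_transactions AS t
  ON c.xact_seqno = t.xact_seqno
GROUP BY t.entry_time, t.xact_seqno
ORDER BY t.entry_time;
```

If this query keeps growing while the Subscriber never changes, the Distribution Agent (not the Log Reader Agent) is the place to look for the problem.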
Conclusion
The replication mechanism is a practical feature. Its operation is very flexible, and it can be applied in many different environments. If the company's data must be kept in multiple places for various reasons, the built-in replication mechanism of SQL Server 2000 can meet the requirement.
PS (added later): the following statements can be used to list the publications and subscriptions on a server.
-- Publications
SELECT publisher_db, publication, description FROM distribution.dbo.MSpublications
-- Subscriptions
SELECT name, publication, subscriber_db, creation_date FROM distribution.dbo.MSdistribution_agents