In the previous section, we said that knowledge is the metadata used by the Microsoft Sync Framework (MSF). This metadata describes all the changes that have been applied to a replica, either directly or through synchronization.
MSF uses knowledge for change enumeration and conflict detection.
Change enumeration and conflict detection generally compare the versions of the same item in two replicas. A typical approach is for the destination to send the version information of every item to be synchronized to the source. With that approach, the metadata to be maintained and transmitted is proportional to the number of items to be synchronized in the replica (for example, the version information of every record in a database table must be maintained and transmitted). MSF instead introduces the concept of knowledge, which stores the changes currently known to a replica in a highly compressed internal data structure. (This is hard to grasp in the abstract and is illustrated by the examples below.)
The key property of knowledge is that it uses as compact a representation as possible to support change enumeration and conflict detection, which improves synchronization efficiency. Per-item sync metadata is still required as well: it records when and where a given item was changed.
Knowledge operations
To support change enumeration and conflict detection, the MSF API provides operations on knowledge. These operations are essentially set operations from discrete mathematics. The four main ones are:
Contains - change containment check
Determines whether a given knowledge contains the specified version (replica key and tick count), that is, whether the replica that owns this knowledge has already applied the change.
In other words, it determines whether a given knowledge instance knows about the change.
Union - knowledge combining
Constructs a new knowledge from two knowledge objects. The new knowledge contains at least all the changes contained in either of the original knowledge objects. (Merges information from another sync knowledge into the current one.)
Project - subset
Obtains the knowledge for a specific set of items from the original knowledge, forming a new knowledge that covers only that subset.
Exclude - item exclusion
The inverse of Project: removes the knowledge about a given set of items to form a new knowledge. (Indicates in the sync knowledge that it doesn't know anything about the item specified. This operation is used during change application to create exceptions in the knowledge.)
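The four operations can be sketched in a few lines of Python. This is only an illustrative model, not the MSF API: the class name Knowledge, its fields base and overrides, and the method names are assumptions made for this sketch. Knowledge is modeled as a replica-to-tick-count vector that applies to every item, plus per-item vectors that replace it for individual items.

```python
from dataclasses import dataclass, field


def _merge(v1, v2):
    """Element-wise maximum of two {replica id: tick count} vectors."""
    return {r: max(v1.get(r, 0), v2.get(r, 0)) for r in v1.keys() | v2.keys()}


@dataclass
class Knowledge:
    """Hypothetical model of sync knowledge (not the MSF data structure)."""
    base: dict = field(default_factory=dict)       # replica id -> highest tick known for all items
    overrides: dict = field(default_factory=dict)  # item id -> vector that replaces `base` for that item

    def vector_for(self, item_id):
        return self.overrides.get(item_id, self.base)

    def contains(self, item_id, replica_id, tick):
        """Contains: is the version (replica_id, tick) of this item known?"""
        return tick <= self.vector_for(item_id).get(replica_id, 0)

    def union(self, other):
        """Union: knows every change that either input knows."""
        base = _merge(self.base, other.base)
        overrides = {}
        for item_id in self.overrides.keys() | other.overrides.keys():
            vector = _merge(self.vector_for(item_id), other.vector_for(item_id))
            if vector != base:
                overrides[item_id] = vector
        return Knowledge(base, overrides)

    def project(self, item_ids):
        """Project: keep only what is known about the given items."""
        return Knowledge({}, {i: dict(self.vector_for(i)) for i in item_ids})

    def exclude(self, item_id):
        """Exclude: record that nothing is known about one item."""
        return Knowledge(dict(self.base), {**self.overrides, item_id: {}})


merged = Knowledge({"A": 5}).union(Knowledge({"B": 4}))
only_i1 = merged.project({"I1"})
without_i2 = merged.exclude("I2")
```

In this model, merged knows version A5 and version B4 of any item, only_i1 knows nothing about I2, and without_i2 still knows everything about I1 but nothing about I2.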
Change enumeration
The change enumeration process determines which changes on the source replica are unknown to the destination replica.
1. The destination provider sends its current knowledge to the source provider, so the source learns the current state of the destination;
2. The source provider traverses all items in the source replica and performs the following operations:
A. Use the contains operation to determine whether the knowledge sent by the destination contains the version of the current item on the source.
B. If it does not, the item is sent to the destination provider.
That is, for each item on the source, if the destination's knowledge does not contain the version of that item, the item is considered to have changed on the source and needs to be sent to the destination.
For the purposes of change enumeration, the sync solution should return a change to the sync destination when the destination's knowledge doesn't contain the change.
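The two steps above can be sketched as follows. This is a minimal illustration rather than MSF code: knowledge is simplified to a dictionary from replica ID to the highest known tick count, and the function names and the sample versions are assumptions made for the example.

```python
def contains(knowledge, version):
    """True if the (replica id, tick count) version is covered by the knowledge."""
    replica_id, tick = version
    return tick <= knowledge.get(replica_id, 0)


def enumerate_changes(source_items, destination_knowledge):
    """Return the ids of source items whose version the destination does not know."""
    return [
        item_id
        for item_id, version in source_items.items()
        if not contains(destination_knowledge, version)
    ]


# Hypothetical data: update version of each item on the source replica.
source_items = {"I1": ("A", 5), "I2": ("A", 2), "I3": ("B", 3)}
destination_knowledge = {"A": 3, "B": 4}

changes = enumerate_changes(source_items, destination_knowledge)
```

With these sample values only I1 is enumerated, because A5 lies beyond the A3 that the destination already knows, while A2 and B3 are covered.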
Example of change enumeration:
Consider a file synchronization example in which A is the source (the party that initiates the synchronization) and B is the destination.
Each file in the directory is a tracked data item, denoted In (I1, I2, I3, and so on). According to the previous section on metadata, when a file (I1) is created, its metadata is as follows:
Recall the version metadata from the previous section:
Creation replica ID + creation tick count = creation version
Update replica ID + update tick count = update version
When the file is modified, the new metadata looks like this:
This is the per-item sync metadata. It records that the file (I1) was modified on replica A (where) when the logical clock was 5 (when).
We can see that the update tick count changed from 1 to 5, which clarifies what a tick count is: within a replica, it is a monotonically increasing counter whose values never repeat, and it represents the logical time of a modification. For example, the change from 1 to 5 may indicate that other files were modified in between, at tick counts 2, 3, and 4.
The tick count is the core of knowledge. In MSF, in the best case, knowledge only needs to record the replica ID and the current maximum tick count (the replica knows all changes up to that tick count), so now:
Knowledge of A (source) = A5.
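The relationship between the tick count, the per-item metadata, and the knowledge can be sketched as below. The Replica class and the order of the operations on I2 and I3 are assumptions for illustration; the source only states that I1 was created at tick 1 and updated at tick 5.

```python
class Replica:
    """Hypothetical replica with a logical clock (tick count)."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.tick = 0
        self.items = {}  # item id -> {"creation": version, "update": version}

    def _next_version(self):
        # The tick count only ever increases, so versions never repeat.
        self.tick += 1
        return (self.replica_id, self.tick)

    def create(self, item_id):
        version = self._next_version()
        self.items[item_id] = {"creation": version, "update": version}

    def update(self, item_id):
        self.items[item_id]["update"] = self._next_version()

    def knowledge(self):
        # Best case: one entry holding the highest tick count of this replica.
        return {self.replica_id: self.tick}


a = Replica("A")
a.create("I1")  # tick 1
a.create("I2")  # tick 2 (assumed)
a.create("I3")  # tick 3 (assumed)
a.update("I2")  # tick 4 (assumed)
a.update("I1")  # tick 5
```

After these operations the metadata of I1 has creation version A1 and update version A5, and the knowledge of A is the single entry A5.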
For the destination replica B, the metadata may look like this:
Knowledge of B (destination) = B4.
If synchronization is performed at this point, the destination (B) sends its knowledge (B4) to the source (A). Because B's knowledge contains no information about replica A, none of A's changes are known to the destination. Therefore, the source should send the version information of all of its files to the destination.
Note: changes can also be enumerated by querying item versions directly, but knowledge makes change enumeration faster.
Change sending
Change sending transmits the source's changes to the destination in batches. Each batch carries the following information:
- The changes themselves.
- Made-with knowledge: the knowledge of the source replica when the change batch was made. It is used to detect conflicts and answers the question: what did you know when you made these changes?
- Learned knowledge: the current knowledge of the source replica projected (Project) onto the items sent in this batch, together with any recorded conflicts. It answers the question: what will I learn when I apply these changes? The learned knowledge must be combined with the destination replica's knowledge and saved on the destination replica after the changes are applied.
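The contents of a batch can be sketched as a small data structure. This is an illustrative model only: the ChangeBatch class, the make_batch function, and the item versions for I2 and I3 are assumptions, knowledge is simplified to a replica-to-tick-count dictionary, and recorded conflicts are left out.

```python
from dataclasses import dataclass


@dataclass
class ChangeBatch:
    changes: dict              # item id -> (replica id, tick count) update version
    made_with_knowledge: dict  # source knowledge when the batch was made
    learned_knowledge: dict    # item id -> source knowledge projected onto that item


def make_batch(source_items, source_knowledge, destination_knowledge):
    """Build one batch holding every change the destination does not know."""
    changes = {
        item_id: (replica_id, tick)
        for item_id, (replica_id, tick) in source_items.items()
        if tick > destination_knowledge.get(replica_id, 0)
    }
    return ChangeBatch(
        changes=changes,
        made_with_knowledge=dict(source_knowledge),
        # Projection: the source knowledge restricted to the items in this batch.
        learned_knowledge={item_id: dict(source_knowledge) for item_id in changes},
    )


# Hypothetical versions for the three files on replica A.
items_on_a = {"I1": ("A", 5), "I2": ("A", 4), "I3": ("A", 3)}
batch = make_batch(items_on_a, {"A": 5}, {"B": 4})
```

Because the destination knowledge B4 says nothing about replica A, the batch contains all three items, and its made-with knowledge is A5.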
Sending example:
In the preceding example, when the source (replica A) receives the knowledge of the destination (replica B), it uses that knowledge to determine which versions need to be sent to the destination (B). The current made-with knowledge is therefore as follows:
The destination receives the version information and determines which items should be sent from the source (note that versions are distinct from items, and versions are sent first). The destination also uses the made-with knowledge to determine whether a conflict has occurred. After this check, the destination asks the source to send the modified items, which in our example are I1, I2, and I3.
After receiving these files, the destination adds them to its own folder, and the synchronization from the source to the destination is complete. The metadata of B is now:
Then the source and destination swap roles and the process runs again. After this second synchronization completes, the files on the source and the destination are consistent, and the version information on A is the same as on B.
Their knowledge is updated as well:
Knowledge of A (source) = A5, B4.
Knowledge of B (destination) = A5, B4.
At this point their knowledge is identical, and we can see that the size of the knowledge is proportional to the number of replicas involved in synchronization, not to the number of items. Here there are two replicas, A and B, so the knowledge has two entries. If a third replica C took part, the synchronized knowledge would have three entries.
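The knowledge update in this example amounts to a union after each direction of the synchronization. A minimal sketch, assuming knowledge is simplified to a replica-to-tick-count dictionary (the union function and the tick count used for C are hypothetical):

```python
def union(k1, k2):
    """Combine two knowledge vectors by taking the highest tick per replica."""
    return {r: max(k1.get(r, 0), k2.get(r, 0)) for r in k1.keys() | k2.keys()}


knowledge_a = {"A": 5}
knowledge_b = {"B": 4}

# A -> B: the destination B combines what it learned from A with its own knowledge.
knowledge_b = union(knowledge_b, knowledge_a)
# B -> A: the roles are swapped and A learns from B.
knowledge_a = union(knowledge_a, knowledge_b)

# A third replica C (hypothetical tick count) adds one more entry.
knowledge_with_c = union(knowledge_a, {"C": 2})
```

Both replicas end up with the entries A5 and B4, and adding C grows the knowledge to three entries regardless of how many files are synchronized.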
Conflict detection
Conflict detection identifies operations (modifications or deletions) performed on one replica that the other replica did not learn about in time, so that both replicas made local changes to the same item.
That is, after an item is modified on one replica, the change is not synchronized to the other replica in time. If the other replica also modifies the same data, a conflict occurs during the next synchronization.
Using knowledge for conflict detection:
A change conflicts with the current state when the version on the destination replica is not contained in the knowledge of the source replica (checked with the contains operation described above).
If a destination version for an item or change unit whose change we're trying to apply is not contained by the source's made-with knowledge, this indicates that we have an independent update to an item or change unit done both on source and destination, otherwise known as a conflict.
Conflict example:
Continuing the previous example, suppose replica A updates I2, and then replica B also modifies I2 before synchronizing with A. The knowledge of A and B is then as follows:
Knowledge of A (source) = A6, B4.
Knowledge of B (destination) = A5, B5.
Now A (as source) synchronizes with B (as destination). Skipping the steps already covered in the example above, when the source sends item versions and knowledge to the destination, the following happens for the file I2:
1. B (destination) finds from the knowledge of A that I2 was modified by A:
2. B (destination) also finds that it modified I2 itself, and that A is not aware of this change:
3. A conflict is detected, and it must be handled by the application or provider.
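The three steps can be replayed with the values from this example. The sketch below is an assumption-laden illustration, not MSF code: knowledge is simplified to a replica-to-tick-count dictionary, the function names are invented, and the update versions of I2 (A6 on the source, B5 on the destination) are inferred from the two knowledge values above.

```python
def contains(knowledge, version):
    """True if the (replica id, tick count) version is covered by the knowledge."""
    replica_id, tick = version
    return tick <= knowledge.get(replica_id, 0)


def classify_change(source_version, made_with_knowledge,
                    destination_version, destination_knowledge):
    """Decide what the destination should do with one incoming change."""
    if contains(destination_knowledge, source_version):
        return "already known"
    if not contains(made_with_knowledge, destination_version):
        # The source made its change without knowing the destination's version.
        return "conflict"
    return "apply"


knowledge_a = {"A": 6, "B": 4}  # source, also the made-with knowledge
knowledge_b = {"A": 5, "B": 5}  # destination

# I2 was updated on A at tick 6 and on B at tick 5 (inferred versions).
result = classify_change(("A", 6), knowledge_a, ("B", 5), knowledge_b)

# If B had not touched I2, its version would still be one that A knows, e.g. A4.
result_without_local_edit = classify_change(("A", 6), knowledge_a, ("A", 4), knowledge_b)
```

Here result is "conflict", because B5 is not contained in A6, B4, while result_without_local_edit is "apply".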
Reference definitions:
The knowledge encompasses all changes (in other words, versions of all items) which a particular sync endpoint knows about.
Knowledge is the metadata that describes all the changes that have been applied to a replica, either directly or through synchronization.
Sync knowledge is the compact representation of all the changes which a particular sync endpoint knows about, which is used during the change enumeration and conflict detection phases of the sync process.
We compress all sync versions in the sync endpoint into the single compact data structure which we call knowledge.
References:
Microsoft Sync Framework, Part 3: Sync Knowledge
Understanding Synchronization Knowledge