Discuz! NT Cache Design Analysis

Source: Internet
Author: User
Tags xml xpath


For community software, supporting high concurrency and running efficiently and stably will always be the ultimate principle, and effective, safe use of a cache can yield twice the result with half the effort. The cache mechanism that .NET itself provides is rather thin: it is not flexible or convenient to customize, the cached objects have little sense of hierarchy, and there is no unified management when using it.

Background of the Discuz! NT cache:

When I joined the Discuz! NT project team in February, I found that the project did not yet use any cache mechanism. The main reason was that the project was still in its infancy: many things were only ideas that had not been put into practice or had no suitable solution yet, and whether to use a cache at all was one of them, even though a cache was expected to greatly reduce both database pressure and development cost.

I happened to have a good prototype (source code from a book), which became the prototype of the cache mechanism Discuz! NT uses today. At that time it lacked features and had some "fatal" bugs, but it was more than enough for caching simple data objects. So I used a simple test case (caching a DataTable and a StringBuilder object) to discuss and analyze the approach with snowman. We basically agreed to adopt the cache solution for database data that is accessed frequently but updated rarely. We also required that the cache mechanism be as simple as possible to use and easy to extend.

I then made some functional extensions and bug fixes on top of this prototype, which produced the code everyone can see today.

The cache architecture of Discuz! NT is described below. First, take a look at the Discuz! NT architecture diagram:

  

In fact, this architecture is simply a standard Strategy ("policy") pattern. To make comparison easier, the structure of the Strategy pattern is shown below:

  

As you can see, DNTCache is the context of the Strategy pattern, and DefaultCache, ForumCache, RssCache and so on are the concrete strategies. Each of them customizes the cache mechanism provided by .NET to achieve a different purpose. For example, the system's DefaultCache provides a mechanism that reloads data when an object expires, while ForumCache does not use this mechanism. There are also several different cache expiration policies, all tailored to specific application scenarios.
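The relationship between the context and its strategies can be sketched as follows. This is only a minimal illustration, not the Discuz! NT source: the names ICacheStrategy, ReloadingStrategy, PlainStrategy and CacheContext are hypothetical, and a plain dictionary stands in for the .NET cache.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical strategy interface; the real Discuz! NT interface may differ.
public interface ICacheStrategy
{
    void Add(string key, object value);
    object Get(string key);
    void Remove(string key);
}

// Strategy that reloads a missing entry through a caller-supplied loader.
public class ReloadingStrategy : ICacheStrategy
{
    private readonly Dictionary<string, object> store = new Dictionary<string, object>();
    private readonly Func<string, object> loader;

    public ReloadingStrategy(Func<string, object> loader) { this.loader = loader; }

    public void Add(string key, object value) { store[key] = value; }
    public void Remove(string key) { store.Remove(key); }

    public object Get(string key)
    {
        object value;
        if (!store.TryGetValue(key, out value))
        {
            value = loader(key);   // reload on a miss
            store[key] = value;
        }
        return value;
    }
}

// Strategy without reloading: a miss simply returns null.
public class PlainStrategy : ICacheStrategy
{
    private readonly Dictionary<string, object> store = new Dictionary<string, object>();

    public void Add(string key, object value) { store[key] = value; }
    public void Remove(string key) { store.Remove(key); }

    public object Get(string key)
    {
        object value;
        return store.TryGetValue(key, out value) ? value : null;
    }
}

// Context: callers talk to this class, which delegates to the current strategy.
public class CacheContext
{
    private ICacheStrategy strategy;

    public CacheContext(ICacheStrategy strategy) { this.strategy = strategy; }

    public void SetStrategy(ICacheStrategy newStrategy) { strategy = newStrategy; }
    public void AddObject(string key, object value) { strategy.Add(key, value); }
    public object RetrieveObject(string key) { return strategy.Get(key); }
    public void RemoveObject(string key) { strategy.Remove(key); }
}
```

The calling code depends only on the context, so swapping one strategy for another changes the expiration or reload behaviour without touching the callers.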

Here, all you need to do is download the source code and press the "" button to see how the entire cache mechanism works.

The following describes several techniques used in the cache design: XML, XPath, the Singleton pattern ("single-piece mode"), and cross-process (web garden) data sharing.

First, take a look at the code (XML and XPath):

// xpath: the path, in XPath format, under which the object is stored
// o: the object to be cached
// files: the files passed on to AddObjectWithFileChange
public virtual void AddObject(string xpath, object o, string[] files)
{
    // normalize the XPath expression
    string newXpath = PrepareXpath(xpath);
    int separator = newXpath.LastIndexOf("/");
    // extract the group (parent path)
    string group = newXpath.Substring(0, separator);
    // extract the element name of the object
    string element = newXpath.Substring(separator + 1);

    XmlNode groupNode = objectXmlMap.SelectSingleNode(group);
    // unique key that maps the XML node to the cached object
    string objectId = "";

    XmlNode node = objectXmlMap.SelectSingleNode(newXpath);
    if (node != null)
    {
        objectId = node.Attributes["objectId"].Value;
    }
    if (objectId == "")
    {
        groupNode = CreateNode(group);
        objectId = Guid.NewGuid().ToString();
        // create a new element and an attribute for this particular object
        XmlElement objectElement = objectXmlMap.OwnerDocument.CreateElement(element);
        XmlAttribute objectAttribute = objectXmlMap.OwnerDocument.CreateAttribute("objectId");
        objectAttribute.Value = objectId;
        objectElement.Attributes.Append(objectAttribute);
        // append the new element to the XML document
        groupNode.AppendChild(objectElement);
    }
    else
    {
        // create a new element and an attribute for this particular object
        XmlElement objectElement = objectXmlMap.OwnerDocument.CreateElement(element);
        XmlAttribute objectAttribute = objectXmlMap.OwnerDocument.CreateAttribute("objectId");
        objectAttribute.Value = objectId;
        objectElement.Attributes.Append(objectAttribute);
        // replace the existing element in the XML document
        groupNode.ReplaceChild(objectElement, node);
    }
    // add the new object to the cache
    cs.AddObjectWithFileChange(objectId, o, files);
}

Why use XML? Mainly because its hierarchical structure makes it easy to add, replace, and remove nodes, and because the structure information of the cache is then easy to persist.

XPath lets you locate nodes in an XML document through hierarchical path expressions.

With the above or other similar code, you can build an xml tree to manage cache objects that have been added to the system.
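To make the idea concrete, here is a minimal, self-contained sketch of an XML tree that indexes cached objects by path. It is an illustration under assumptions rather than the Discuz! NT code: the class XmlIndexedCache and its members are hypothetical, a dictionary stands in for the underlying cache, and paths are assumed to consist of valid XML element names.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Hypothetical class: an XML map from hierarchical paths to cached objects.
public class XmlIndexedCache
{
    private readonly XmlDocument map = new XmlDocument();
    private readonly Dictionary<string, object> store = new Dictionary<string, object>();

    public XmlIndexedCache() { map.LoadXml("<Cache/>"); }

    // Walk the path segments below <Cache>, creating any missing element.
    private XmlNode EnsurePath(string[] segments)
    {
        XmlNode current = map.DocumentElement;
        foreach (string segment in segments)
        {
            XmlNode child = current.SelectSingleNode(segment);
            if (child == null)
            {
                child = current.AppendChild(map.CreateElement(segment));
            }
            current = child;
        }
        return current;
    }

    public void Add(string xpath, object value)
    {
        XmlElement node = (XmlElement)EnsurePath(xpath.Trim('/').Split('/'));
        string id = node.GetAttribute("objectId");   // "" when the attribute is absent
        if (id == "")
        {
            id = Guid.NewGuid().ToString();
            node.SetAttribute("objectId", id);
        }
        store[id] = value;
    }

    public object Get(string xpath)
    {
        XmlElement node = (XmlElement)map.DocumentElement.SelectSingleNode(xpath.Trim('/'));
        if (node == null) return null;
        object value;
        return store.TryGetValue(node.GetAttribute("objectId"), out value) ? value : null;
    }

    // Remove the node and every cached object at or below it.
    public void Remove(string xpath)
    {
        XmlNode node = map.DocumentElement.SelectSingleNode(xpath.Trim('/'));
        if (node == null) return;
        foreach (XmlNode n in node.SelectNodes("descendant-or-self::*[@objectId]"))
        {
            store.Remove(n.Attributes["objectId"].Value);
        }
        node.ParentNode.RemoveChild(node);
    }
}
```

For example, after adding objects under "/Forum/TopicList" and "/Forum/UserList", a single call to Remove("/Forum") clears both, which is the kind of group operation a flat key/value cache does not offer directly.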

The Singleton pattern ("single-piece mode") is used to create one globally unique context object. A single shared cache is usually the best choice for storing and sharing data, and it is also the easiest to implement and manage in code. In addition, the project mostly caches database data that is accessed frequently but changes rarely (which can be viewed as shared data), so using the Singleton pattern is the logical choice.

See the following code:

public static DNTCache GetCacheService()
{
    // double-checked locking: take the lock only while the instance is still missing
    if (instance == null)
    {
        lock (lockHelper)
        {
            if (instance == null)
            {
                instance = new DNTCache();
            }
        }
    }
    // Check and remove the corresponding cache items.
    // Note: this code comes from the upcoming version 2.0. For the code in the
    // open-source version, see the corresponding function in the
    // Discuz.Forum.cachefactory.cs file.
    instance = CachesFileMonitor.CheckAndRemoveCache(instance);
    return instance;
}

Side notes:

1. When the project reached beta, we found that data could not be shared across web gardens. The situation is this: if you configure two or more web garden processes in the IIS application pool and then update the cache from the admin back end, the cached data appears not to be updated, or alternates between old and new. To put it bluntly, only the data cache in one worker process is updated, while the data in all the other processes stays unchanged. The cause is that the static instance (that is, the object in the Singleton code above) is "unique" only within the current process, not across processes. I was surprised at first: why can't Microsoft provide a technology or keyword for sharing data across web gardens, the way it provides "global" hooks? But I could guess part of the reason. Web gardens are a "solution" for making (web) programs run more safely, stably, and quickly. Nobody can claim that their program has no bugs, and even if such code existed, things can become hard to control once the runtime environment is involved.

So Microsoft uses the web garden technology to isolate the program in several different processes, so that none of them affects the others; even if one process goes down, the others continue to work normally. Therefore, object instances and all resources in the program are kept within a single process. If a sharing mechanism were introduced, a fault in the data or objects shared by the processes could bring all of them down. That is why process isolation is required.

But we still had to solve the problem in front of us. I remember that, while I was working at Hero, Lao Liang said in a meeting that in some cases hard disk access can be about as fast as memory access. If I understood him correctly, he meant things like a "virtual cache" or the most recently and frequently accessed disk sectors, where code and files run and are accessed very efficiently. So I thought of using a flag file to solve this multi-process problem, and from there it was natural to use the file's last-modified date to decide whether each process should refresh its cache. You will find a cache.config file in the config folder of the open-source download package; look at the data items in it, then read the following code, and everything will be clear:

public static DNTCache CheckAndRemoveCache(DNTCache instance)
{
    // When cache.config changes while the program is running, the listed cached objects are removed.
    cachefilenewchange = System.IO.File.GetLastWriteTime(path);
    if (cachefileoldchange != cachefilenewchange)
    {
        lock (cachelockHelper)
        {
            if (cachefileoldchange != cachefilenewchange)
            {
                // there are items to be cleared
                DataSet dsSrc = new DataSet();
                dsSrc.ReadXml(path);
                foreach (DataRow dr in dsSrc.Tables[0].Rows)
                {
                    if (dr["xpath"].ToString().Trim() != "")
                    {
                        DateTime removedatetime = DateTime.Now;
                        try
                        {
                            removedatetime = Convert.ToDateTime(dr["removedatetime"].ToString().Trim());
                        }
                        catch { }
                        // only items marked for removal around the time of the last file change
                        if (removedatetime > cachefilenewchange.AddSeconds(-2))
                        {
                            string xpath = dr["xpath"].ToString().Trim();
                            instance.RemoveObject(xpath, false);
                        }
                    }
                }
                cachefileoldchange = cachefilenewchange;
                dsSrc.Dispose();
            }
        }
    }
    return instance;
}
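The code above is the reading side of the flag file. The underlying principle, one shared file whose last-write time tells every process that something changed, can be sketched in isolation as below. This is a hypothetical helper (FileChangeSignal is not a Discuz! NT class), it uses a plain text file instead of the real cache.config format, and the two-second bump in Raise is my own assumption to keep the sketch working on file systems with coarse timestamps.

```csharp
using System;
using System.IO;

// Hypothetical helper: a shared file acts as a cross-process "something changed" signal.
public class FileChangeSignal
{
    private readonly string path;
    private DateTime lastSeen;

    public FileChangeSignal(string path)
    {
        this.path = path;
        if (!File.Exists(path)) File.WriteAllText(path, "");
        lastSeen = File.GetLastWriteTimeUtc(path);
    }

    // Called by the process that updated the data: record the path and touch the file.
    public void Raise(string xpath)
    {
        DateTime previous = File.GetLastWriteTimeUtc(path);
        File.AppendAllText(path, xpath + Environment.NewLine);
        // Assumption: force the timestamp at least 2 seconds forward so the change
        // stays visible even when the file system stores coarse timestamps.
        DateTime stamp = DateTime.UtcNow;
        if (stamp < previous.AddSeconds(2)) stamp = previous.AddSeconds(2);
        File.SetLastWriteTimeUtc(path, stamp);
    }

    // Called by every process before it trusts its own in-memory cache.
    public bool HasChanged()
    {
        DateTime current = File.GetLastWriteTimeUtc(path);
        if (current == lastSeen) return false;
        lastSeen = current;
        return true;
    }
}
```

Each worker process would hold its own FileChangeSignal on the same path; when HasChanged returns true, the process re-reads the file and drops the listed cache entries, much as CheckAndRemoveCache does with cache.config.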

2. It should also be noted that the cache mechanism ran into some problems in February, such as lost cache data and an endless loop under .NET 2.0. snowman suggested giving each cached item a cache flag to solve the data-loss problem, as in the following code snippet:

// when adding
public virtual void AddObject(string xpath, DataTable dt)
{
    lock (lockHelper)
    {
        // store a flag next to the table that records whether it contains rows
        if (dt.Rows.Count > 0)
        {
            AddObject(xpath + "flag", CacheFlag.CacheHaveData);
        }
        else
        {
            AddObject(xpath + "flag", CacheFlag.CacheNoData);
        }
        AddObject(xpath, (object)dt);
    }
}

// when retrieving
public virtual object RetrieveObject(string xpath)
{
    try
    {
        object cacheObject = RetrieveOriginObject(xpath);
        CacheFlag cf = (CacheFlag)RetrieveOriginObject(xpath + "flag");

        // the flag says the entry should contain data
        if (cf == CacheFlag.CacheHaveData)
        {
            string otype = cacheObject.GetType().Name.ToString();

            // the cached object is a data table
            if (otype.IndexOf("Table") > 0)
            {
                System.Data.DataTable dt = cacheObject as DataTable;
                if ((dt == null) || (dt.Rows.Count == 0))
                {
                    // the flag promised data but the table is empty: treat it as lost
                    return null;
                }
                else
                {
                    return cacheObject;
                }
            }

        }
        // ... (the snippet is cut off here in the original)

The endless loop was caused mainly by the cache callback reloading mechanism under .NET 2.0 combined with a bug in the program itself. It has been fixed, so please feel free to use the code.

Features currently developed but not used:

1. One key, multiple values: see AddMultiObjects(string xpath, object[] objValue) in the DNTCache code. The values are returned by the object[] RetrieveObjectList(string xpath) method, so a single xpath can be used to access a group of objects.

The implementation is fairly simple, so I will not discuss it and just paste the code here.

public virtual bool AddMultiObjects(string xpath, object[] objValue)
{
    lock (lockHelper)
    {
        //RemoveMultiObjects(xpath);
        if (xpath != null && xpath != "" && xpath.Length != 0 && objValue != null)
        {
            for (int i = 0; i < objValue.Length; i++)
            {
                AddObject(xpath + "/Multi/_" + i.ToString(), objValue[i]);
            }
            return true;
        }
        return false;
    }
}
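The matching RetrieveObjectList method is not shown in this article. As a rough sketch of how the pair could fit together, assuming a plain dictionary keyed by the same "/Multi/_i" paths rather than the real XML-backed cache, one might write (MultiValueCache is a hypothetical class):

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for the multi-value part of DNTCache.
public class MultiValueCache
{
    private readonly Dictionary<string, object> store = new Dictionary<string, object>();
    private readonly object lockHelper = new object();

    public bool AddMultiObjects(string xpath, object[] objValue)
    {
        if (string.IsNullOrEmpty(xpath) || objValue == null) return false;
        lock (lockHelper)
        {
            for (int i = 0; i < objValue.Length; i++)
            {
                store[xpath + "/Multi/_" + i] = objValue[i];
            }
        }
        return true;
    }

    // Collect _0, _1, _2, ... until the first index that is missing.
    public object[] RetrieveObjectList(string xpath)
    {
        List<object> result = new List<object>();
        lock (lockHelper)
        {
            object value;
            for (int i = 0; store.TryGetValue(xpath + "/Multi/_" + i, out value); i++)
            {
                result.Add(value);
            }
        }
        return result.ToArray();
    }
}
```

Note that in this sketch, storing a shorter array under the same xpath leaves the old trailing entries in place, so a real implementation would need to clear the previous group first.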

2. Batch cache removal: because XML stores data along hierarchical paths, this method can remove the cached data of all child nodes under the given path.

Its declaration is RemoveObject(string xpath, bool writeconfig). The implementation is fairly simple, so I will just paste the code here.

public virtual void RemoveObject(string xpath, bool writeconfig)
{
    lock (lockHelper)
    {
        try
        {
            if (writeconfig)
            {
                CachesFileMonitor.UpdateCacheItem(xpath);
            }

            XmlNode result = objectXmlMap.SelectSingleNode(PrepareXpath(xpath));
            // check whether the path points to a group or to a cached instance element
            if (result.HasChildNodes)
            {
                // delete all child nodes that carry an objectId, together with their objects
                XmlNodeList objects = result.SelectNodes("*[@objectId]");
                string objectId = "";
                foreach (XmlNode node in objects)
                {
                    objectId = node.Attributes["objectId"].Value;
                    node.ParentNode.RemoveChild(node);
                    // delete the object
                    cs.RemoveObject(objectId);
                }
            }
            else
            {
                // delete the element node and the related object
                string objectId = result.Attributes["objectId"].Value;
                result.ParentNode.RemoveChild(result);
                cs.RemoveObject(objectId);
            }

            // check and remove the corresponding cache items
        }
        catch
        {
            // an error here means the given path does not exist
        }
    }
}

 

A feature that was implemented but later removed:

Before the official release, the admin back end had a feature that recorded cache logs. It was implemented with the Visitor pattern (you should be able to find the LogVisitor class in the project). However, many webmasters reported that it wrote to the log table far too often, which made the number of log records grow sharply, so the feature was taken out. I want to remind you to be very cautious when pursuing new features or new technologies. Otherwise you may put a great deal of effort into a feature that nobody appreciates in the end, which is depressing.

Finally, I should explain why I am posting this topic to the garden first. Our Discuz! NT 2.0 product will be released soon, and the architecture of the whole product has changed a lot, but the cache structure is relatively stable and has changed little. That is why I wrote this blog post today. The next article on the Discuz! NT architecture will wait until the official version is released; once you have downloaded the code, we can look at the new code together and talk about the other design ideas in this product (based on my own understanding).
