CYQ.Data V5 Distributed Automatic Cache Design Introduction

Foreword:

Before I finished this feature, I kept debating with myself: should I write my ideas down first, discuss them with everyone, and then implement them, or implement first and then write an article to share my thinking?

Then a voice in my head said: if you post first, you will just sit in a daze afterwards waiting for replies.

So let me calm down and quietly finish the code first.

Over the past few days I rehearsed the various technical difficulties and their solutions in my head, kept turning over the problems to be solved, and settled the code for each issue.

I suddenly discovered that code really can be written in the head.

So if you see an employee sitting for two days without writing a line of code, it may mean they are an expert, programming in their head.

Okay, enough digressing. Back to the topic!

Why is the second-level cache of traditional ORMs so ineffective?

Some ORMs provide one: Hibernate, for example.

Some do not: EF, for example, does not, because even if it were provided it would be of little use:

1: You cannot force a project to stick to single-entity programming. Once multiple tables are involved, users prefer to execute SQL statements directly.

2: Without a distributed cache as the foundation, cache invalidation across multiple deployed applications cannot be solved.

Therefore:

1: If you cannot control every SQL statement the project's users write, even a standalone cache cannot work reliably.

2: Without a distributed cache foundation, distributed deployments cannot be supported.

This is why EF has never provided one, and why Hibernate's second-level cache, although provided, is not very useful!

 

Wondering why, since the database already has its own cache, the framework should bother with one?

Main reasons:

1: It takes time for the database to go from receiving a request to building its cache (a framework cache can absorb the pressure when the database cache misses).

2: The database limits the number of connections; it cannot accept massive concurrent direct connections, so the pressure must be relieved outside the database.

3: The database cache is standalone.

4: Shipping data from the database to the application server takes longer than reading a local cache.

Some ideas before the automatic cache design:

1: At first I considered refining the cache policy down to rows or columns, so I studied the database's own caching and found that the database itself only caches at the table level.

2: MSSQL offers cache dependency through SqlDependency, which can notify you at the database level when your data expires.

3: However, SqlDependency is tightly coupled to SqlCommand and cannot be used with all databases.

4: SqlDependency's cache dependency can only drive a local cache.

5: Other databases do not support dependency notifications.

6: Therefore, the solution can only handle caching and invalidation policies through global interception and analysis of statements.

7: Standalone: during global interception, how do we analyze which tables a statement touches?

8: When the application is distributed: how do we expire the cache in a timely manner?

9: When users modify the database directly: how does the cache get invalidated?

There are still many questions I keep thinking about...

What is cached?

1: When caching a single object, the object is archived directly. On retrieval, the cache decides whether to Clone the object, depending on whether it is stored locally or remotely.

2: When caching a list, only fields of archivable types are kept (numbers, booleans, strings, datetimes, GUIDs), and the list is converted to a JSON string for archiving.

Technical details:

A: When an object is archived in a local cache, what is stored is a reference to the object (so later writes can corrupt the cached copy). When archived in a distributed cache, no reference is kept, so behavior differs between deployments, which creates uncertainty in use.

B: Archiving large objects requires serialization and deserialization on the way in and out of the cache, which costs a lot of performance.

Therefore, converting the list to JSON for archiving and restoring it on retrieval solves both problems A and B.

 

Simply put, if an object has a long-text field or binary data, it will not be cached; for the same reason, the MSSQL Timestamp field is not supported.

If you still want such a table cached, you can hide the offending fields with: AppConfig.DB.HiddenFields = "field names".
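The archiving rule above can be sketched roughly as follows. This is a minimal Python illustration, not the framework's actual C# code: only scalar, archivable field types survive, and the JSON round trip guarantees the caller gets an independent copy rather than a live reference.

```python
import json
from datetime import datetime
from uuid import UUID

# Field types the cache will archive (numbers, booleans, strings,
# datetimes, GUIDs, per the article); anything else is dropped.
ARCHIVABLE = (int, float, bool, str, datetime, UUID)

def archive_rows(rows):
    """Serialize a list of row dicts to a JSON string, keeping only
    archivable fields, so no live object reference enters the cache."""
    slim = []
    for row in rows:
        kept = {}
        for field, value in row.items():
            if not isinstance(value, ARCHIVABLE):
                continue  # e.g. binary/blob data is never cached
            if isinstance(value, (datetime, UUID)):
                value = str(value)  # store times and GUIDs as text
            kept[field] = value
        slim.append(kept)
    return json.dumps(slim)

def restore_rows(payload):
    """Rebuild independent row dicts from the JSON archive."""
    return json.loads(payload)
```

Because the cached value is a plain string, mutating a restored row can never corrupt the archived copy, which is exactly the point of trading problems A and B for one JSON conversion.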

 

Cache time?

Because traffic is usually low around noon and in the evening, cached objects are given expiry times spread randomly across those periods (objects cached in the morning expire around noon; objects cached in the afternoon expire at night).

When querying by page, users usually focus on the first few pages, so the data of those pages gets expiry times distributed over the low-traffic periods above.

Data from later pages is cached for 2 minutes by default.

Other rules remain to be discussed...
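A minimal sketch of this expiry rule, in Python. The exact window boundaries and the "first few pages" threshold are my assumptions; only the 2-minute default for later pages comes from the text.

```python
import random

# Assumed low-traffic windows, as seconds since midnight: the article only
# says "noon" and "evening", so these exact bounds are illustrative.
NOON_WINDOW = (12 * 3600, 13 * 3600)
NIGHT_WINDOW = (22 * 3600, 24 * 3600)
LATER_PAGE_TTL = 120          # later pages: 2 minutes (from the article)
FIRST_PAGES = 3               # assumed cut-off for "the first few pages"

def expiry_seconds(page_index, now):
    """Return a TTL in seconds for a page-query cache entry.

    Early pages expire at a random moment inside the next low-traffic
    window; later pages just get the short default."""
    if page_index >= FIRST_PAGES:
        return LATER_PAGE_TTL
    window = NOON_WINDOW if now < NOON_WINDOW[0] else NIGHT_WINDOW
    target = random.uniform(*window)
    return max(target - now, LATER_PAGE_TTL)
```

Randomizing the expiry moment inside a window spreads the cache-rebuild cost over the quiet period instead of letting a whole batch of entries expire at once.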

 

How large can the cache grow?

1: In standalone mode, once free memory drops below 15%, no new cache entries are accepted.

2: In distributed mode, for now everything is simply handed to the distributed cache.

Cache invalidation:

1: Request interception covers: (MAction) insert, delete, update, and query operations + (MProc) execution of custom statements + (MDataTable) batch methods.

2: Analysis of the tables a statement touches: a single table yields its name directly; a view yields the names of the tables joined inside it; a stored procedure is currently not supported; custom SQL yields the table names parsed from the statement; batch operations supply the table name directly.

3: Technical difficulty: how to accurately extract all associated tables from an arbitrary SQL statement or view.

4: Cache invalidation is triggered by insert, delete, update, ExeNonQuery, and batch statements.

5: Technical difficulties:

1: For views: given a table name, how do we invalidate the cached statements of the views that join that table with others?

2: For distributed applications: when server A updates, the cache on server B must be invalidated as well.

 

How to deal with frequently modified tables:

1: At first I wanted to add configuration so users could list tables that do not participate in caching. On reflection, whether a table is frequently modified can be judged from its cache invalidation times and counts.

2: On an insert, delete, or update, the table is marked invalid (its related cache entries are removed) and a time window (6 seconds) is started; during this window the table's data is not cached, and repeated cache-removal commands for it can be ignored.

3: What to do with tables identified as frequently modified? Extend the no-cache window, or...? This still needs more thought!!!
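Point 2 above can be sketched like this. A minimal Python illustration of the 6-second window (the function names are mine, not the framework's): a write suspends caching for its table, and a second write arriving inside an open window skips the redundant removal command.

```python
import time

NO_CACHE_WINDOW = 6.0  # seconds: the interval mentioned in point 2

_suspended_until = {}  # table name -> time until which caching is suspended

def on_table_modified(table, now=None):
    """Called on insert/delete/update. Suspends caching for the table and
    returns True when the cache-removal command still has to be issued
    (inside an already-open window the repeated removal can be skipped)."""
    now = time.time() if now is None else now
    already_in_window = _suspended_until.get(table, 0.0) > now
    _suspended_until[table] = now + NO_CACHE_WINDOW
    return not already_in_window

def may_cache(table, now=None):
    """Query results for the table may only be cached outside the window."""
    now = time.time() if now is None else now
    return _suspended_until.get(table, 0.0) <= now
```

Note that each write also slides the window forward, so a table under constant modification effectively never gets cached, which is the behavior the article is after.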

 

Can the cache invalidation granularity be made finer?

1: Currently, like the database, invalidation works per table.

2: An insert does not affect reading an existing row (so a single-row query should not be invalidated by insert operations).

3: Are there other cases that are guaranteed to be unaffected?

 

The framework has an automatic cache; does the business layer still need one?

1: The database caches automatically, and the framework can also cache automatically.

2: The framework has an automatic cache, and the business layer can still have its own cache.

 

The granularity the framework can handle is limited; it cannot be refined to the level of specific rows or columns. So once business complexity and concurrency reach a certain scale, a business-level cache is still necessary.

 

The database has a cache and the business layer can cache too, so why still add an automatic cache to the framework?

1: The database's default cache is fixed, needs configuration, and behaves inconsistently across database environments.

2: The database connection pool is fixed by default.

3: Business caching is usually added late, as an afterthought.

There is also the current situation:

1: There are many mid-level developers in the .NET community whose technical growth is relatively slow and who are unfamiliar with caching and performance tuning.

2: There are many small and medium-sized websites in China that by default cannot withstand concurrency; attacking them costs very little, and even a modest flood of concurrent requests can take them down.

Therefore, since the problems are real, there should be corresponding solutions.

This feature appears in the V5 framework to solve these problems at the foundational level.

Only when there are no slow websites in the .NET world, and the overall level and reputation improve, will more bosses be persuaded to adopt .NET. The spring of .NET is approaching...

 

V5 currently solves the following problems:

In general, the core of this feature is to solve the problems below.

Let me start with the technical points:

5: AOP interception problems:

First, to implement this feature we must intercept globally. Anyone who has read the source code or used V5 will have heard that the framework itself ships with AOP.

Second, we have to adapt this AOP: the framework carries an empty AOP by default, and when an external AOP is loaded, the empty one is replaced.

To implement the automatic cache, I first wanted to build it into the empty AOP, but that would be wasted work: once a custom AOP is loaded, the empty one is replaced and the cache logic would disappear with it...

I turned it over for three or four days, and three or four nights, before finally confirming the current model:

So I did this:

The original Aop class was renamed InterAop, no longer inherits the IAOP interface, and was changed from the original singleton to a multi-instance mode.

Here are a few lines of code. The idea: the external AOP interface is invoked inside the Begin and End methods, and the subsequent execution flow is decided by the external AOP's result:

Complete source code: https://github.com/cyq1162/cyqdata.git

    public AopResult Begin(AopEnum action)
    {
        AopResult ar = AopResult.Continue;
        if (outerAop != null)
        {
            ar = outerAop.Begin(action, Para);
            if (ar == AopResult.Return)
            {
                return ar;
            }
        }
        if (AppConfig.Cache.IsAutoCache && !IsTxtDataBase) // As long as the external Aop did not return directly
        {
            isHasCache = AutoCache.GetCache(action, Para); // Check whether a cache entry exists
        }
        if (isHasCache) // Cache found
        {
            if (outerAop == null || ar == AopResult.Default) // End does not need to run
            {
                return AopResult.Return;
            }
            return AopResult.Break; // The external Aop says End must still run
        }
        else // No cache: fall back to the default result
        {
            return ar;
        }
    }

    public void End(AopEnum action)
    {
        if (outerAop != null)
        {
            outerAop.End(action, Para);
        }
        if (!isHasCache && !IsTxtDataBase)
        {
            AutoCache.SetCache(action, Para); // Write the fresh result into the cache
        }
    }

The final code is not long, but it took me two days to think through.

1: Basic single-table and view operations

A: Single table. This is the simplest case: the table name is passed in.

B: View. Here, the view name is passed in.

So how do I obtain the names of the tables participating in a view? You probably don't know yet, so let me tell you:

    DbDataReader sdr = ...;
    DataTable dt = sdr.GetSchemaTable();

This one statement works for every database; there is no need to dig through the metadata tables of N different databases!!!

2: Multi-table SQL statement operations:

For SQL statements you could use the same approach: execute a DataReader and read the schema again. But instead I wrote a simple method to find the associated tables:

    internal static List<string> GetTableNamesFromSql(string sql)
    {
        List<string> nameList = new List<string>(); // Collects the raw table names
        string[] items = sql.Split(' ');
        if (items.Length == 1)
        {
            return nameList;
        }
        // if (items.Length > 3) // select * from xxx
        {
            bool isKeyword = false;
            foreach (string item in items)
            {
                if (!string.IsNullOrEmpty(item))
                {
                    string lowerItem = item.ToLower();
                    switch (lowerItem)
                    {
                        case "from":
                        case "update":
                        case "into":
                        case "join":
                        case "table":
                            isKeyword = true; // The next word should be a table name
                            break;
                        default:
                            if (isKeyword)
                            {
                                if (item[0] == '(' || item.IndexOf('.') > -1)
                                {
                                    isKeyword = false; // Subquery or qualified name: skip
                                }
                                else
                                {
                                    isKeyword = false;
                                    nameList.Add(NotKeyword(item));
                                }
                            }
                            break;
                    }
                }
            }
        }
        return nameList;
    }

It may find several candidate names; after extraction, they are filtered against the database's actual table list to drop anything that is not a real table.

3: Direct operations on the database

The initial idea was to dynamically create a table with two fields:

Table Name | Update Time

Then, when someone manually modifies the database, they would manually bump the time here, or a trigger would update it.

A background thread would then periodically scan this table to see whether any table had been updated.
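That unimplemented idea might look roughly like this. A Python sketch of the polling pass (the names are hypothetical); a background thread would call it every few seconds.

```python
# Hypothetical "version table": table name -> last manual update stamp.
# In the article's idea this would be a real database table, bumped by
# hand or by a trigger whenever someone edits the data directly.
version_table = {}

def scan_once(seen, invalidate):
    """One polling pass: call invalidate(name) for every table whose
    stamp changed since the last pass, and remember the new stamps."""
    changed = []
    for name, stamp in version_table.items():
        if seen.get(name) != stamp:
            seen[name] = stamp
            invalidate(name)
            changed.append(name)
    return changed
```

The first pass naturally reports every table (the `seen` map starts empty), which is harmless since it just clears caches once at startup.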

However, V5 does not currently implement this idea. For now it only exposes an interface that lets you remove the cache from your own code.

The method is:

    public abstract partial class CacheManage
    {
        /// <summary>
        /// Obtain a cache key used by the framework
        /// </summary>
        public static string GetKey(CacheKeyType ckt, string tableName)
        {
            return GetKey(ckt, tableName, AppConfig.DB.DefaultDataBase, AppConfig.DB.DefaultDalType);
        }

        /// <summary>
        /// Obtain a cache key used by the framework
        /// </summary>
        public static string GetKey(CacheKeyType ckt, string tableName, string dbName, DalType dalType)
        {
            switch (ckt)
            {
                case CacheKeyType.Schema:
                    return TableSchema.GetSchemaKey(tableName, dbName, dalType);
                case CacheKeyType.AutoCache:
                    return AutoCache.GetBaseKey(dalType, dbName, tableName);
            }
            return string.Empty;
        }
    }

4: Cross-server operations

This was originally simple; I later had to rework it for performance, because cache entries may need to be removed frequently.

A cache-type flag was then added to distinguish the local cache from the distributed cache, so the two code paths differ:

    private static void SetBaseKey(string baseKey, string key)
    {
        // baseKey represents a table; views and custom statements are excluded
        if (_MemCache.CacheType == CacheType.LocalCache)
        {
            if (cacheKeys.ContainsKey(baseKey))
            {
                cacheKeys[baseKey] = cacheKeys[baseKey].Append("," + key);
            }
            else
            {
                cacheKeys.Add(baseKey, new StringBuilder(key));
            }
        }
        else
        {
            StringBuilder sb = _MemCache.Get<StringBuilder>(baseKey);
            if (sb == null)
            {
                _MemCache.Set(baseKey, new StringBuilder(key));
            }
            else
            {
                sb.Append("," + key);
                _MemCache.Set(baseKey, sb);
            }
        }
    }

6: Cache invalidation

The process itself is simple.

However, with a large number of cache entries in a distributed deployment, removing them synchronously would stall the response, so the cache-deletion operation was moved onto a thread.

Later, to avoid spawning multiple threads, the class was changed to a singleton (it started out multi-instance).

Now the thread has been merged into LocalCache to share the worker thread there, and the singleton class has become a static class.

 

How to use this feature in the V5 framework:

Just upgrade to the latest version!

 

Summary:

1: Even without this feature, the framework already addresses three major problems: unified programming architecture (automation), database pressure (read/write splitting), and server pressure (distributed cache).

2: This feature aims to raise the overall level of industry projects from the foundation up.

3: Thinking through the architecture, implementing the framework code, writing articles to share, writing framework demos, answering questions in the group...

Today I came up with another idea: automatic cascading deletion (recursively following all associated foreign keys from a given primary key).

4: Open source makes no money; having poured so much energy into it, it can only be called an ideal. I hope it will one day become the standard data layer for .NET projects.

5: There is a thumbs-up plug-in on my blog.

PS: A relative of mine (a college student) in Guangzhou is looking for a summer job. If you can help, feel free to contact me on QQ.
