EF Architecture ~ A Paging Extension Method for Big Data Processing
Recently we ran into a big-data problem: it is not practical to process tens of millions of records in one go, so the data has to be processed in blocks, that is, page by page. I wrote about a similar approach in an earlier article in this EF architecture series, where it was used in BulkInsert for batch insertion of large data sets. I have now pulled that logic out and put it in the IQueryableExtensions class, so it is available as an extension for IQueryable. This lets us apply the paging logic much more widely, and the class also provides an asynchronous, parallel version which, in my tests, is dozens of times faster than the synchronous version. You could say that a modern server only shows what it can really do once parallel computation is used!
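using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Members of the static IQueryableExtensions class.

/// <summary>
/// Processes data page by page on a thread-pool thread (via Task.Run)
/// to improve system utilization and performance
/// </summary>
/// <typeparam name="T">Entity type</typeparam>
/// <param name="item">Query whose results are processed page by page</param>
/// <param name="method">Callback invoked once for each page</param>
public static async Task DataPageProcessAsync<T>(
    IQueryable<T> item,
    Action<IEnumerable<T>> method) where T : class
{
    await Task.Run(() =>
    {
        DataPageProcess<T>(item, method);
    });
}

/// <summary>
/// Processes data page by page on the calling thread
/// </summary>
/// <typeparam name="T">Entity type</typeparam>
/// <param name="item">Query whose results are processed page by page</param>
/// <param name="method">Callback invoked once for each page</param>
public static void DataPageProcess<T>(
    IQueryable<T> item,
    Action<IEnumerable<T>> method) where T : class
{
    // Count once: every Count() call on an IQueryable issues its own query.
    var DataTotalCount = item != null ? item.Count() : 0;
    if (DataTotalCount > 0)
    {
        var DataPageSize = 100;
        var DataTotalPages = DataTotalCount / DataPageSize;
        if (DataTotalCount % DataPageSize > 0)
            DataTotalPages += 1; // the last page may be only partly filled

        for (int pageIndex = 1; pageIndex <= DataTotalPages; pageIndex++)
        {
            // With Entity Framework, item should already be ordered (OrderBy)
            // so that Skip/Take return stable, non-overlapping pages.
            var currentItems = item.Skip((pageIndex - 1) * DataPageSize)
                                   .Take(DataPageSize)
                                   .ToList();
            method(currentItems);
        }
    }
}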
In short, with the methods above we only have to pass in an IQueryable result set and the method that should handle each page, and the data is split into pages for us, which is extremely convenient!
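To make the paging behaviour concrete, here is a minimal, self-contained sketch that runs the same loop over an in-memory IQueryable instead of a database. The class PagingSketch, the helper PageSizesFor and the pageSize parameter are illustrative assumptions of mine, not part of the original IQueryableExtensions class (which hard-codes a page size of 100).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PagingSketch
{
    // Same paging loop as DataPageProcess, repeated here so the sketch compiles on its own.
    // pageSize is a hypothetical parameter; the original uses a fixed size of 100.
    public static void DataPageProcess<T>(
        IQueryable<T> item,
        Action<IEnumerable<T>> method,
        int pageSize = 100) where T : class
    {
        if (item == null) return;
        int total = item.Count();
        int pages = (total + pageSize - 1) / pageSize; // round up
        for (int pageIndex = 1; pageIndex <= pages; pageIndex++)
        {
            method(item.Skip((pageIndex - 1) * pageSize).Take(pageSize).ToList());
        }
    }

    // Hypothetical helper: builds rowCount in-memory rows and records
    // how many rows the callback received for each page.
    public static List<int> PageSizesFor(int rowCount)
    {
        var rows = Enumerable.Range(1, rowCount).Select(i => "row" + i).AsQueryable();
        var sizes = new List<int>();
        DataPageProcess(rows.OrderBy(r => r), page => sizes.Add(page.Count()));
        return sizes;
    }

    public static void Main()
    {
        // 250 rows are delivered as two full pages and one partial page.
        Console.WriteLine(string.Join(",", PageSizesFor(250))); // prints 100,100,50
    }
}
```

Against a real EF query the call looks the same; only the source of the IQueryable changes.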
The following code is taken from a FastSocket project, where it is used to transmit large volumes of data:
#region paging data transmission
DataPageProcessAsync(model, (list) =>
{
    client.Send("DSSInsert", 1, 1,
        item.Name, // VersionHelper.GetNumber(ProjectID.NewLearningBar)
        SerializeMemoryHelper.SerializeToBinary(list),
        res => res.Buffer).ContinueWith(c =>
    {
        if (c.IsFaulted)
        {
            throw c.Exception;
        }
        Console.WriteLine(BitConverter.ToBoolean(c.Result, 0));
    });
});
#endregion
I tried both the synchronous method DataPageProcess and the asynchronous, parallel method DataPageProcessAsync, and in my tests the latter was at least dozens of times faster than the former. Of course, the result depends on your CPU: the more threads it can run at the same time, the higher the upper limit!
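Note that DataPageProcessAsync as listed moves the whole paging loop onto one thread-pool thread, so the pages themselves are still handled one after another. If the per-page callbacks should also run concurrently, one possible variant is sketched below. It is a hypothetical illustration, not the author's implementation: the names ParallelPagingSketch, DataPageProcessParallelAsync and CountProcessedRows and the pageSize parameter are my own assumptions. Pages are still read sequentially because an EF DbContext is not thread-safe; only the callbacks are dispatched to the thread pool.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelPagingSketch
{
    // Hypothetical variant: read pages one at a time on the calling thread,
    // run each page's callback on the thread pool, then wait for all of them.
    public static async Task DataPageProcessParallelAsync<T>(
        IQueryable<T> item,
        Action<IEnumerable<T>> method,
        int pageSize = 100) where T : class
    {
        if (item == null) return;
        int total = item.Count();
        int pages = (total + pageSize - 1) / pageSize; // round up
        var running = new List<Task>();
        for (int pageIndex = 0; pageIndex < pages; pageIndex++)
        {
            // Materialize the page before handing it to another thread.
            List<T> page = item.Skip(pageIndex * pageSize).Take(pageSize).ToList();
            running.Add(Task.Run(() => method(page)));
        }
        await Task.WhenAll(running);
    }

    // Hypothetical helper: processes rowCount in-memory rows and returns
    // the total number of rows seen by the callbacks.
    public static int CountProcessedRows(int rowCount)
    {
        var rows = Enumerable.Range(1, rowCount).Select(i => "row" + i).AsQueryable();
        int processed = 0;
        // The callback runs on several threads, so the counter is updated atomically.
        DataPageProcessParallelAsync(rows, page => Interlocked.Add(ref processed, page.Count()))
            .GetAwaiter().GetResult();
        return processed;
    }

    public static void Main()
    {
        Console.WriteLine(CountProcessedRows(1050)); // prints 1050
    }
}
```

This sketch loads every page into memory without throttling, so for very large result sets a real implementation would probably cap the number of pages in flight. The callback must also be safe to call from several threads at once.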