[Knowledge notes] Data import tools

Source: Internet
Author: User

I used to be responsible for database-related work, and I often had to write special-purpose tools that read raw data, apply some processing logic, and import the result into a target database. These tools had to finish the import as fast as possible without affecting the running business.

1. There is usually a single data source and several target databases, and reading from a database is much faster than writing to one. Pay attention to the order in which data is read, so that all target databases are written to at the same time rather than one after another.

2. Split the processing into several stages that run in parallel, with a buffer between them. For every stage of the pipeline to stay busy, the buffer between each pair of adjacent stages must be large enough.

3. Use multiple threads with synchronous I/O. This is easy to control: you can cap the number of connections importing into and exporting from a database at the same time, and tune that number to the database's current state and the required speed.
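
The three points above can be sketched roughly as follows. This is only an illustration, not the author's tool: the name ImportPipelineSketch, the Run method and its parameters are hypothetical, the target databases are simulated by counters, and the buffers use the standard BlockingCollection<T> class instead of the queue shown later in this post.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

// Hypothetical sketch: one reader stage feeding several simulated targets.
// A real tool would open one database connection per writer thread.
public static class ImportPipelineSketch
{
    public static Dictionary<string, int> Run(int rowCount, string[] targets,
                                              int writersPerTarget, int bufferSize)
    {
        // One bounded buffer per target: Add blocks when the buffer is full,
        // so a slow target applies back-pressure to the reader.
        var buffers = targets.ToDictionary(
            t => t, t => new BlockingCollection<int>(bufferSize));
        var written = new ConcurrentDictionary<string, int>();
        var writers = new List<Thread>();

        foreach (var target in targets)
        {
            // writersPerTarget caps the concurrent "connections" per target.
            for (int i = 0; i < writersPerTarget; i++)
            {
                var buffer = buffers[target];
                var name = target;
                var thread = new Thread(() =>
                {
                    // Synchronous "write" loop; it ends once the buffer is
                    // marked complete and has been drained.
                    foreach (var row in buffer.GetConsumingEnumerable())
                    {
                        written.AddOrUpdate(name, 1, (_, n) => n + 1);
                    }
                });
                thread.IsBackground = true;
                thread.Start();
                writers.Add(thread);
            }
        }

        // Reader stage: rows are handed out round-robin so that every target
        // receives work at the same time.
        for (int row = 0; row < rowCount; row++)
        {
            buffers[targets[row % targets.Length]].Add(row);
        }

        foreach (var buffer in buffers.Values) buffer.CompleteAdding();
        foreach (var thread in writers) thread.Join();
        return new Dictionary<string, int>(written);
    }
}
```

Calling ImportPipelineSketch.Run(9, new[] { "a", "b", "c" }, 2, 4) distributes nine rows evenly, three per target. Raising bufferSize lets the reader run further ahead of slow writers, and writersPerTarget is the knob that limits load on each database.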

Use a producer-consumer queue to control the buffer size:

    using System.Collections.Generic;
    using System.Linq;
    using System.Threading;

    // BlockingQueuePerfCounters is the author's own performance-counter wrapper.
    // Its definition and the initialization of _counters are not shown in the source.
    public class BlockingQueue<T>
    {
        private Queue<T> _queue;
        private object _sync = new object();
        private int _capacity;
        private BlockingQueuePerfCounters _counters;

        public BlockingQueue(string queueName, int capacity)
        {
            _queue = new Queue<T>();
            _capacity = capacity;
        }

        public void SetCapacity(int capacity) { _capacity = capacity; }

        public int Capacity { get { return _capacity; } }

        public int Count { get { lock (_sync) { return _queue.Count; } } }

        // Polls until the queue is no longer over capacity, then adds the item.
        // The limit is soft: the check and the insert are not one atomic step.
        public void Enqueue(T item)
        {
            while (this.Count > this.Capacity) { Thread.Sleep(1000); }
            lock (_sync) { _queue.Enqueue(item); }
            _counters.EnqueuePerSecond.Increment();
            _counters.QueueLength.Increment();
            _counters.EnqueueTotal.Increment();
        }

        public void Enqueue(IEnumerable<T> list)
        {
            while (this.Count > this.Capacity) { Thread.Sleep(1); }
            int added = 0; // count while inserting, so the sequence is enumerated only once
            lock (_sync)
            {
                foreach (T item in list) { _queue.Enqueue(item); added++; }
            }
            _counters.EnqueuePerSecond.IncrementBy(added);
            _counters.QueueLength.IncrementBy(added);
            _counters.EnqueueTotal.IncrementBy(added);
        }

        // Throws InvalidOperationException if the queue is empty.
        public T Dequeue()
        {
            T val;
            lock (_sync) { val = _queue.Dequeue(); }
            _counters.DequeueTotal.Increment();
            _counters.DequeuePerSecond.Increment();
            _counters.QueueLength.Decrement();
            return val;
        }

        // Returns up to count items without blocking; the list may be empty.
        public List<T> Dequeue(int count)
        {
            List<T> list = new List<T>();
            lock (_sync)
            {
                while (_queue.Count > 0 && list.Count < count) { list.Add(_queue.Dequeue()); }
            }
            _counters.DequeueTotal.IncrementBy(list.Count);
            _counters.DequeuePerSecond.IncrementBy(list.Count);
            _counters.QueueLength.IncrementBy(-list.Count);
            return list;
        }

        public List<T> ToList()
        {
            lock (_sync) { return _queue.ToList(); }
        }

        public void Clear()
        {
            int count = 0;
            lock (_sync) { count = _queue.Count; _queue.Clear(); }
            _counters.DequeueTotal.IncrementBy(count);
            _counters.DequeuePerSecond.IncrementBy(count);
            _counters.QueueLength.IncrementBy(-count);
        }
    }
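
The queue above makes producers wait by polling with Thread.Sleep. As a possible variation (not the author's code), the same bounded-buffer idea can block on a monitor instead, so waiting threads wake as soon as space or data becomes available. The class name BoundedQueueSketch and its members are hypothetical, and the performance counters are left out.

```csharp
using System.Collections.Generic;
using System.Threading;

// Hypothetical variant of a bounded producer-consumer queue that waits on a
// monitor instead of polling with Sleep.
public class BoundedQueueSketch<T>
{
    private readonly Queue<T> _queue = new Queue<T>();
    private readonly object _sync = new object();
    private readonly int _capacity;

    public BoundedQueueSketch(int capacity) { _capacity = capacity; }

    public int Count { get { lock (_sync) { return _queue.Count; } } }

    // Blocks while the queue is full.
    public void Enqueue(T item)
    {
        lock (_sync)
        {
            while (_queue.Count >= _capacity) Monitor.Wait(_sync);
            _queue.Enqueue(item);
            Monitor.PulseAll(_sync); // wake consumers waiting for data
        }
    }

    // Blocks while the queue is empty.
    public T Dequeue()
    {
        lock (_sync)
        {
            while (_queue.Count == 0) Monitor.Wait(_sync);
            T item = _queue.Dequeue();
            Monitor.PulseAll(_sync); // wake producers waiting for space
            return item;
        }
    }

    // Non-blocking batch read: returns up to max items, possibly none.
    public List<T> DequeueBatch(int max)
    {
        var batch = new List<T>();
        lock (_sync)
        {
            while (_queue.Count > 0 && batch.Count < max) batch.Add(_queue.Dequeue());
            if (batch.Count > 0) Monitor.PulseAll(_sync);
        }
        return batch;
    }
}
```

Because the capacity check and the insert happen under the same lock, the limit here is strict: a producer calling Enqueue on a full queue stays blocked until a consumer removes an item.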

Multithreading management:

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Threading;

    // ITracing and TracingManager are the author's own logging types (not shown in the source).
    public class MultiThread<T>
    {
        private ITracing _tracing = TracingManager.GetTracing(typeof(MultiThread<T>));
        private BlockingQueue<T> _queue;
        private Thread[] _threads;
        private int _realQueueLength; // items enqueued but not yet fully processed

        // Assign one of these handlers before enqueuing any data.
        public Action<T> ProcessData;
        public Action<List<T>> ProcessDataBatch;

        public MultiThread(int threadCount, int queueCapacity, string threadName)
        {
            _queue = new BlockingQueue<T>(threadName, queueCapacity);
            _threads = new Thread[threadCount];
            for (int i = 0; i < threadCount; i++)
            {
                _threads[i] = new Thread(Proc);
                _threads[i].IsBackground = true;
                _threads[i].Name = string.Format("{0}_{1}", threadName, i);
                _threads[i].Start();
            }
        }

        // Thread.Abort works only on .NET Framework; on .NET Core and later
        // it throws PlatformNotSupportedException.
        public void Close()
        {
            foreach (Thread th in _threads) th.Abort();
        }

        public int QueueLength { get { return _queue.Count; } }

        public void SetCapacity(int capacity) { _queue.SetCapacity(capacity); }

        // Blocks until every enqueued item has been processed.
        public void WaitForProcessAll()
        {
            while (Volatile.Read(ref _realQueueLength) > 0) { Thread.Sleep(1); }
        }

        // The counter is raised before the items become visible to the workers,
        // so it cannot drop to zero while work is still pending.
        public void Enqueue(IEnumerable<T> list)
        {
            List<T> items = list.ToList();
            Interlocked.Add(ref _realQueueLength, items.Count);
            _queue.Enqueue(items);
        }

        public void Enqueue(T item)
        {
            Interlocked.Increment(ref _realQueueLength);
            _queue.Enqueue(item);
        }

        // Worker loop: drains the queue in batches of up to 100 items.
        public void Proc()
        {
            try
            {
                while (true)
                {
                    while (this.QueueLength > 0) { ProcessDataList(_queue.Dequeue(100)); }
                    Thread.Sleep(1);
                }
            }
            catch (ThreadAbortException) { Thread.ResetAbort(); return; }
            catch (Exception ex) { _tracing.ErrorFmt(ex, "Proc error"); }
        }

        private void ProcessDataList(List<T> list)
        {
            if (list == null || list.Count == 0) return;
            if (ProcessDataBatch != null)
            {
                try { ProcessDataBatch(list); }
                catch (Exception ex) { _tracing.Error(ex, "ProcessDataList error"); }
                finally { Interlocked.Add(ref _realQueueLength, -1 * list.Count); }
            }
            else if (ProcessData != null)
            {
                foreach (T item in list)
                {
                    try { ProcessData(item); }
                    catch (Exception ex) { _tracing.Error(ex, "ProcessDataList error"); }
                    finally { Interlocked.Decrement(ref _realQueueLength); }
                }
            }
        }
    }
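
The class above stops its workers with Thread.Abort, which is not available on .NET Core and later. One possible alternative, shown here only as a sketch, is to let the workers exit on their own once the queue is marked complete. The name WorkerPoolSketch and its members are hypothetical, and the standard BlockingCollection<T> class stands in for the author's queue.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

// Hypothetical sketch: a batch worker pool that shuts down cooperatively.
public class WorkerPoolSketch<T>
{
    private readonly BlockingCollection<T> _queue;
    private readonly Thread[] _threads;
    private readonly Action<List<T>> _processBatch;
    private readonly int _batchSize;

    public WorkerPoolSketch(int threadCount, int queueCapacity, int batchSize,
                            Action<List<T>> processBatch)
    {
        _queue = new BlockingCollection<T>(queueCapacity);
        _processBatch = processBatch;
        _batchSize = batchSize;
        _threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++)
        {
            _threads[i] = new Thread(Proc) { IsBackground = true, Name = "worker_" + i };
            _threads[i].Start();
        }
    }

    // Blocks while the queue is at capacity.
    public void Enqueue(T item) { _queue.Add(item); }

    // Signals that no more items will arrive, then waits until the workers
    // have drained the queue and exited.
    public void CompleteAndWait()
    {
        _queue.CompleteAdding();
        foreach (var thread in _threads) thread.Join();
    }

    private void Proc()
    {
        T item;
        // TryTake with an infinite timeout waits for an item and returns
        // false once the queue is marked complete and empty.
        while (_queue.TryTake(out item, Timeout.Infinite))
        {
            var batch = new List<T>(_batchSize) { item };
            // Top up the batch with whatever is immediately available.
            while (batch.Count < _batchSize && _queue.TryTake(out item)) batch.Add(item);
            try { _processBatch(batch); }
            catch (Exception ex) { Console.Error.WriteLine("Batch failed: " + ex.Message); }
        }
    }
}
```

With this shape, the caller enqueues everything and then calls CompleteAndWait, which covers both the "wait for all items" and the "close" steps without aborting any thread.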
