Kettle data synchronization speed tuning record

Source: Internet
Author: User

MySQL to Vertica

1. In the MySQL openshop database, select a table with roughly 3 million (300W) rows of data:

CREATE TABLE ip_records_tmp_01
AS
SELECT * FROM ip_records_tmp t
WHERE t.`datetime` <= '2015-05-01';

2. Create the table ip_records_tmp_01 in Vertica; note that the field types differ slightly from MySQL.

Full extraction of 2,478,130 rows took 30s. Good speed!

3. In MySQL, insert 972,948 rows, delete 462,151 rows, and update 273,427 rows:

Insert:

INSERT INTO ip_records_tmp_01
SELECT * FROM ip_records_tmp t
WHERE t.`datetime` > '2015-05-01';

Delete:

DELETE FROM ip_records_tmp_01
WHERE `datetime` <= '2015-05-01' AND `datetime` > '2015-04-01';

Update:

UPDATE ip_records_tmp_01
SET remark1 = 'Inman', remark2 = 'Leftlife'
WHERE `datetime` <= '2015-04-01' AND `datetime` > '2015-03-01';

Incremental extraction of these changes took 2h 3mn 33s. That speed is unacceptable.

Tuning attempt 1: incremental extraction by timestamp

Extracting data incrementally by timestamp is the preferred approach when the data supports it, but a timestamp column cannot always be added or relied on, so other methods still have to be found.
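The idea can be sketched with a watermark query against the source table. This is an illustrative sketch only: `${LAST_SYNC}` is a hypothetical Kettle variable holding the last successful sync time, not something defined in this article; the `datetime` column comes from the examples above.

```sql
-- Hypothetical timestamp-based incremental pull (MySQL side).
-- ${LAST_SYNC} would be supplied by the job, e.g. read from a control table
-- that records the high-water mark of the previous successful run.
SELECT *
FROM ip_records_tmp t
WHERE t.`datetime` > '${LAST_SYNC}';
```

After a successful load, the job would advance the stored watermark to the maximum `datetime` it just extracted.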

Tuning attempt 2: build an index

For comparison, running from MySQL to SQL Server with the same tables and data volumes:

Full extraction took 42s.

After indexing the SQL Server tables, incremental extraction took 10mn 50s, which is acceptable.

So an analogous index-like projection was created in Vertica, but the result was not ideal: no significant speedup was observed, so this tuning attempt failed.
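Vertica has no conventional indexes; the closest equivalent is a projection sorted on the lookup column. A minimal sketch of what was presumably tried, assuming the `datetime` column is the incremental key (the projection name is illustrative):

```sql
-- Sketch: a projection sorted on "datetime" to speed up range filters,
-- roughly analogous to a B-tree index on that column in a row store.
CREATE PROJECTION ip_records_tmp_01_by_dt
AS SELECT * FROM ip_records_tmp_01
ORDER BY "datetime";

-- New projections must be refreshed before the optimizer can use them.
SELECT START_REFRESH();
```

As the article reports, this did not help much here: the bottleneck was row-by-row delete/update traffic from Kettle, not lookup speed.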

Tuning attempt 3: temp table + run script

Full extraction took 32s.

Incremental extraction took 1mn 48s, a dramatic improvement.
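The article does not spell out the script, but the usual shape of this technique is: bulk-load the changed rows into a staging table, then apply them set-wise inside Vertica instead of issuing per-row deletes and updates from Kettle. A sketch under that assumption (the staging table name and the `id` key column are hypothetical):

```sql
-- Sketch of the temp-table approach. All names here are illustrative.
-- 1. Create an empty staging table with the same structure.
CREATE TABLE ip_records_stage AS
SELECT * FROM ip_records_tmp_01 WHERE false;

-- 2. (Kettle's Table output step bulk-loads the increment into
--    ip_records_stage here.)

-- 3. Apply the increment in two set-based statements:
DELETE FROM ip_records_tmp_01
WHERE id IN (SELECT id FROM ip_records_stage);

INSERT INTO ip_records_tmp_01
SELECT * FROM ip_records_stage;

-- 4. Clean up.
DROP TABLE ip_records_stage;
```

Two set-based statements replace millions of individual row operations, which is why the run time drops from hours to under two minutes.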

Fixes for some errors

Kettle memory overflow (java.lang.OutOfMemoryError)

Workaround: increase the JVM memory Kettle runs with by editing the Spoon.bat file:

if "%PENTAHO_DI_JAVA_OPTIONS%"=="" set PENTAHO_DI_JAVA_OPTIONS="-Xmx4096m" "-XX:MaxPermSize=4096m"

Without this adjustment, Kettle crashes when running large data volumes.

[Vertica][VJDBC](5156) ERROR: Unavailable: initiator locks for query - Locking failure: Timed out X locking Table. Your current transaction isolation level is READ COMMITTED

Workaround: the delete and the insert/update transactions were deadlocking on table locks, so reduce the commit size of the update step from 10,000 to 1,000 and of the delete step from 1,000 to 100.

