MSYQL to Vertica
1, MySQL in the Openshop database to select one of the data about 300W table
CREATE TABLE Ip_records_tmp_01
As
SELECT * from Ip_records_tmp t
WHERE t.datetime<= ' 2015-05-01 '
2, Vertica CREATE table ip_records_tmp_01, note that the field type and MySQL a little different.
The total amount of 2,478,130 data extraction, time consuming 30s, good speed!
3, add 972,948 data in MySQL, delete 462,151 data, update 273,427 data
New:
INSERT INTO ip_records_tmp_01
SELECT * FROM Ip_records_tmp t
where T. ' datetime ' > ' 2015-05-01 ';
Delete:
Delete from ip_records_tmp_01
where ' datetime ' <= ' 2015-05-01 ' and ' datetime ' > ' 2015-04-01 ';
Update:
Update ip_records_tmp_01
Set remark1 = ' Inman ', Remark2 = ' Leftlife '
where
' DateTime ' <= ' 2015-04-01 ' and ' datetime ' > ' 2015-03-01 '
Incremental extraction, time consuming 2h 3mn 33s, speed unacceptable.
Debug 1, using timestamp
The time stamp is used to extract data incrementally, which is preferred for the data allowed, but does not preclude the time stamp from being implemented, so there are other ways to find it.
Debugging 2, building an index
From MySQL to SQL Server data tables and data volumes are consistent:
Full-scale extraction, time consuming 42S
Indexing SQL Server tables, incrementally extracting the data, takes 10mn 50s, and the speed is also possible.
So in the Vertica in the establishment of similar index projection, the effect is not ideal, the speed has not seen a significant increase, so no, this tuning is not.
Debug 3, Temp table + Run script
Full-scale extraction, time consuming 32s
Incremental extraction, time consuming 1mn 48s, the speed has greatly increased.
Some error resolution methods kettle Memory overflow out of
Workaround: Adjust kettle Call Java's JVM memory size, modify the Spoon.bat file
If "%pentaho_di_java_options%" = = "" Set pentaho_di_java_options= "-xmx4096m" "-xx:maxpermsize=4096m"
If you do not adjust the memory, the kettle will crash if the memory is resized and the large data volume is run.
[Vertica] [VJDBC] (5156): READ COMMITTED
[Vertica] [VJDBC] (5156) ERROR:Unavailable:initiator locks for query-locking failure:timed out X Locking Table. Your current transaction ISOLATION level is READ COMMITTED
Workaround: The number of commit records in the update control is reduced from 10000 to 1000 due to deletion and insert/update transaction interlocks, and the number of commit records in the delete control is reduced from 1000 to 100.
Kettle data synchronization Speed tuning record