More than 10 times faster MySQL write speed of PentahoKettle

Source: Internet
Author: User
PentahoKettle is used to capture data and store it into the database. The speed is about 7500 seconds when text is used for commissioning, but after being replaced with a database, the speed is only 150 seconds, it takes more than 20 minutes to import about 0.2 million of data into the database, which is unacceptable. Batch insert does not seem to have any effect, but it is still slow for Google to find

Pentaho Kettle is used to capture data and store it into the database. In the general example, the speed of text debugging is about 7500 records/second, but after changing to the database, the speed is only 150 records per second. It takes more than 20 minutes to import about 0.2 million of data to the database, which is unacceptable. Batch insert does not seem to have any effect, but it is still slow for Google to find

Pentaho Kettle is used to perform a data capture and warehouse receiving program, which is roughly like this:



During text commissioning, the speed is about 7500 records/second, but after the database is changed, the speed is only 150 records/second, and about 0.2 million of data is stored in the database for more than 20 minutes, this is unacceptable.

Batch insert does not seem to have any effect, but it is still slow.



For help with Google, I found two references:

Reference 1: http://julienhofstede.blogspot.nl/2014/02/increase-mysql-output-to-80k-rowssecond.html

Reference 2: http://forums.pentaho.com/showthread.php? 142217-Table-Output-Performance-MySQL #9

The description is basically the same. The connection parameters are adjusted according to the reference:

Java code

  1. Useserverprepsponts = false
  2. RewriteBatchedStatements = true
  3. UseCompression = true

useServerPrepStmts=falserewriteBatchedStatements=trueuseCompression=true

The data write speed is immediately increased to about 2300 records per second. With three threads enabled, the write speed is similar to that of text writing.

It takes a long time to write code, and Kettle is also used for a while. Generally, it only pays attention to how Tranaction implements the function. It can be said that connection parameters are basically not concerned, however, such a speed improvement is in front of you. If you are not familiar with these performance parameters, it is really not easy to Improve the Performance. If you want to write it out, you also hope to have a reference when you encounter the same problem, because Baidu, ^ _ ^ is not found at all. If you want to write Chinese characters, you want to include them.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.