In the previous article, on tuning the -Xms/-Xmx/-Xss settings in Kettle, I had already tuned Kettle once. This time I optimized it again, and the effect is very obvious. The optimization has two parts: first, add -Xmn to the JVM options; second, change the log output level.
The book Java™ Performance contains this passage:
-Xmn is convenient to size both the initial and maximum size of the young generation space. It is important to note that if -Xms and -Xmx are not set to the same value and -Xmn is used, a growth or contraction in the Java heap size will not adjust the size of the young generation space. The size of the young generation space will remain constant with any growth or contraction of the Java heap size. Therefore, -Xmn should be used only when -Xms and -Xmx are set to the same value.
OK, here are my changes:
# ******************************************************************
# ** Set java runtime options                                     **
# ** Change 512m to higher values in case you run out of memory   **
# ** or set the PENTAHO_DI_JAVA_OPTIONS environment variable      **
# ** (JAVAMAXMEM is there for compatibility reasons)              **
# ******************************************************************

if [ -z "$JAVAMAXMEM" ]; then
  JAVAMAXMEM="16384"
fi

if [ -z "$PENTAHO_DI_JAVA_OPTIONS" ]; then
  PENTAHO_DI_JAVA_OPTIONS="-Xms${JAVAMAXMEM}m -Xmx${JAVAMAXMEM}m -Xmn6144m -Xss1024m"
fi
-Xmx is set to 1/4 of physical memory, and -Xmn to 3/8 of -Xmx.
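The sizing rule can be sketched as a small shell helper. This is only a rough sketch: the function names `heap_mb` and `young_mb` are hypothetical and not part of Kettle, and the 64 GB machine is an assumption chosen because it reproduces the 16384 and 6144 values used in the script.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Kettle) for the sizing rule:
# heap (-Xmx) = 1/4 of physical memory, young generation (-Xmn) = 3/8 of the heap.

# heap_mb PHYSICAL_MB -> heap size in MB
heap_mb() {
  echo $(( $1 / 4 ))
}

# young_mb HEAP_MB -> young generation size in MB
young_mb() {
  echo $(( $1 * 3 / 8 ))
}

# Assumed example machine with 64 GB (65536 MB) of RAM.
PHYSICAL_MB=65536
HEAP_MB=$(heap_mb "$PHYSICAL_MB")
YOUNG_MB=$(young_mb "$HEAP_MB")

# -Xms equals -Xmx so that a fixed -Xmn stays appropriate,
# as the quoted passage advises.
JVM_OPTS="-Xms${HEAP_MB}m -Xmx${HEAP_MB}m -Xmn${YOUNG_MB}m"
echo "$JVM_OPTS"
```

The resulting string could be exported as PENTAHO_DI_JAVA_OPTIONS instead of editing the script, since the script only applies its defaults when that variable is empty.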
When calling a *.job file with kitchen.sh, add the following option to the command:
-level:error
By default, Kettle logs at the Basic level. When a job accesses more than a hundred thousand databases, the Basic log output reaches 500 to 600 MB, which seriously hurts execution efficiency. So I changed the Kettle log level to Error, which outputs only the error log.
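A call with the reduced log level might look like the sketch below. The paths are placeholders, the helper name `build_kitchen_cmd` is hypothetical, and the `-file=` form is assumed to match your Kitchen version; `-level:error` is the option quoted in this article.

```shell
#!/bin/sh
# Sketch of a wrapper that runs a job with error-only logging.
# KITCHEN and JOB_FILE are placeholder paths, not taken from the article.
KITCHEN="${KITCHEN:-./kitchen.sh}"
JOB_FILE="${JOB_FILE:-/path/to/example.job}"

# build_kitchen_cmd KITCHEN_PATH JOB_PATH -> full command line as a string
# (hypothetical helper; it only assembles the text of the command)
build_kitchen_cmd() {
  echo "$1 -file=$2 -level:error"
}

CMD=$(build_kitchen_cmd "$KITCHEN" "$JOB_FILE")

# Print the command instead of running it, so the sketch is safe to try.
echo "$CMD"

# To actually run the job, uncomment the next line:
# $CMD
```

Printing the command first makes it easy to check the log level before starting a long run.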
After these optimizations, traversing more than a hundred thousand databases to collect statistics took only 2 hours and 55 minutes. That is pleasantly fast, and with good enough hardware it could improve a lot more!
That is how to make Kettle execute faster.