Background: One of our business tables currently holds roughly 300 million rows. Querying it directly in the database takes more than 15 minutes; users who want to look at the data can only run the SQL directly against the database, drink a few cups of tea, and still have no results. Having seen the ES cluster used in our project, the users asked us to synchronize the data from the database into the ES cluster.
Software versions: logstash-2.2.2, elasticsearch-2.2.1.
1. Install the logstash-input-jdbc plugin
./bin/plugin install logstash-input-jdbc-3.0.0.gem
2. Write the configuration file
    input {
      jdbc {
        jdbc_driver_library => "/software/logstash-2.2.2/jdbc/ojdbc6.jar"
        jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
        jdbc_connection_string => "jdbc:oracle:thin:@192.168.118.115:1521:ORCL"
        jdbc_user => "system"
        jdbc_password => "Manager123"
        jdbc_paging_enabled => true
        jdbc_page_size => 20
        # jdbc_fetch_size value was lost in the original text
        schedule => "0/5 * * * * *"
        statement => "select id,user_name,terminal_id from operation_log where id >= :sql_last_value and id < :sql_last_value + 1000000 order by id"
        clean_run => false
        record_last_run => true
        use_column_value => true
        tracking_column => "id"
        last_run_metadata_path => "/software/logstash-2.2.2/logstash_jdbc_last_oracle_run"
      }
    }
    output {
      stdout { codec => rubydebug {} }
      elasticsearch {
        index => "julong-%{+YYYY.MM.dd}"
        hosts => ["10.64.252.104:9200"]
      }
    }
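How the incremental statement works: on each scheduled run, the jdbc input substitutes the last persisted tracking-column value for :sql_last_value, pulls one window of rows, and records the highest value seen so the next run resumes from there. Below is a minimal Python sketch of that loop, not Logstash code; the table contents and the number of runs are made up, while the 1,000,000 window size follows the statement above (the "advance past the max id" step is slightly simplified versus Logstash's exact bookkeeping).

```python
# Simplified model of the jdbc input's incremental sync loop.
WINDOW = 1_000_000  # matches ":sql_last_value + 1000000" in the statement

def run_once(rows, last_value):
    """One scheduled run: fetch ids in [last_value, last_value + WINDOW)."""
    batch = [r for r in rows if last_value <= r["id"] < last_value + WINDOW]
    if batch:
        # Next run resumes just past the highest id seen (simplified).
        last_value = max(r["id"] for r in batch) + 1
    else:
        # Empty window: slide forward so sparse id ranges don't stall the sync.
        last_value += WINDOW
    return batch, last_value

# Fake table: ids are auto-incrementing, as the article notes.
rows = [{"id": i, "user_name": f"u{i}"} for i in range(1, 2_500_001, 500)]

last = 0      # Logstash persists this in last_run_metadata_path between runs
synced = []
for _ in range(4):                 # four scheduled runs
    batch, last = run_once(rows, last)
    synced.extend(batch)

print(len(synced), len(rows))      # → 5000 5000
```

After enough runs every row has been picked up exactly once, which is why the sync can chew through 300 million rows in bounded windows instead of one giant query.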
Note: the primary key id in the user's database is auto-incrementing.
At this point the configuration is complete. Once the data is in ES, a query against a single index holding 70M+ documents returns results in roughly 0.3 seconds.
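The index option in the output uses a date pattern, so Logstash writes each day's events into a daily index (julong-2016.03.01, julong-2016.03.02, and so on). The sketch below mirrors that naming and shows the shape of a term query against one of the synced fields; the date and the user_name value are made-up examples, and user_name comes from the columns in the jdbc statement.

```python
from datetime import date

def index_for(d: date) -> str:
    """Mirror the output's "julong-%{+YYYY.MM.dd}" daily index naming."""
    return f"julong-{d.strftime('%Y.%m.%d')}"

print(index_for(date(2016, 3, 1)))   # → julong-2016.03.01

# The kind of exact-match query that now answers in ~0.3 s instead of 15 min;
# "some_user" is a hypothetical value for illustration.
query = {
    "query": {"term": {"user_name": "some_user"}},
    "size": 10,
}
```

Daily indices also make retention easy: old days can be dropped by deleting whole indices rather than deleting documents in place.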
If you reprint this article, please credit the source!