Solution to data asymmetry between MySQL and Elasticsearch

Source: Internet
Author: User


Solutions to data asymmetry between MySQL and Elasticsearch

The Logstash jdbc-input-plugin can only append data: it reads new rows from the database and writes them to Elasticsearch incrementally. However, rows in the upstream database may later be deleted or updated, so the database and the search engine drift out of sync.

Of course, if you have a development team, you can write a program that mirrors deletes and updates to the search engine. If you do not have that capability, try the following approach.

Take a data table named article as an example. Its mtime column is defined with ON UPDATE CURRENT_TIMESTAMP, so mtime changes every time a row is updated.

mysql> desc article;
+-------------+---------------+------+-----+------------------------------+-------+
| Field       | Type          | Null | Key | Default                      | Extra |
+-------------+---------------+------+-----+------------------------------+-------+
| id          | int(11)       | NO   |     | 0                            |       |
| title       | mediumtext    | NO   |     | NULL                         |       |
| description | mediumtext    | YES  |     | NULL                         |       |
| author      | varchar(100)  | YES  |     | NULL                         |       |
| source      | varchar(100)  | YES  |     | NULL                         |       |
| content     | longtext      | YES  |     | NULL                         |       |
| status      | enum('Y','N') | NO   |     | 'N'                          |       |
| ctime       | timestamp     | NO   |     | CURRENT_TIMESTAMP            |       |
| mtime       | timestamp     | YES  |     | ON UPDATE CURRENT_TIMESTAMP  |       |
+-------------+---------------+------+-----+------------------------------+-------+
9 rows in set (0.00 sec)

Add an mtime-based query rule to the Logstash jdbc input, so that only rows modified since the last run are fetched:

jdbc {
    jdbc_driver_library => "/usr/share/java/mysql-connector-java.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/cms"
    jdbc_user => "cms"
    jdbc_password => "password"
    schedule => "* * * * *"    # cron expression: run every minute
    statement => "select * from article where mtime > :sql_last_value"
    use_column_value => true
    tracking_column => "mtime"
    tracking_column_type => "timestamp"
    record_last_run => true
    last_run_metadata_path => "/var/tmp/article-mtime.last"
}

Create a recycle-bin table to record rows that are deleted, or disabled with status = 'N'.

CREATE TABLE `elasticsearch_trash` (
    `id` int(11) NOT NULL,
    `ctime` timestamp NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Create two triggers on the article table

DELIMITER $$

CREATE DEFINER=`dba`@`%` TRIGGER `article_BEFORE_UPDATE` BEFORE UPDATE ON `article` FOR EACH ROW
BEGIN
    -- When the article status changes to 'N', the document must be removed from
    -- the search engine, so record its id in the recycle bin.
    IF NEW.status = 'N' THEN
        INSERT INTO elasticsearch_trash(id) VALUES (OLD.id);
    END IF;
    -- When the status changes back to 'Y', the id may still be sitting in
    -- elasticsearch_trash and would cause an accidental deletion, so remove
    -- the recycle-bin record.
    IF NEW.status = 'Y' THEN
        DELETE FROM elasticsearch_trash WHERE id = OLD.id;
    END IF;
END$$

CREATE DEFINER=`dba`@`%` TRIGGER `article_BEFORE_DELETE` BEFORE DELETE ON `article` FOR EACH ROW
BEGIN
    -- A deleted article goes into the search-engine recycle bin.
    INSERT INTO elasticsearch_trash(id) VALUES (OLD.id);
END$$

DELIMITER ;

Next, we need a simple Shell script, run every minute, that reads ids from the elasticsearch_trash table and calls the Elasticsearch RESTful API with curl to delete those documents, then removes the processed records.
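A minimal sketch of such a script follows. The index and type names ("information"/"article"), the MySQL credentials, and the Elasticsearch host are assumptions taken from the rest of this article; adjust them to your environment.

```shell
#!/bin/sh
# Drain the elasticsearch_trash recycle bin: delete each document from
# Elasticsearch, then drop its recycle-bin record.
# NOTE: credentials, host, and index/type names below are illustrative.

# List pending document ids from the elasticsearch_trash table.
trash_ids() {
    mysql -N -u cms -ppassword cms -e "SELECT id FROM elasticsearch_trash"
}

# Delete one document through the Elasticsearch RESTful API.
es_delete() {
    curl -s -o /dev/null -X DELETE "http://localhost:9200/information/article/$1"
}

# Drop a processed record from the recycle bin.
trash_remove() {
    mysql -u cms -ppassword cms -e "DELETE FROM elasticsearch_trash WHERE id = $1"
}

# For each pending id, delete the document, then its trash record.
clean_trash() {
    for id in $(trash_ids); do
        es_delete "$id" && trash_remove "$id"
    done
}

clean_trash
```

Install it with a crontab entry such as `* * * * * /usr/local/bin/elasticsearch-trash.sh` so it runs once per minute.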

You can also implement the same job as a program. Here is an example using a Spring Boot scheduled task.

Entity

package cn.netkiller.api.domain.elasticsearch;

import java.util.Date;

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Table;

@Entity
@Table
public class ElasticsearchTrash {
    @Id
    private int id;

    @Column(columnDefinition = "TIMESTAMP DEFAULT CURRENT_TIMESTAMP")
    private Date ctime;

    public int getId() {
        return id;
    }

    public void setId(int id) {
        this.id = id;
    }

    public Date getCtime() {
        return ctime;
    }

    public void setCtime(Date ctime) {
        this.ctime = ctime;
    }
}

Repository

package cn.netkiller.api.repository.elasticsearch;

import org.springframework.data.repository.CrudRepository;

import cn.netkiller.api.domain.elasticsearch.ElasticsearchTrash;

public interface ElasticsearchTrashRepository extends CrudRepository<ElasticsearchTrash, Integer> {
}

Scheduled task

package cn.netkiller.api.schedule;

import org.elasticsearch.action.delete.DeleteResponse;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.rest.RestStatus;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

import cn.netkiller.api.domain.elasticsearch.ElasticsearchTrash;
import cn.netkiller.api.repository.elasticsearch.ElasticsearchTrashRepository;

@Component
public class ScheduledTasks {
    private static final Logger logger = LoggerFactory.getLogger(ScheduledTasks.class);

    @Autowired
    private TransportClient client;

    @Autowired
    private ElasticsearchTrashRepository elasticsearchTrashRepository;

    public ScheduledTasks() {
    }

    @Scheduled(fixedRate = 1000 * 60) // Run the task once per minute
    public void cleanTrash() {
        for (ElasticsearchTrash elasticsearchTrash : elasticsearchTrashRepository.findAll()) {
            DeleteResponse response = client.prepareDelete("information", "article", elasticsearchTrash.getId() + "").get();
            RestStatus status = response.status();
            logger.info("delete {} {}", elasticsearchTrash.getId(), status.toString());
            // NOT_FOUND also counts as done: the document is already gone.
            if (status == RestStatus.OK || status == RestStatus.NOT_FOUND) {
                elasticsearchTrashRepository.delete(elasticsearchTrash);
            }
        }
    }
}

Finally, the Spring Boot main class, with scheduling enabled:

package cn.netkiller.api;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.scheduling.annotation.EnableScheduling;

@SpringBootApplication
@EnableScheduling
public class Application {
    public static void main(String[] args) {
        SpringApplication.run(Application.class, args);
    }
}

That is the whole solution to the data asymmetry between MySQL and Elasticsearch. If you have any questions, please leave a comment or discuss them in the community. Thank you for reading, and I hope this article helps you.
