[Chapter 3. Detailed Configuration [section C],

Source: Internet
Author: User
Tags hadoop ecosystem

[Chapter 3. Detailed Configuration [section C],
3.6. Routers

The Route implementation provided in the current implementation includes:

1. Default Router: This Router sends all data to all nodes in the group to which the target node defined in the Router belongs.
2. column Match Router: This Router can set the old value of a Column (the value of this Column in the data source table) or new value (the value to be set on the target node) compare with the values of a constant value or node external_id and node_id.
3. Lookup Router: This Router can be configured. When routing data, determine whether to perform routing operations based on existing tables or slave tables.
4. Subselect Router: This Router executes an SQL expression on the database to select the node to route. This SQL expression can be used to pass in the old values (values before data changes) or new values (values after data changes) of a column ).
5. Scripted Router: This Router executes a Bean script expression to select the node to route. This script can use the old value or new value of a column.
6. XML Publishing Router: This Router directly publishes data changes to a message solution, instead of transmitting these changes to a registered node. The Router must be manually configured as an extension point in XML format. (It can be combined with kafka, and then the data can enter the hadoop ecosystem)
7. Audit Table Router: This Router inserts data into an automatically created Audit Table. This Router records the changes in the captured data to the associated table.

Trigger and Router are many-to-many relationships. This means that a trigger can capture data changes and then route these changes to multiple locations. It also means that a Router can be associated with a different trigger.

3.6.1. Default Router

The simplest Router is the one that sends data captured by all related triggers to the router of all nodes in the group to which the target node defined in all Router belongs. A Router is represented by a row of data in the ROUTER table. Connect to the TRIGGER_ROUTER table.

The following SQL statement defines a Router, which sends data from the "corp" group to the "store" group.

insert into SYM_ROUTER (router_id,               source_node_group_id, target_node_group_id, create_time,               last_update_time) values ('corp-2-store','corp', 'store',               current_timestamp, current_timestamp); 

The following SQL statement establishes the ing between the preceding Router and the item trigger.

insert into SYM_TRIGGER_ROUTER               (trigger_id, router_id, initial_load_order, create_time,               last_update_time) values ('item', 'corp-2-store', 1, current_timestamp,               current_timestamp); 

3.6.2. Column Match Router

Sometimes there may be such a requirement that the data needs to be routed based on the current value of the data (after the data changes) or the previous value of a column (before the data changes. You can configure a column ROUTER when setting router_type to Column when inserting a Router system table, and then set router_expression as an equation, which represents the expected values of a column.

The first part of the expression is always a column name. This column name should always be defined in uppercase letters. An uppercase column name with the OLD _ prefix can be used to compare the OLD values of this column. The second part of an expression can be a constant. One can represent the symbol of another column or another symbol that represents the concept of javasricds. The value of the symbol always starts with a colon.

When the Status column of a table is set TO "ready to send", data changes are routed TO all nodes in the target node group. The following SQL statement inserts a column ROUTER in the router table to meet this requirement.

insert into SYM_ROUTER (router_id,               source_node_group_id, target_node_group_id,router_type,               router_expression,create_time, last_update_time) values               ('corp-2-store-ok','corp','store', 'column', 'STATUS=READY TO SEND',               current_timestamp,current_timestamp);

When the value of the Status column of a table is modified, the changed data is routed to all nodes in the target group. The following SQL statement inserts a column ROUTER in the router table to meet this requirement. Note the use of OLD_STATUS. The OLD _ prefix can access the OLD value of this column.

insert into SYM_ROUTER (router_id,               source_node_group_id,target_node_group_id, router_type,               router_expression,create_time, last_update_time) values               ('corp-2-store-status','corp','store', 'column', 'STATUS!=:OLD_STATUS',               current_timestamp,current_timestamp);

The data changes of a table are sent to the node where the external id of a node in the target node group matches the value of the STORE_ID column of the table. The following SQL statement inserts a column ROUTER in the router table to meet this requirement.

insert into SYM_ROUTER (router_id,               source_node_group_id,target_node_group_id, router_type,               router_expression,create_time, last_update_time) values               ('corp-2-store-id','corp','store', 'column', 'STORE_ID=:EXTERNAL_ID',               current_timestamp,current_timestamp);

The following three attributes of a node can be referenced:

1.: NODE_ID

2.: EXTERNAL_ID

3.: NODE_GROUP_ID

During the route process, you can always capture the value of EXTERNAL_DATA as a virtual column.

Data changes in a table need to be routed to a redirection node. The external id of this node is defined in the REGISTRATION_REDIRECT table. The following SQL statement inserts a column ROUTER in the router table to meet this requirement.

insert into SYM_ROUTER (router_id,               source_node_group_id,target_node_group_id, router_type,               router_expression, create_time,last_update_time) values               ('corp-2-store-redirect','corp','store', 'column',               'STORE_ID=:REDIRECT_NODE',current_timestamp, current_timestamp);

You can configure multiple columns in router_expression. When more than one column is configured, all matched nodes will be added to the list of nodes to be routed. The following is a simple example. in a table, the value of the STORE_ID column must be equal to the EXTERNAL_ID or ALL constant of a node. If it is equal to the ALL constant, ALL nodes will receive data updates.

insert into SYM_ROUTER (router_id,               source_node_group_id,target_node_group_id, router_type,               router_expression,create_time, last_update_time) values               ('corp-2-store-multiple-matches','corp','store', 'column',               'STORE_ID=ALL orSTORE_ID=:EXTERNAL_ID', current_timestamp,               current_timestamp);

The NULL keyword can be used to check whether a column is empty. If the column is empty, data is routed to all nodes. In the following example, when the STORE_ID column is used to route to a node whose EXTERNAL_ID is equal to STORE_ID, if STORE_ID is null, all nodes are routed.

insert into SYM_ROUTER (router_id,               source_node_group_id,target_node_group_id, router_type,               router_expression,create_time, last_update_time) values               ('corp-2-store-multiple-matches','corp','store', 'column',               'STORE_ID=NULL orSTORE_ID=:EXTERNAL_ID', current_timestamp,               current_timestamp);

3.6.3. Lookup Table Router

A lookup table may contain the ID of the node where data is to be routed. This may be an existing table or an affiliated table specially added for routing data. Lookup Table Router is configured by specifying the value of router_type as lookuptable when inserting a ROUTER. You can set multiple configuration parameters in the router_expression column.

Each of the following configuration parameters is required.

LOOKUP_TABLE

The name of the Lookup table.

KEY_COLUMN

Name of the column in the table to be router. This will be inserted into the lookup table as the primary key.

LOOKUP_KEY_COLUMN

The name of the primary key column in The Lookup table.

EXTERNAL_ID_COLUMN

This is the name of a column, which contains the external_id of the node to be routed in the lookup table.

Note that the lookup table will be read to the memory and cached for data usage during route to a channel.

Considering that a table may be routed to a specific store, the changed table data only contains brand information. In this example, the STORE table can be used as a lookup table.

insert into SYM_ROUTER (router_id,               source_node_group_id,target_node_group_id, router_type,               router_expression,create_time, last_update_time) values               ('corp-2-store-ok','corp','store', 'lookuptable', 'LOOKUP_TABLE=STORE               KEY_COLUMN=BRAND_IDLOOKUP_KEY_COLUMN=BRAND_ID               EXTERNAL_ID_COLUMN=STORE_ID',current_timestamp, current_timestamp);

3.6.4. Subselect Router

Sometimes, you need to determine whether to route the current data based on the data that is not being synchronized. Subselect Router can be used in this case. Configure an SQL query statement in the router_expression column of the Subselect Router. This SQL statement returns the list of IDs of nodes to be routed. Column symbols can be used in SQL expressions. These symbols can be replaced by specified columns in a row of data. This type of Router has a high overhead because the Subselect statement runs once for each row to be routed. This should not be used in tables with many rows to be updated. This type of Router also has a disadvantage. If the data used to determine the ID of the node to be routed has been deleted, the SQL statement will not return results and the routing operation will not happen.

The Router_expression you specified can be added to the following SQL statement to select the node ID:

select c.node_id from sym_node c where               c.node_group_id=:NODE_GROUP_IDand c.sync_enabled=1 and ...

As you can see, you can use the alias "c" to access the information of the node currently being processed, such as c. external_id. There are two node-related symbols that you can use in your expression:

1.: NODE_GROUP_ID

2.: EXTERNAL_DATA

Indicates that your data list must have a colon prefix, for example, ": EMPLOY_ID" or ": OLD_EMPLOYEE_ID ". Here, the "OLD _" prefix indicates that this value is the value before the data changes.

For example, consider the following situation: an "Order" table and an "OrderLineItem" Table need to be routed to a specific store. The "Order" table has a table named "order_id" and "STORE_ID. A store node has an external_id equal to the STORE_ID of the "Order" table. However, the "OrderLineItem" table has only one foreign key to the "order_id" of the "Order" table. To route "OrderLineItem" data to the node to be routed to the "Order" table, we need to reference the records of the "Order" table.

There are two possible ways to solve this problem in javasricds. One is to configure a "Subselect" type Router, as shown below (the other is to use external_select to capture data through the trigger, and then use this data match in the column router, the detailed description is in Section3.6.7, "Utilizing External Select when Routing").

Our solution is to use Subselect to compare the external id of the current node and the value of the STORE_ID column in the Order table. When order_id is equal to the order id of the current row, this data is routed:

insert into SYM_ROUTER (router_id,               source_node_group_id,target_node_group_id, router_type,               router_expression,create_time, last_update_time) values               ('corp-2-store','corp','store', 'subselect', 'c.external_id in (select               STORE_ID from order whereorder_id=:ORDER_ID)', current_timestamp,               current_timestamp);

Finally, note that in this example, the row associated with the OrderLineItem table in the Order table must exist because the select statement is running when a route occurs, the select statement is not run when data changes are captured for the first time.

3.6.5. Scripted Router

Scripted router may be used when you need to flexibly select the logic of the node to be routed. Currently, the available scripting language is Bean Shell. BeanShell is a Java Script Language. Documentation for the Bean Shell scripting language can be viewed in the http://www.beanshell.org.

The router_type of the Bean Shell Router is "bsh". Route_expression is a valid BeanShell script:

1. Add a node ID to the collection of the target node

2. Return a new set containing multiple node IDs.

3. Return the ID of a node.

4. If all nodes in the target group are routed, true is returned. Otherwise, false is returned.

The script returns the node list. These nodes are valid org. jumpmind. Ric. model. Node objects. The current value and old value of the data column can be used in the script. A Java object is used to represent the data of a column. Specify the name in uppercase. The OLD value must be prefixed with "OLD.

If you need to access the metrics ricds service, you can use the engine Variable to access an instance of the org. jumpmind. Metrics Ric. isymetrics ricengine interface.

In the following example, node_id is a combination of STORE_ID and WORKSTATION_NUMBER. Both are columns in the table to be routed:

insert into SYM_ROUTER (router_id,               source_node_group_id,target_node_group_id, router_type,               router_expression,create_time, last_update_time) values               ('corp-2-store-bsh','corp','store', 'bsh', 'targetNodes.add(STORE_ID +               "-" +WORKSTATION_NUMBER);', current_timestamp, current_timestamp);

You can simply return the Node ID for the same function. This value is returned by the headquarters of the last line of the Bah script.

insert into SYM_ROUTER (router_id,               source_node_group_id,target_node_group_id, router_type,               router_expression,create_time, last_update_time) values               ('corp-2-store-bsh','corp','store', 'bsh', 'STORE_ID + "-" +               WORKSTATION_NUMBER',current_timestamp, current_timestamp);

In the following example, if the FLAG column is changed, the data will be synchronized to all nodes. If the FLAG is not changed, it will not be synchronized to any node. Note: here we use the "OLD _" prefix to access the OLD value of the column.

insert into SYM_ROUTER (router_id,               source_node_group_id,target_node_group_id, router_type,               router_expression,create_time, last_update_time) values               ('corp-2-store-flag-changed','corp','store', 'bsh', 'FLAG != null               &&!FLAG.equals(OLD_FLAG)', current_timestamp,               current_timestamp);

In the following example, the script traverses every valid node and checks whether the column named STATION removes spaces at both ends and whether it is equal to the value of external_id.

insert into SYM_ROUTER (router_id,               source_node_group_id,target_node_group_id, router_type,               router_expression,create_time, last_update_time) values               ('corp-2-store-trimmed-station','corp','store', 'bsh', 'for               (org.jumpmind.symmetric.model.Nodenode : nodes) { if (STATION != null               &&node.getExternalId().equals(STATION.trim())) {               targetNodes.add(node.getNodeId());} }', current_timestamp,               current_timestamp);

3.6.6. Audit Table Router

Audit Router captures data changes by recording data changes in the Audit table created and updated by a router (as long as auto. config. database is set to true ). The Router creates a table with the suffix "_ AUDIT" added to the table name of the data to be captured. This table will have the same structure as the original table and three additional columns.

Three additional audit columns are added to the table:

1. AUDIT_ID: Primary Key of the table

2. AUDIT_TIME: time when data changes occur

3. AUDIT_EVENT: DML type of data that occurs in this row

The following is an example of creating an audit router:

insert into SYM_ROUTER (router_id,               source_node_group_id,target_node_group_id, router_type, create_time,               last_update_time)values ('audit_at_corp','corp', 'local', 'audit',               current_timestamp,current_timestamp);

Audit Router captures data for a group of connections. In order for the above AuditRoute to work, it must use the "R" action type and contact a node_group_link ." R "represents" Only Route To ". In the above example, we associate it with a local group. Here, the local Group is newly created for the audit router. No node data local node group. If one of the corp nodes is triggered by an audit Router Contact, a new audit table will be created on the corp node, and new data will be inserted into the audit table.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.