Research on extracting relational data from XML of clob Field
The following functions are used in this article: Extract, extractvalue, existsnode, xmlsequence, xmltype, xmltable, and xmlquery. The specific syntax of the function is not described here.
Before extracting data, use the xmltype function to convert clob data into XML data.
1. The data in XML is a single table and only one row of data.
This situation is simple and fast. Example:
Select
Extractvalue (v_msg, '/gpdic_xml/route_detail_infos/route_detail_info/action_code') action_code,
Extractvalue (v_msg, '/gpdic_xml/route_detail_infos/route_detail_info/route_id') route_id,
Extractvalue (v_msg, '/gpdic_xml/route_detail_infos/route_detail_info/route_actn_code') route_actn_code,
Extractvalue (v_msg, '/gpdic_xml/route_detail_infos/route_detail_info/deal_org_code') deal_org_code,
Extractvalue (v_msg, '/gpdic_xml/route_detail_infos/route_detail_info/office_code') office_code,
To_date (extractvalue (v_msg, '/gpdic_xml/route_detail_infos/route_detail_info/deal_datetime'), 'yyyy-mm-dd hh24: MI: ss') deal_datetime,
To_date (extractvalue (v_msg, '/gpdic_xml/route_detail_infos/route_detail_info/act_datetime'), 'yyyy-mm-dd hh24: MI: ss') act_datetime,
Extractvalue (v_msg, '/gpdic_xml/route_detail_infos/route_detail_info/post_way_code') post_way_code,
Extractvalue (v_msg, '/gpdic_xml/route_detail_infos/route_detail_info/route_kind_code') route_kind_code,
Extractvalue (v_msg, '/gpdic_xml/route_detail_infos/route_detail_info/bag_count') bag_count,
Extractvalue (v_msg, '/gpdic_xml/route_detail_infos/route_detail_info/bag_weight_sum') bag_weight_sum,
Extractvalue (v_msg, '/gpdic_xml/route_detail_infos/route_detail_info/flight_info') flight_info,
To_date (extractvalue (v_msg, '/gpdic_xml/route_detail_infos/route_detail_info/create_time'), 'yyyy-mm-dd hh24: MI: ss') create_time
From (select xmltype (v_msg) v_msg, d_in_time from run $ log)
Where existsnode (v_msg, '/gpdic_xml/route_detail_infos/route_detail_info') = 1;
2. The data in XML is a master-slave table relationship.
In this case, the xmlsequence function must be used to convert a child table to a nest table when the child table is extracted; otherwise, a ora-22905 is reported. In addition, if the amount of data from a table reaches a thousand orders of magnitude, the speed is very slow.
2.1 use the table () function xmlsequence
Example:
Select
Extractvalue (value (t), '/bag/end_org_code') end_org_code,
Extractvalue (v_msg, '/gpdic_xml/route_detail_infos/route_detail_info/route_id') route_id,
Extractvalue (v_msg, '/gpdic_xml/route_detail_infos/route_detail_info/route_actn_code') route_actn_code,
Extractvalue (v_msg, '/gpdic_xml/route_detail_infos/route_detail_info/deal_org_code') deal_org_code,
To_date (extractvalue (v_msg, '/gpdic_xml/route_detail_infos/route_detail_info/deal_datetime'), 'yyyy-mm-dd hh24: MI: ss') deal_datetimed,
To_date (extractvalue (v_msg, '/gpdic_xml/route_detail_infos/route_detail_info/act_datetime'), 'yyyy-mm-dd hh24: MI: ss') act_datetime,
Extractvalue (value (t), '/bag/bag_action') bag_action,
Extractvalue (value (t), '/bag/bag_id') bag_id,
Extractvalue (value (t), '/bag/label_strip') label_strip,
Extractvalue (value (t), '/bag/start_org_code') start_org_code
From (select xmltype (v_msg) v_msg, d_in_time from run $ log ),
Table (xmlsequence (extract (v_msg, '/gpdic_xml/route_detail_infos/route_detail_info/bags/bag') T;
2.2 Use xmltable () and xmlquery () functions the usage of these two functions is not described, see the official documentation: http://download.oracle.com/docs/cd/B19306_01/appdev.102/b14259/xdb_xquery.htm
Oracle xml db must be installed when using these two functions. This method is 2.1 faster than 50%. Example:
Select extractvalue (v_msg, '/gpdic_xml/bag_detail_infos/bag_detail_info/bag_id') bag_id,
Xtab. mail_num, xtab. mail_action, xtab. mail_remark_code, xtab. mail_other_remark
From run $ log_test,
Xmltable ('for $ J in/gpdic_xml/bag_detail_infos/bag_detail_info/mails/mail
Return $ J'
Passing v_msg
Columns mail_num varchar2 (20) path '/mail/mail_num ',
Mail_action varchar2 (1) path '/mail/mail_action ',
Mail_remark_code varchar2 (20) path '/mail/mail_remark_code ',
Mail_other_remark varchar2 (50) path '/mail/mail_other_remark') xtab
Where existsnode (v_msg, '/gpdic_xml/bag_detail_infos/bag_detail_info')> 0; supplement: Table (xmlsequence (extract (v_msg, '/gpdic_xml/route_detail_infos/route_detail_info/bags/bag') T/gpdic_xml/route_detail_infos/route_detail_info/bags/bag is the XML slave table path.