1. Overview
We need to parse XML files from an external system and store their contents in the database.
However, we have no DTD or schema, only a single description document in Word format. More to the point, the structure of the XML node tree (the relationships between XML nodes) is not exactly the same as the structure of the business bean tree (the relationships between business beans). For example, from a business point of view a pig has a head, but in the XML this is written as a three-level relationship, pig--content--pighead, with an extra content node in between. Without a DTD or schema, and with a non-standard structure, we cannot use an automated third-party Java transformation API to do the parsing; we can only parse the nodes manually, one by one.
In the course of manual parsing, however, we found that the parsing and storage of every node have a great deal in common, or at least follow common rules. These common parts can be extracted into a quasi-framework, with the node-specific parts left open so that each node supplies its own implementation. The result is a semi-automatic parsing and storage framework.
Why do we call it semi-automatic? What are its limitations?
Automatic: you do not have to write XML parsing code and persistence code for each node.
"Semi": you still need to write each JavaBean by hand, and manually create a table for each bean.
Limitations:
A. The type of every business field must be String (varchar in the database), and no non-business field in the bean may be of type String.
B. The bean name must be the same as the table name, or the two must map one-to-one.
C. Each bean member variable name must be the same as the attribute name or element name of the XML node, or map to it one-to-one.
These three restrictions are the preconditions for automating the work with the Java reflection mechanism.
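To make the role of these restrictions concrete, here is a minimal sketch of how reflection could copy XML attributes into a bean by name. It is only an illustration under assumptions: the Pig bean, the AttributeMapper class, and its methods are hypothetical names, not part of the framework described here. Only String fields are filled, so a non-business field such as id is left alone even when the XML happens to carry an attribute with the same name.

```java
import java.io.ByteArrayInputStream;
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;

// Hypothetical bean: business fields are Strings, the non-business field is not.
class Pig {
    String name;
    String weight;
    long id; // non-business field (for example a surrogate key)
}

// Hypothetical helper, not the framework's real API.
class AttributeMapper {
    // Parses an XML string and returns its root element.
    static Element root(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot parse XML", e);
        }
    }

    // Copies every XML attribute into the String field with the same name.
    static <T> T fill(Element node, Class<T> beanType) {
        try {
            Constructor<T> ctor = beanType.getDeclaredConstructor();
            ctor.setAccessible(true);
            T bean = ctor.newInstance();
            NamedNodeMap attrs = node.getAttributes();
            for (int i = 0; i < attrs.getLength(); i++) {
                Node attr = attrs.item(i);
                Field field;
                try {
                    field = beanType.getDeclaredField(attr.getNodeName());
                } catch (NoSuchFieldException e) {
                    continue; // no member variable with this name: ignore the attribute
                }
                if (field.getType() != String.class) {
                    continue; // non-String means non-business: never filled from XML
                }
                field.setAccessible(true);
                field.set(bean, attr.getNodeValue());
            }
            return bean;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this sketch, AttributeMapper.fill(AttributeMapper.root("<Pig name=\"Babe\" weight=\"42\" id=\"99\"/>"), Pig.class) fills name and weight but leaves id at its default value, because id is not a String.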
2. Basic ideas
XML parsing means converting an XML node into a JavaBean instance: the attribute values and element values of the XML node become the member variable values of the JavaBean instance. Persistence means turning a JavaBean instance into a record in a database table: each member variable value of the instance becomes either the value of a field in that record, or another record in another table that references that record.
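The persistence half can be sketched in the same spirit. The example below is an assumption-laden illustration, not the framework's actual code: Piglet and InsertBuilder are hypothetical names, the table is assumed to be named after the bean class, and every String field is assumed to be a column. Columns are sorted by name so that the generated statement is deterministic.

```java
import java.lang.reflect.Field;
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical bean, pre-filled for the demonstration.
class Piglet {
    String name = "Babe";
    String weight = "42";
    long id = 7L; // non-business field, skipped because it is not a String
}

// Hypothetical helper, not the framework's real API.
class InsertBuilder {
    // Returns column name -> value for every String field, sorted by column name.
    static Map<String, String> columns(Object bean) {
        Map<String, String> cols = new TreeMap<>();
        try {
            for (Field f : bean.getClass().getDeclaredFields()) {
                if (f.isSynthetic() || f.getType() != String.class) {
                    continue;
                }
                f.setAccessible(true);
                cols.put(f.getName(), (String) f.get(bean));
            }
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
        return cols;
    }

    // Builds a parameterized INSERT; the table name is the bean's simple class name.
    static String insertSql(Object bean) {
        Map<String, String> cols = columns(bean);
        String names = String.join(", ", cols.keySet());
        String marks = String.join(", ", Collections.nCopies(cols.size(), "?"));
        return "INSERT INTO " + bean.getClass().getSimpleName()
                + " (" + names + ") VALUES (" + marks + ")";
    }
}
```

The values returned by columns(bean) come out in the same order as the column names, so they could be bound one by one to a java.sql.PreparedStatement created from insertSql(bean).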
In XML, in the JavaBean system, and in the relational structure of the database tables alike, the relationships between nodes form a tree. The whole parse-and-store process amounts to performing the conversion while traversing that tree. A tree can be traversed with a recursive algorithm, and recursion is, needless to say, one of the main ways of making a program "automatic".
The "trees" are analyzed in detail below:
Suppose there is an aggregation (parent-child) relationship between two business entities A and B. It falls into one of three cases:
A. B is an atomic field (that is, it cannot be subdivided) and is an attribute of A.
In XML, B is an XML attribute of A or an atomic element of A.
In the bean, B is a member variable whose type is a built-in Java data type.
In the database, B is a column of table A.
B. B is a compound field and is a property of A, in a 1:1 relationship with A.
In XML, B is an element of A, and B has its own elements or attributes.
In the bean, B is a member variable, and the program contains a class B.
In the database, table B is a child table of table A (that is, a foreign key in B references table A).
C. B is a compound field and is a property of A, in an n:1 relationship with A.
In XML, B is an element of A, and B has its own elements or attributes.
In the bean, the B instances form a collection (List, Set) held as a member variable, and the program contains a class B.
In the database, table B is a child table of table A (that is, a foreign key in B references table A).
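On the bean side, the three cases can be told apart purely from the declared type of each member variable. The sketch below is one possible way to do that, with hypothetical names (PigBean, PigHead, Leg, FieldCases). It assumes atomic fields are Strings, as required by the limitations in section 1, and it ignores primitive fields as non-business.

```java
import java.lang.reflect.Field;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical beans showing one member variable per case.
class PigHead { String eyeColor; }
class Leg { String position; }
class PigBean {
    String name;    // first case: atomic field, a column of the pig table
    PigHead head;   // second case: single compound child (1:1)
    List<Leg> legs; // third case: collection of compound children (n:1)
}

// Hypothetical helper, not the framework's real API.
class FieldCases {
    enum Kind { ATOMIC, SINGLE_CHILD, CHILD_COLLECTION }

    // Decides which of the three cases a member variable belongs to.
    static Kind classify(Field f) {
        if (f.getType() == String.class) {
            return Kind.ATOMIC;
        }
        if (Collection.class.isAssignableFrom(f.getType())) {
            return Kind.CHILD_COLLECTION;
        }
        return Kind.SINGLE_CHILD;
    }

    // Classifies every business field of a bean class, sorted by field name.
    static Map<String, Kind> classifyAll(Class<?> beanType) {
        Map<String, Kind> result = new TreeMap<>();
        for (Field f : beanType.getDeclaredFields()) {
            if (f.isSynthetic() || f.getType().isPrimitive()) {
                continue; // primitives are treated as non-business fields here
            }
            result.put(f.getName(), classify(f));
        }
        return result;
    }
}
```

For PigBean, classifyAll reports head as a single child, legs as a child collection, and name as atomic.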
With these three cases understood, the rest is straightforward. For every node it encounters, the program recursively does the following: process its atomic properties (case A), then its single child nodes (case B), and finally its collection child nodes (case C).
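The recursive step for the parsing half might look like the following sketch. It is a simplified illustration under several assumptions: the beans (Hog, Head, Foot) and the RecursiveParser class are hypothetical, collection fields are assumed to be declared as List with a concrete element type, field names are assumed to equal the XML attribute or element names, and the extra intermediate nodes mentioned in the overview (such as content) are not handled.

```java
import java.io.ByteArrayInputStream;
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.lang.reflect.ParameterizedType;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical beans; field names equal the XML attribute/element names.
class Head { String eyeColor; }
class Foot { String position; }
class Hog {
    String name;     // atomic field
    Head head;       // single child
    List<Foot> foot; // collection of children, one per <foot> element
}

// Hypothetical helper, not the framework's real API.
class RecursiveParser {
    static Element root(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot parse XML", e);
        }
    }

    // Direct child elements of node with the given tag name (no deeper descendants).
    static List<Element> children(Element node, String name) {
        List<Element> found = new ArrayList<>();
        NodeList kids = node.getChildNodes();
        for (int i = 0; i < kids.getLength(); i++) {
            Node kid = kids.item(i);
            if (kid instanceof Element && kid.getNodeName().equals(name)) {
                found.add((Element) kid);
            }
        }
        return found;
    }

    // Pass 0 fills atomic fields, pass 1 single children, pass 2 collections.
    static Object parse(Element node, Class<?> type) {
        try {
            Constructor<?> ctor = type.getDeclaredConstructor();
            ctor.setAccessible(true);
            Object bean = ctor.newInstance();
            for (int pass = 0; pass < 3; pass++) {
                for (Field f : type.getDeclaredFields()) {
                    if (f.isSynthetic() || f.getType().isPrimitive()) continue;
                    f.setAccessible(true);
                    boolean atomic = f.getType() == String.class;
                    boolean many = f.getType() == List.class;
                    List<Element> kids = children(node, f.getName());
                    if (pass == 0 && atomic) {
                        if (node.hasAttribute(f.getName())) {
                            f.set(bean, node.getAttribute(f.getName()));
                        } else if (!kids.isEmpty()) {
                            f.set(bean, kids.get(0).getTextContent());
                        }
                    } else if (pass == 1 && !atomic && !many && !kids.isEmpty()) {
                        f.set(bean, parse(kids.get(0), f.getType())); // recurse into the child
                    } else if (pass == 2 && many) {
                        ParameterizedType listType = (ParameterizedType) f.getGenericType();
                        Class<?> itemType = (Class<?>) listType.getActualTypeArguments()[0];
                        List<Object> items = new ArrayList<>();
                        for (Element kid : kids) items.add(parse(kid, itemType)); // recurse per item
                        f.set(bean, items);
                    }
                }
            }
            return bean;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Given the document <hog name="Babe"><head eyeColor="brown"/><foot position="front-left"/><foot position="front-right"/></hog>, parse returns a Hog whose name is set from the attribute, whose head is a Head bean, and whose foot list holds two Foot beans. Persistence could follow the same three-pass shape, inserting the parent record before its child records so that the foreign keys can reference it.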