Data conversion conflicts and processing of large objects during conversion

Source: Internet
Author: User
Tags keyword list

Data conversion conflicts and Processing
Data conversion conflict:
In the data conversion process, it is difficult to implement strict equivalent conversion. You must determine the syntax and semantic conflicts in the two models. These conflicts may include:
(1) Name Conflict: The identifier of the source data source may be a reserved word in the target data source.
(2) format conflict: the same data type may have different Representation Methods and semantics.
(3) structure conflict: if the data definition models of the two DBMS types are different, for example, relational models and hierarchical models, you need to redefine the object attributes and links, to prevent the loss of attributes or contact information.
(4) type conflict: the same data type of different databases has different precision.
(5) other conflicts: Large Object Types in different databases have different constraints and some special types. For example, if a table contains more than one TEXT or IMAGE field, an error occurs. ORACLE does not allow one or more BLOB and LONG types in a table.
Conflict handling method:
You can handle conflicts in the preceding data conversion.
For name conflicts, You can first check the reserved words in the data source and create a set of reserved words. For name conflicts in the reserved words, rename them as needed.
For format conflicts, you can extract the data types of the corresponding data source from the driver of the data source based on the odbc SQL type, and then perform special processing on some specific types. When character data contains the character "'", escape characters must be used for special processing during data conversion. Otherwise, tokens are mistakenly used as string separators.
For precision conflicts of the same data type in different databases, the odbc SQL type and precision are combined in type conversion to determine the ing relationship between the source data type and the target data type. Find the data type that matches the precision of the source data source in the target data source as the default ing relationship.
In the conversion process, the data type is matched. It is best to convert the data type to the balanced type first, and then perform different processing based on different target data sources. For example, ORACLE uses the TO_DATE function, while FOXPRO uses the ctodd function to convert a string in the date format to a date.
For the TEXT and IMAGE types in SQL SERVER, you need to make a choice during conversion, or set the TEXT IMAGE to VARchar2 (4000), or the IMAGE to the LONG type, however, the Long type can only have one table. For ORACLE, the TEXT type can be IMAGE to CLOB type, while the IMAGE can be IMAGE to BLOB. The CLOB type can contain multiple columns in an ORACLE table.
Read the metadata of the data source.
Metadata type of the Data Source
Metadata type: usage of metadata information
Data source connection information database name, drive, used to connect the source data source and target data source
Server, DSN name,
Data source description, user name, etc.
Table information table name, table owner, used to create a table during data conversion
Table mode, table Type
Column name, type, and width. Used to create a table and map columns during data conversion.
Accuracy, scale, empty or not
Type information type name, maximum column width, used for table creation and type ing during data conversion
Maximum and minimum scale,
Prefix and suffix characters,
Accept null,
Keyword list
Key Information primary key name, primary key column, used for table mode conversion in Data Conversion
Foreign key name, foreign key column,
Foreign key Association Column
Other Object Information: Index Information, stored procedure information, and permission information are used to convert database objects in data conversion.
Data source metadata reading method:
(1) Call odbc api functions:
You can directly call the odbc api function to read the metadata of the data source. The following functions are called: SQL Tables, SQL Columns, SQL DescribeCol, SQL Get TypeInfo, SQL Foreign Keys, SQLPrimary Keys, SQL ProcedureColumns, SQL Procedures, SQL Statistics, SQL TablePrivileges, SQL Column Privileges obtains table information, Column information, type information, key information, and other object information in the data source. The main step is to connect to the data source, allocate the statement handle, and then call the odbc api to obtain various metadata information.
However, directly calling odbc api functions is complicated, and various parameters are difficult to understand, and it is difficult to directly obtain the returned data. The VC ++ MFC class library encapsulates ODBC APIs, which simplifies ODBC calls (especially database record set operations ), however, it is still difficult to obtain the structure information of a database with a different configuration by using the MFC class. Therefore, you need to combine the MFC method with the method of directly calling the odbc api. The ODBC interface function is used to overload some member functions of the CRecordset class in MFC, and create classes such as CTable, CColumns, CTypes, and CPrimaryKeys. Using these newly created classes, you can easily obtain the structure information of a heterogeneous database.
(2) Through ADO object:
The GetTableNames method of the CONNECTION object in ADO can be used to obtain the table information of the data source, and various column information and type information of the dataset can be obtained through the FieldDefs attribute in the RECORDSET object in ADO. Before obtaining the metadata of the data source, you must first create a CONNECTION object to connect to the data source, and open the corresponding data table through the DataSet object RECORDSET. Then, you can obtain the metadata of the corresponding data source.
Data Type Conversion
3 heterogeneous data source data type Overview:
For different database systems, each DBMS defines a set of its own data types. However, no matter how the data types change in each system, its functions meet the basic requirements of user data processing, such as numerical data, including integer, real, floating, and double-precision; fixed, variable-length; date, including year, month, day, hour, minute, and second; long-precision, including text and coin. With the continuous development of database systems and version upgrades, there are also an increasing number of data types, such as hypertext and binary processing of multimedia and big text data types. These commonalities bring about the possibility and convenience of data conversion between systems, but the data types of different databases are also different. The difference between its own definition and expansion also makes it difficult to convert data between systems. For example, the date and time data formats returned by DBMS vary greatly in each DBMS. Some systems return dates and times in the 8-byte integer format, and others return in the floating point format. In addition, some DBMS have the LONG type, while others do not. Therefore, the key to data type conversion in heterogeneous databases is to find the corresponding relationship.
Data Type Conversion Method 1: (design type ing table)
To achieve mutual data conversion, multiple two-way data conversion programs must be designed and different data types must be matched. When a database system is added, the data type matching problem between the database system and multiple existing heterogeneous databases must be solved, and multiple corresponding conversion programs must be added. To achieve program scalability, you can design a type ing table to solve the type conversion problem.
Separate the corresponding data types of different database systems and the corresponding data conversion handler, so that the data conversion program is relatively independent, and the type conversion relationship is stored in a dedicated table structure. Through a detailed and in-depth analysis of data types between different database systems, we can find out the default type mappings and possible mappings between different types of different database systems of different versions, store the data in a type ing table in advance.
Method 2: (using odbc SQL)
Data stored in the data source has a data type, which is called the data source type or SQL data type. SQL data types are defined by every DBMS according to SQL-92 standards and can be unique to a data source. The driver also defines a set of data types in odbc SQL syntax and driver data types, which are known as odbc SQL data types (Data Types starting with the SQL prefix ). Each driver maps the SQL data type from a specific data source to the odbc SQL data type identifier. Therefore, you can use the odbc SQL data type identifier as the benchmark to obtain the default ing relationship between different data sources during data conversion. The driver returns the ing between the SQL data type of the data source and the odbc SQL data type through the SQLGetTypelnfo function. In the SQLCo1Attributes, SQLDescribeCol, and SQLDescribeParm functions, the driver also uses odbc SQL data types to describe the data types of columns and parameters.
In addition, ODBC also provides a set of odbc c data types starting with SQL _C. The odbc c data type specifies the Data Type of the C buffer used to store data in the application. All drivers must support all C data types and convert all C types to the corresponding SQL types. All Drivers must support at least the character SQL type, so that the data types of DBMS can be mapped to a data type in C language, so that the data will not change during the transmission process. Each SQL data type is compatible with an ODBC C data type. Before returning data from the data source, the driver converts it to the specified C data type. Before sending data to the data source, the driver converts it from the specified C data type to the SQL data type.
Processing of large objects during data conversion
Overview of large object types:
BLOB is called Binary Large Objects, which is a Binary Large object. BLOB can be divided into three forms: audio and video data, binary data, and big text data. Therefore, the most common application is to store graphics, sound, and other objects. In addition, large binary and OLE objects can be stored in the database through BLOB. If the Text object is too large, if the length exceeds the specified length of the text type, the BLOB field must be used for storage. In a frequently used programming environment, BLOB fields cannot be directly supported. Therefore, you need to call the corresponding function to complete BLOB usage.
Different database systems support different types of large objects. Table 4 shows the types of big objects supported by common database systems:
The database system supports the following types of big object data:
SQL server "SQL _variant", "ntext", "image", "varbinary", "binary", "text"
ORACLE "BLOB", "long raw", "BFILE", "RAW", "CLOB", "LONG"
SYBASE "LONG VARchar"
VFP "MEMO"
ACCESS "ole object", "MEMO"
KINGBASE "blob", "text", "bytea", "varbinary", "binary", "text"
Methods for accessing large objects:
(1) Use the CLongBinary class provided by MFC:
There are many methods for VC to access big object data, such as OLE and ActiveX. The CLongBinary class provided by VC's MFC can be used to conveniently access BLOB fields. The CLongBinary class can be used to access data that exceeds the MAXINT value, with the maximum available memory capacity. However, the data is completely stored in the memory, which consumes too much data.
(2) Use the SQLGetData and SQLPutData functions of ODBC:
For data that cannot be stored in a single buffer, you can use SQLGetData to retrieve the data from the driver in batches after other data in the row has been obtained. To retrieve long data from a column, the application first calls SQLFetchScroll or SQLFetch to move a row and calls SQLGetData to obtain the data of the bound column. SQLPutData allows parameters or fields to be sent to the driver during application statement execution. This function is used to send characters or binary values.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.