MS SQL Server Data Import: A Code of Conduct

Source: Internet
Author: User
Tags: dbase, mssqlserver, alphanumeric characters

One of the first relational DBMSs I used was Microrim's R:Base 4000. What set R:Base apart from its PC competitor dBase was that it was a true relational database management system, developed in the early 1980s as a PC version of NASA's RIM (Relational Information Management) system. One of my favorite features was that it let users preview sample data during the import process. Although tools such as bcp, DTS, the new SQL Server 2005 Integration Services, and various migration tools and wizards have automated much of the process of importing data into SQL Server, that does not mean the job takes care of itself. This month, Ron Talmage offers some good common-sense suggestions on importing data.

SQL Server DBAs (database administrators) find that they often use T-SQL to import and process data. Why? Because some data transfers require the full power of sophisticated SQL. I recently completed another data import project, and it prompted me to write down the list of rules I follow.

Make sure raw data is loaded into varchar columns

Raw data from a legacy system usually arrives in text format, so I always load it into a separate staging database first. I never try to load data directly into the production database.

What I do is load all of the original text data into corresponding raw tables whose columns are all of the varchar data type. (DTS completes this step automatically, which is convenient; note that DTS also names the columns Col001, Col002, and so on, so you do not need to supply column names beforehand.) The main advantage of varchar is that it accepts any data, even "bad" data. If you load data from a legacy system that did not strictly validate user input, the rows that get skipped or written to an error file may outnumber the rows that load. If you do not want to take that risk, you must accept every possible value, and loading everything as character data into varchar columns does exactly that.
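
As a minimal sketch (the table and column names here are hypothetical, not from the article), a raw table of this kind might look like this:

    -- Hypothetical raw table: every column is varchar, so any incoming
    -- text value, even "bad" data, loads without error.
    CREATE TABLE dbo.RawCustomer (
        Col001 varchar(255) NULL,  -- e.g. customer name
        Col002 varchar(255) NULL,  -- e.g. address as one string
        Col003 varchar(255) NULL   -- e.g. first purchase date, still text
    );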

Do not use non-alphanumeric characters in table and column names

You may not be able to control how columns are initially named in the table that holds the raw data, but I try to rename legacy columns that contain spaces or other unconventional characters. When a column or table name contains non-alphanumeric characters, you must delimit it with square brackets or double quotation marks every time you reference it. Such code is not only tedious to write but also harder to read.
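
For example, suppose a legacy table (the names below are hypothetical) arrived with a column called Customer Name. The space forces delimiters everywhere, while renaming the column once removes the need:

    -- A legacy name with a space must be delimited in every reference:
    SELECT [Customer Name] FROM dbo.RawContact;

    -- Rename it once and the brackets are no longer needed:
    EXEC sp_rename 'dbo.RawContact.[Customer Name]', 'CustomerName', 'COLUMN';
    SELECT CustomerName FROM dbo.RawContact;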

Do not use keywords in column names

Data originating in legacy systems often contains descriptive column names that can break SQL queries. For example, real-estate data may contain a column named KEY, reflecting the key boxes placed on houses for sale. KEY, however, is also a T-SQL keyword(!), and a query that references such a column name directly will fail. You must therefore delimit the names of columns that are keywords with square brackets or double quotation marks.
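
A minimal illustration, assuming a hypothetical dbo.RawListing table that contains the KEY column described above:

    -- KEY is a reserved word, so it must be delimited when referenced:
    SELECT [KEY] FROM dbo.RawListing;   -- square brackets
    SELECT "KEY" FROM dbo.RawListing;   -- double quotes (requires SET QUOTED_IDENTIFIER ON)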

Make sure to create staging tables with the correct data types

The next step is to create one or more additional staging tables with the "correct" data types. I like to give the staging tables the same column names as the destination tables in the target OLTP (online transaction processing) database. In any case, the important thing is that the data type of each column is checked and corrected while the data sits in staging. Finding bad data in a SQL Server table is much easier than finding it in an external file that failed to load.
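
One way to surface values that will not convert cleanly before moving them into the typed table (a sketch that reuses the hypothetical names from above; ISDATE is a coarse check, not a perfect one):

    -- Typed staging table mirroring the destination's column names.
    CREATE TABLE dbo.StageCustomer (
        CustomerName      varchar(100) NOT NULL,
        Address           varchar(100) NULL,
        FirstPurchaseDate datetime     NULL
    );

    -- Find raw rows whose date column will not convert to datetime.
    SELECT Col003
    FROM dbo.RawCustomer
    WHERE Col003 IS NOT NULL AND ISDATE(Col003) = 0;

    -- Once those rows are fixed, populate the typed staging table.
    INSERT INTO dbo.StageCustomer (CustomerName, Address, FirstPurchaseDate)
    SELECT Col001, Col002, CONVERT(datetime, Col003)
    FROM dbo.RawCustomer;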

Make sure to add new columns to the staging table as needed

When the staging data lacks columns that the target needs, you can add them and then split or merge the loaded data. For example, the address may be loaded into the staging table as a single string even though the target table breaks it into street name and house number. You can add street-name and house-number columns to the staging table and split the old address into the two new columns. The advantage is that the raw data and the newly split data coexist, so you can test your scripts by comparing the columns.
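
A sketch of one way to do the split, assuming (hypothetically) that the address always starts with the house number followed by a space:

    -- Add the new columns alongside the original address string.
    ALTER TABLE dbo.StageCustomer
        ADD HouseNumber varchar(10) NULL,
            StreetName  varchar(90) NULL;
    GO

    -- Split "123 Main St" at the first space; rows with no space are
    -- left NULL so they stand out for manual review.
    UPDATE dbo.StageCustomer
    SET HouseNumber = LEFT(Address, CHARINDEX(' ', Address) - 1),
        StreetName  = SUBSTRING(Address, CHARINDEX(' ', Address) + 1, 90)
    WHERE CHARINDEX(' ', Address) > 1;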

Make sure to test population against a local copy of the production tables

When you are ready to insert the staging-table data, first test the insert against a local copy of the production table. Sometimes an empty copy of the table is enough; sometimes you must populate the copy first.
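
One common way to make such a copy (a sketch with hypothetical database and table names; note that SELECT INTO copies columns and data but not constraints or indexes):

    -- Copy the production table's columns and data into a local test copy
    -- (add WHERE 1 = 0 if an empty copy is all you need).
    SELECT *
    INTO dbo.Customer_Test
    FROM Production.dbo.Customer;

    -- Dry-run the insert against the copy instead of the real table.
    INSERT INTO dbo.Customer_Test (CustomerName, HouseNumber, StreetName)
    SELECT CustomerName, HouseNumber, StreetName
    FROM dbo.StageCustomer;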

Make sure production constraints are retained

Always retain the production constraints on the test copy. That way you can test how well the staging data satisfies them. These constraints include NULL, DEFAULT, CHECK, PRIMARY KEY, and FOREIGN KEY constraints. First make sure that the NULL or NOT NULL attribute of each column on the copy matches the target system, and then check all the other constraints. If your tests show that the staging data can be inserted without violating any constraint, you are only one step away from success.
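
As a sketch (the constraint and column names are hypothetical), here is how a production CHECK constraint might be re-created on the test copy; the WITH CHECK option makes SQL Server validate the rows that are already present:

    -- Recreate a production constraint on the test copy; WITH CHECK
    -- validates existing rows, so bad staging data fails immediately.
    ALTER TABLE dbo.Customer_Test WITH CHECK
        ADD CONSTRAINT CK_Customer_Test_HouseNumber
        CHECK (HouseNumber NOT LIKE '%[^0-9]%');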

Make sure to test on a copy of production data

Inserting the imported data into an empty table will flush out many potential problems, but not all of them. After all the previous tests pass, make sure to test the import against a copy of the target data or production system (or at least a reasonable subset). The final class of errors you can encounter is determined by the data already in place, and this test can detect it. You can then check the results in the database copy, or even point the application at the copy for further testing and verification. [Columnist Tom Moreau adds, "Testing with daily production update data can prepare you for the migration. If the original system lacked constraints that the new system enforces, bad data will have entered the original system and can wreck your migration." -Ed.]
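
One way to stand up such a copy (a sketch; the database name, backup path, and logical file names are assumptions that vary per system) is to restore a recent production backup under a new name and run the whole import against it:

    -- Restore last night's production backup as a test copy, then run
    -- the full import against it (names and paths are hypothetical).
    RESTORE DATABASE ProductionCopy
    FROM DISK = N'D:\Backups\Production.bak'
    WITH MOVE N'Production_Data' TO N'D:\Data\ProductionCopy.mdf',
         MOVE N'Production_Log'  TO N'D:\Data\ProductionCopy.ldf',
         RECOVERY;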

If the import process has passed all of these tests, you may be ready to import the data, or at least to hand the import process over to quality assurance (QA).
