SQL Server Data Import: A Code of Conduct

Source: Internet
Author: User
Tags: mssql, mssql server, split, alphanumeric characters, advantage

One of the first relational DBMSs I used was Microrim's R:BASE 4000. R:BASE, unlike its PC competitor dBASE, was a true relational database management system, developed in the early 1980s as a PC version of NASA's RIM (Relational Information Management) system. One of the features I appreciated most was that it let users view sample data during the import process. Although tools such as BCP, DTS, the new SQL Server 2005 Integration Services, and various porting tools and wizards have automated much of the process of importing data into SQL Server, that does not mean the problem is solved once and for all. This month, Ron Talmage provides some very good common-sense advice about importing data.

SQL Server DBAs (database administrators) find that they often use T-SQL to import and process data. Why? Because some data transfers require the power of technically mature SQL. I recently finished another data-import project, and it prompted me to compile this list of rules of conduct that I follow.

Make sure the raw data you load is staged as varchar data types

Raw data originating from so-called legacy systems usually arrives in text format, so I always load raw data into a separate staging database first. I never try to load data directly into a production database.

The first thing I do is load all the original text data into corresponding raw tables whose columns are all varchar data types. (DTS automates this nicely; it also names the columns Col001, Col002, and so on, so you do not even have to supply column names beforehand.) The main advantage of varchar is that it accepts any data, even "bad" data. If you load data from a legacy system that did not rigorously check user input, and your staging columns cannot receive every possible value, you risk ending up with more rows rejected or written to an exception file than rows loaded. Loading everything into varchar columns removes that risk.
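As a minimal sketch of this step, the following creates an all-varchar raw table and bulk-loads a legacy text file into it. The database name StagingDB, the table RawCustomer, the file path, and the column widths are all hypothetical illustrations, not part of the original article:

    -- All-varchar raw table: accepts any value, even "bad" data
    CREATE TABLE StagingDB.dbo.RawCustomer (
        Col001 varchar(255) NULL,   -- e.g., customer name as raw text
        Col002 varchar(255) NULL,   -- e.g., full address as one string
        Col003 varchar(255) NULL    -- e.g., a date as raw text
    );

    -- Load the legacy text file with no type checking at all
    BULK INSERT StagingDB.dbo.RawCustomer
    FROM 'C:\import\customers.txt'
    WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');

Because every column is varchar, no row can be rejected for a type-conversion failure; bad values land in the table where T-SQL can find them later.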

Do not use non-alphanumeric characters in staging table and column names

You may not be able to control how columns are initially named in the tables that hold the raw data, but I always try to change legacy column names that contain spaces or other unconventional characters. When column names or table names contain non-alphanumeric characters, you must delimit them with square brackets or double quotes. Such code is not only harder to write but also less readable.
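For illustration (the table and column names here are hypothetical), compare a query against delimited legacy names with one against cleaned-up names:

    -- Non-alphanumeric names must be delimited everywhere they appear
    SELECT [Street Address], "Zip-Code"
    FROM dbo.[Raw Property Data];

    -- With alphanumeric names, no delimiters are needed
    SELECT StreetAddress, ZipCode
    FROM dbo.RawPropertyData;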

Do not use keywords in column names

Data originating from legacy systems often contains descriptive column names that can break SQL queries. For example, real estate data might contain a column named Key, reflecting the key box placed on a house for sale. But KEY is also a T-SQL keyword, so a query that references such a column name directly will fail. You end up having to delimit keyword column names with square brackets or double quotes.
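A small sketch of the problem and one way out of it, using a hypothetical Listing table and the system procedure sp_rename:

    -- Key is a T-SQL keyword, so this reference must be delimited
    SELECT [Key] FROM dbo.Listing;

    -- Renaming the column avoids the problem entirely
    EXEC sp_rename 'dbo.Listing.[Key]', 'LockboxKey', 'COLUMN';
    SELECT LockboxKey FROM dbo.Listing;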

Make sure that you create staging tables with the correct data types

The next step is to create one or more additional staging tables that have the "correct" data types. I like the staging tables to use the same column names as the destination tables in the target OLTP (online transaction processing) database. In any case, it is important to check and correct the data type of each column of raw data as you load the staging tables. It is much easier to find bad data in a SQL Server table than in an external file that failed to load.
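One way to do that check-and-correct step with the T-SQL of that era is to probe the varchar columns with ISDATE and ISNUMERIC before casting. The table and column names below are hypothetical continuations of the earlier sketch:

    -- Find values that will not convert to datetime
    SELECT Col003
    FROM StagingDB.dbo.RawCustomer
    WHERE Col003 IS NOT NULL AND ISDATE(Col003) = 0;

    -- Load only convertible rows into the typed staging table
    INSERT INTO StagingDB.dbo.Customer (CustomerName, SignupDate)
    SELECT Col001, CAST(Col003 AS datetime)
    FROM StagingDB.dbo.RawCustomer
    WHERE ISDATE(Col003) = 1;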

Make sure new columns are added to the staging table

When the staging data has no corresponding column, you can add columns and then split or merge the loaded data. For example, even if the destination table stores the street number and street name separately, the address can still be loaded into the staging table as a single string. You can then add street-number and street-name columns to the staging table and split the old address into the two new columns, as sketched below. The advantage is that the raw data coexists with the newly split data, so you can test your script by comparing the columns.
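A minimal sketch of that split, assuming a hypothetical staging table whose Address column begins with the street number followed by a space:

    -- Add the new columns alongside the raw data
    ALTER TABLE StagingDB.dbo.Property
        ADD StreetNumber varchar(10) NULL,
            StreetName varchar(100) NULL;

    -- Split on the first space; rows without one stay NULL for review
    UPDATE StagingDB.dbo.Property
    SET StreetNumber = LEFT(Address, CHARINDEX(' ', Address) - 1),
        StreetName = SUBSTRING(Address, CHARINDEX(' ', Address) + 1, LEN(Address))
    WHERE CHARINDEX(' ', Address) > 1;

Because the original Address column is untouched, a simple comparison query can verify that the split columns recombine to the raw value.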

Be sure to test populating the production tables with local copies

When you are ready to insert the staging-table data, first test it by inserting it into local copies of the production tables. Sometimes empty copies are enough; sometimes you need to populate them first.
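One quick way to make such a local copy is SELECT ... INTO with a false predicate, which copies structure but no rows (and no constraints, which the next rule addresses). All names here are hypothetical:

    -- Create an empty local copy with the same columns and types
    SELECT *
    INTO StagingDB.dbo.CustomerCopy
    FROM ProductionDB.dbo.Customer
    WHERE 1 = 0;    -- structure only, no rows copied

    -- Then rehearse the real insert against the copy
    INSERT INTO StagingDB.dbo.CustomerCopy (CustomerName, SignupDate)
    SELECT CustomerName, SignupDate
    FROM StagingDB.dbo.Customer;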

Ensure that production constraints are retained

Always retain the production constraints on the local copies. This lets you test how well the staging-table data satisfies those constraints, which include NULL, default, check, primary key, and foreign key constraints. First make sure that the NULL or NOT NULL property of each column in the copy matches the target system, then step through all the other constraints. If your tests show that the staging data inserts while satisfying every constraint, you are only a step away from success.
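Since a SELECT ... INTO copy does not carry constraints over, you would script them back onto the copy. A sketch with hypothetical names and constraint definitions:

    -- Match the production NULL/NOT NULL settings first
    ALTER TABLE StagingDB.dbo.CustomerCopy
        ALTER COLUMN CustomerID int NOT NULL;

    -- Then reapply the remaining production constraints
    ALTER TABLE StagingDB.dbo.CustomerCopy
        ADD CONSTRAINT PK_CustomerCopy PRIMARY KEY (CustomerID),
            CONSTRAINT CK_CustomerCopy_State CHECK (LEN(State) = 2);

Any insert that now fails against the copy pinpoints exactly which constraint the staging data violates.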

Make sure to test on a copy of the production data

Although inserting the imported data into empty tables will surface many potential problems, it will not surface all of them. After all the previous tests pass, be sure to test the import against a copy of the target data or production system (or at least a reasonable subset of it). The final class of errors you can catch depends on the actual configuration of the data, and that is what this test detects. You can then check the results in the database copy, and even redirect the application to that copy for further testing and verification. Columnist Tom Moreau adds: "Testing with daily production update data can be a rehearsal for a data migration. If the original system does not have enough constraints and the new system does, then bad data will go into the original system and disrupt your migration." -Ed.
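A common way to obtain such a copy is to restore the latest production backup under a new name. The backup path and logical file names below are hypothetical:

    -- Restore a production backup as a separate copy for import testing
    RESTORE DATABASE ProductionCopy
    FROM DISK = 'D:\Backups\Production.bak'
    WITH MOVE 'Production_Data' TO 'D:\Data\ProductionCopy.mdf',
         MOVE 'Production_Log' TO 'D:\Data\ProductionCopy.ldf';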

If the import process passes all the tests up to this point, you are probably ready to import the data, or at least to hand the import process over to quality assurance (QA).
