When we build a database and want to aggregate the different types of databases scattered across the database in this new database, especially when it comes to data validation, decontamination, and transformation, there are big challenges. Fortunately, SQL Server provides us with powerful, rich data import and export capabilities, and when importing exports, we can flexibly handle the data.
There are three main ways to import Export data in SQL Server: Use Transact-SQL to process data, call command-line tools bcp to process data, and use Data Transformation Services (DTS) to process data. These three methods each have their own characteristics, the following are the main characteristics of their comparison.
A comparison of the way of use
1. Using Transact-SQL for data import and export
It is easy to see that a Transact-SQL approach is a way of importing or aggregating data from the same or different types of databases into one another in a way that can be done by means of statements. If you are importing data into and out of different SQL Server databases, it is very easy to do so. You can generally use SELECT INTO and insert INTO. The table into which you use SELECT into is required to exist, meaning that its function is to create an empty table before the data is directed, and then import the data from the source table into the newly created empty table, which is equivalent to copying the table (and not replicating the table's index, etc.). The function of INSERT into is to insert the source data into a table that already exists, you can use it for data merging, and you can use update if you want to update a record that already exists.
SELECT * into table2 from table1
--table1 and table2 have the same table structure
INSERT into table2 select * from Table3
--table2 and Table3 have the same table structure
The situation becomes much more complex when data is imported and exported between heterogeneous databases. The first thing to solve is how to open a non-SQL Server database.
Two functions are available in SQL Server to open and manipulate these databases based on the OLE DB provider of various types of databases, both OPENDATASOURCE and OpenRowset. Their functions are basically the same, the difference is mainly two points.
(1) different ways of calling.
There are two parameters for OpenDataSource, namely OLE DB provider and connection strings. Using OpenDataSource is only the equivalent of referencing a database or a service (for databases such as SQL Server, Oracle, and so on). To refer to a datasheet or view in it, you must have the OpenDataSource (...) After the reference.
Querying table1 tables in an Access database Abc.mdb in SQL Server by OpenDataSource
SELECT * from OpenDataSource (' microsoft.jet.oledb.4.0 ',
' Provider=Microsoft.Jet.OLEDB.4.0;Data Source=abc.mdb; Persist Security
Info=false ') ...
Table1
OPENROWSET is equivalent to a recordset that can be used directly as a table or view.
Querying table1 tables in an Access database Abc.mdb in SQL Server by Openrowsete
SELECT * from OPENROWSET (' microsoft.jet.oledb.4.0 ', ' Abc.mdb ';
' admin '; ', ' SELECT * from table1 ')
(2) flexibility of different.
OpenDataSource can only open tables or views in the appropriate database, and can only be processed in SQL Server if filtering is required. OPENROWSET can filter the database while it is open, as in the example above, in OPENROWSET you can use the SELECT * from Table1 queries the datasheet in Abc.mdb, and OPENDATASOURCE can only reference table1, and cannot query table1. Therefore, OPENROWSET is more flexible than opendatasource.
2. Import Export data using command line bcp
Many large systems provide not only a friendly graphical user interface, but also a command-line approach to control the system. In addition to using SQL statements to manipulate data in SQL Server, you can use a command-line tool to do the same thing with BCP. BCP is a tool based on the Db-library client library. It is so powerful that BCP can significantly increase the load efficiency by replicating data from multiple clients to a single table in a parallel manner. However, when performing parallel operations, it should be noted that only applications that use an API based on ODBC or SQL OLE DB can perform operations that load data into a single table in parallel.
BCP can export data from SQL Server to a database supported by any OLE DB, as the following statement exports the authors table to an Excel file.
BCP pubs.dbo.authors out C:temp1.xls
-c-q-S "Gnetdata/gnetdata"-u "sa"-P "password"
BCP can be executed not only through the command line, but also through SQL, which requires a system stored procedure xp_cmdshell to be implemented, as the above command can be rewritten as follows.
EXEC Master.. xp_cmdshell ' bcp pubs.dbo.authors out
C:temp1.xls-c-q-s "Gnetdata/gnetdata"-u "sa"-P "password"
3. Import Export data Using Data Transformation Services (DTS)
DTS, which is the core of importing exported data in SQL Server, provides the flexibility to verify, purify, and transform data in scripting languages such as VBScript, JScript, and so on, in addition to the functionality that is appropriate for SQL and command-line tools bcp.
SQL Server provides a graphical user interface for DTS, where users can import and export data using a graphical interface and handle the data accordingly. At the same time, DTS also provides programming interfaces in the form of COM components, meaning that any development tool that supports COM components can take advantage of the functionality provided by DTS using the COM component. DTS can be saved in a different form in SQL Server, either in the form of a package or as a Visual Basic source program, so that DTS COM components can be used as long as you compile in VB.
The biggest difference between DTS and other data import methods is that it can be processed in depth for each row of data in the process of processing data. Here is a snippet of VBScript code, which executes at the time of each record in DTS, dtsdestination the target record, DTSSource represents the source record, and, when dealing with "marital status," converts 0 or 1 of the "marital status" in the source record to the target record. "Married "or" unmarried ".
Function Main ()
DTSDestination ("name") = DTSSource ("name")
DTSDestination ("age") = DTSSource ("Age")
If DTSDestination ("marital status") = 1 Then
DTSDestination ("marital status") = "Married"
Else
DTSDestination ("marital status") = "unmarried"
End If
Main = DTSTransformStat_OK
End Function
The above three kinds of data import and export methods have their advantages and disadvantages, they are compared to each other as shown in Figure 1.
Second, the performance of the comparison
Use Transact-SQL method. If you are importing exports between SQL Server databases, the speed will be very fast, but using the OPENDATASOURCE and OpenRowset methods is slower when you use OLE DB provider to open and manipulate the database.
Use the bcp command method. If you do not need to verify the data, and so on, it is very fast, because its internal use of the C interface Db-library, so the speed of the operation of the database is greatly improved.
Using DTS to guide data should be the best way to do it. Because it consolidates Microsoft Universal Data Access technology with Microsoft ActiveX technology, it is not only flexible to work with data, but also has a very high efficiency in data import and export.
Summarize
SQL Server provides rich data import and export methods, which gives us more choices, but this brings us to a new problem: How to choose the right data import and export method according to the situation? I am here to provide some personal advice, I hope to be able to play a certain role in guiding the reader.
If you are importing data into and out of a SQL Server database, and you do not need to perform a complex test of the data, it is best to use the Transact-SQL method for processing because it is very fast when data operations are performed between SQL Server databases. Of course, if you want to do complex operations, such as data validation, conversion and other operations, it is best to use DTS for processing, because DTS is not only efficient data, but also to the depth of the data control. However, the DTS programming interface is COM based, and the interface is complex, so using program invoke DTS will become very complex, so when the volume of data is not very large and you want to add data import and export to the program and there is no complex data processing capability, You can use OPENDATASOURCE or OPENROWSET for processing.
BCP commands are not well suited to be invoked by programs, and if you need to use bulk methods to guide data, you can invoke BCP commands through batch files, which means that you do not need to write a large number of programs, or that you do not have to import data through a variety of interface switches in Enterprise Manager. Therefore, it is better suited for situations where the client does not have an Enterprise Manager or uses SQL Server Express to quickly import and export data.
Note : More wonderful articles please pay attention to the triple programming Tutorial column.