Exercise caution when importing and exporting large data volumes in SQL Server

Source: Internet
Author: User
Tags: bulk insert
During database initialization, a practical problem administrators face is how to import large amounts of data into the database system. SQL Server provides several bulk import and export tools for this purpose. For example, an administrator can use the BCP utility to export bulk data, import bulk data, and generate format files, and can use the BULK INSERT statement to import data directly from a data file into a database table or non-partitioned view.
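For illustration, here is a minimal BULK INSERT sketch. The table dbo.Sales, the file path, and the delimiters are all hypothetical, and the sketch assumes a tab-delimited file whose columns match the table in order and type:

    -- Minimal BULK INSERT sketch; table name and file path are hypothetical.
    BULK INSERT dbo.Sales
    FROM 'C:\data\sales.dat'
    WITH (
        FIELDTERMINATOR = '\t',   -- column delimiter in the data file
        ROWTERMINATOR   = '\n',   -- row delimiter
        BATCHSIZE       = 50000,  -- commit in batches to keep each transaction manageable
        TABLOCK                   -- table lock; a prerequisite for minimal logging
    );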

Although SQL Server provides these utilities, administrators still need to approach the work with care. Specifically, pay attention to the following aspects when importing and exporting large data volumes.

1. Try to use the tools provided by the system to import and export large data volumes

The utilities mentioned above share a common feature: they optimize large data volumes during the import process. For example, they standardize certain data formats, saving time during import and export. However, they differ from ordinary import and export tools in one respect: the bulk import operations do not fully support comma-separated text files. An administrator could use other tools to bulk-import data in that format, but I do not recommend it, because those tools do not optimize the data during the import. Instead, I suggest using another tool, such as Access, to first convert the data in the text file into a supported format, and then use the bulk tools to import it into the system. Although this increases the workload, it safeguards the quality of the bulk data. For this reason, in my own work, when users run into this problem I strongly recommend that they use the system-supplied tools for bulk import and export.

In addition, using format files to improve the consistency of bulk data is also a good choice. The bulk import and export tools mentioned above support specialized format files that store the format information for each field in the original data file. A format file can also contain information about the corresponding database table, and it can supply all the format information needed to bulk-export data from a database instance and bulk-import data back into it. In plain terms, a format file provides a flexible way to interpret the data format in the data file during import and to set the data format in the data file during export. This flexibility means you do not need to write dedicated code, or modify the database or external application programs, just to reformat the data. If you use format files skillfully, you can export or import bulk data in the desired format without any additional format adjustments.
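As a sketch of how a format file is used (all names hypothetical): the bcp utility can generate the format file up front, and BULK INSERT can then reference it.

    -- The XML format file can be generated beforehand with the bcp utility, e.g.:
    --   bcp MyDb.dbo.Sales format nul -c -x -f C:\data\sales.xml -T
    -- BULK INSERT then reads the field-to-column mapping from that file.
    BULK INSERT dbo.Sales
    FROM 'C:\data\sales.dat'
    WITH (
        FORMATFILE = 'C:\data\sales.xml',
        TABLOCK
    );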

2. Select an appropriate database recovery model

As you know, any change a user makes in the database is recorded in the transaction log, and bulk import and export are no exception. Because the data volume involved is large, these operations can consume a great deal of transaction log space. Therefore, we recommend selecting an appropriate recovery model before importing large data volumes. My own practice is this: when I need to bulk-import data, I switch the database to the bulk-logged recovery model first, and after the import completes I restore the original model.
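A minimal sketch of that practice, assuming a hypothetical database named MyDb that normally runs under the full recovery model:

    -- Switch to bulk-logged for the duration of the import.
    ALTER DATABASE MyDb SET RECOVERY BULK_LOGGED;

    -- ... run the bulk import here (BCP or BULK INSERT) ...

    -- Restore the original recovery model.
    ALTER DATABASE MyDb SET RECOVERY FULL;

    -- Back up the log so the bulk-logged interval remains recoverable.
    BACKUP LOG MyDb TO DISK = 'C:\backup\MyDb_log.trn';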

This is mainly because the bulk-logged recovery model supports bulk data import particularly well. Compared with other recovery models (such as the full recovery model), the bulk-logged model performs only minimal logging of bulk operations. It still protects bulk operations in the event of hardware failure, while providing better performance and using the least log space. Using the bulk-logged model therefore helps prevent the transaction log from running out of space, because minimal logging does not write a log row for every inserted row. This model is well suited to databases that normally run under the full recovery model, and it is especially useful when performing bulk operations on non-indexed tables.

However, the bulk-logged model also carries risks. It increases the risk of data loss for bulk operations, because it prevents the database system from capturing each transaction's changes individually. If a log backup contains any bulk-logged operations, you cannot restore to a point in time within that log backup; you can only restore the entire log backup. Moreover, under the bulk-logged recovery model, a log backup that covers bulk operations must contain both the log records and the data pages changed by those operations, which is essential for capturing the results of the bulk-logged work but makes the log backup very large. In addition, backing up the log requires access to the data files containing the bulk-logged transactions; if any affected database file is inaccessible, the transaction log cannot be backed up and all operations committed in that log will be lost. For these reasons, bulk-logged is not the safest recovery model.

3. Check whether you need to temporarily delete table indexes

An index is a special kind of file that plays an important role in the database. Simply put, if we compare a database to a book, an index is like the book's table of contents. An index contains reference pointers to all the records in a data table. There is no doubt that indexes can improve database performance. However, indexes are not beneficial in every scenario; in some special situations they reduce the performance of certain operations, such as the import of large data volumes.

An index speeds up data retrieval but slows down data modification, because every time a record is modified or inserted, the index must be updated. In other words, inserting 1 million records means 1 million index updates. Clearly, during a bulk import, index maintenance consumes a large amount of database resources and reduces database performance. If the target table has indexes, they not only slow the bulk import itself but also degrade other users' normal access to the database.

For this reason, I suggest that if the table receiving the import contains little or no data, it is best to delete the indexes first to improve bulk import performance, and re-enable them after the import completes, as shown in the sketch below. However, if the table already contains a large amount of data and the data to be imported is roughly equal to or less than the existing data, there is no need to delete the indexes; doing so would be counterproductive, because rebuilding the indexes afterwards may take longer than the time saved during the bulk import operation. In that case, the administrator should not delete the target table's indexes.
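A sketch of the disable-then-rebuild approach; the index and table names are hypothetical. Note that a clustered index should not be disabled this way, because disabling it makes the whole table inaccessible:

    -- Disable a nonclustered index before the bulk load.
    ALTER INDEX IX_Sales_CustomerId ON dbo.Sales DISABLE;

    -- ... run the bulk import into dbo.Sales ...

    -- Rebuilding re-enables the index in one step.
    ALTER INDEX IX_Sales_CustomerId ON dbo.Sales REBUILD;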

4. Back up the database promptly after data import

Just as after creating a database object, the administrator must back up the database promptly after importing bulk data into the database system. Even with the timely help of the bulk import tools, the import work is tedious and time-consuming, so once a bulk import succeeds, the administrator should back up the database without delay. One reminder here: the appropriate backup method differs depending on the recovery model in use.

After the bulk data import, the administrator needs to back up the database. My suggestion is this: if the database is using the simple recovery model, perform a full or differential backup immediately after the bulk import completes (if time permits, a full backup is best). If the database uses the bulk-logged or full recovery model, and either time is short or a full backup would disturb current users' access, a log backup is sufficient. If the database server is not yet a production server (that is, no users are on it), a full database backup is the better choice.
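The corresponding backup statements, sketched with a hypothetical database name and backup paths:

    -- Simple recovery model: take a full (or differential) backup right away.
    BACKUP DATABASE MyDb TO DISK = 'C:\backup\MyDb_full.bak';
    BACKUP DATABASE MyDb TO DISK = 'C:\backup\MyDb_diff.bak' WITH DIFFERENTIAL;

    -- Bulk-logged or full recovery model: a log backup captures the bulk operations.
    BACKUP LOG MyDb TO DISK = 'C:\backup\MyDb_log.trn';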

5. Common Errors

Two errors commonly occur during bulk data import.

First, the supplied file format is incorrect. As mentioned above, the bulk import tools provided by the database generally do not fully support plain text files, so the administrator needs to convert the data in advance. Second, beware of hidden characters, which can cause problems. Many software packages and text editors insert hidden characters, usually at the end of the data file, and during a bulk import these can cause unpredictable problems, such as spurious space-character errors. This error is avoidable: the database administrator can search for and delete all hidden characters before importing the data. In fact, this problem arises not only when importing large data volumes but also when importing small amounts of data.
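One way to check for such characters is to load the data into a staging table first and scan for common non-printing characters; the table and column names here are hypothetical:

    -- Flag values containing carriage returns, line feeds, or tabs.
    SELECT *
    FROM dbo.Sales_Staging
    WHERE ProductName LIKE '%' + CHAR(13) + '%'   -- carriage return
       OR ProductName LIKE '%' + CHAR(10) + '%'   -- line feed
       OR ProductName LIKE '%' + CHAR(9)  + '%';  -- tab

    -- Strip them in place once identified.
    UPDATE dbo.Sales_Staging
    SET ProductName = REPLACE(REPLACE(REPLACE(ProductName,
                      CHAR(13), ''), CHAR(10), ''), CHAR(9), '');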
