TPC-H data generation and import summary

Qgen usage

qgen is the tool that generates the 22 TPC-H query statements. At run time it needs the 22 SQL templates under dbgen/queries and the dists.dss dictionary file in the dbgen directory.

First, build the qgen project in the same directory as dbgen, then copy the 22 SQL query templates under dbgen/queries and the dists.dss dictionary file into a new folder, to_query.

Then open cmd, cd to the to_query directory, and run the following command:
qgen -d 1 > d1.sql

Command explanation: -d generates the SQL statement with the default substitution values; 1 means the 1st template under dbgen/queries is used; > d1.sql is shell redirection that sends the console output to d1.sql in the current directory.

Finally, the generated SQL statements still have some problems and need a few modifications for Oracle: remove the trailing set rowcount -1 and go statements, change the substring function to substr, remove the as keyword before each table alias, and move the column names that follow a subquery's alias into the subquery's select clause.
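The mechanical parts of this clean-up can be scripted. A minimal sketch with sed, assuming the generated file is named d1.sql (the alias and subquery rewrites still need hand editing):

```shell
# Sketch of the post-processing described above (file names are assumptions).
# First, a stand-in for a qgen-generated file:
printf 'select substring(l_comment, 1, 2)\nset rowcount -1\ngo\n' > d1.sql
# substring -> substr, and drop the trailing "set rowcount -1" / "go" lines;
# the alias/subquery rewrites still need manual editing.
sed -e 's/substring(/substr(/g' \
    -e '/set rowcount/d' \
    -e '/^go[[:space:]]*$/d' \
    d1.sql > d1_oracle.sql
cat d1_oracle.sql
```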

A total of 22 default SQL files need to be generated in this way. Then, in sqlplus, run set timing on to enable the timer and execute each generated file with @ for testing.
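A timing run for one of the generated files might look like the following session fragment (d1.sql is the file produced above; sqlplus prints an Elapsed line after each statement):

```text
SQL> set timing on
SQL> @d1.sql
```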

If you want to force parallel query, set it in sqlplus:
alter session force parallel query;

With this setting, the optimizer uses parallel query regardless of whether parallelism is enabled on the underlying tables.
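The statement also accepts an explicit degree of parallelism; a sketch (the degree 8 is an arbitrary choice for illustration):

```sql
ALTER SESSION FORCE PARALLEL QUERY PARALLEL 8;
```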

Sqlldr supplement

First, sqlldr can load in two modes: conventional path and direct path. Conventional path converts the data into insert statements and loads it through the SGA; direct path formats the data in memory directly into database blocks, avoiding the overhead of statement parsing and logging. For bulk loads of data-warehouse scale, direct path loading is therefore generally used.

In addition to insert, the loading method can be append, replace, or truncate. insert requires the table to be empty; otherwise sqlldr reports an error and stops. To add records to a table, specify append; to replace the existing data in the table, use replace or truncate. replace deletes all existing records with a delete statement, so it is very slow if the table to be loaded already holds many rows. truncate uses the truncate SQL command and executes faster because it does not delete rows one by one; however, truncate cannot be rolled back, so set this option with care. Other parameters can also affect this option.
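As a sketch, a control file for appending dbgen's pipe-delimited lineitem.tbl might look like this (file name and column list are assumptions based on the standard TPC-H lineitem schema; dbgen's .tbl output uses '|' as the field terminator):

```text
LOAD DATA
INFILE 'lineitem.tbl'
APPEND
INTO TABLE lineitem
FIELDS TERMINATED BY '|'
(l_orderkey, l_partkey, l_suppkey, l_linenumber, l_quantity,
 l_extendedprice, l_discount, l_tax, l_returnflag, l_linestatus,
 l_shipdate DATE 'YYYY-MM-DD', l_commitdate DATE 'YYYY-MM-DD',
 l_receiptdate DATE 'YYYY-MM-DD', l_shipinstruct, l_shipmode, l_comment)
```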

Finally, if the same table has multiple external data files, you can increase the loading speed by setting parallel=true and loading them in parallel. For example:
sqlldr scott/tiger control=lineitem.ctl direct=true parallel=true

Note that parallel=true only declares that multiple sqlldr processes may load the same table concurrently; it does not parallelize the current statement. A single sqlldr command still loads serially.

Windows does not support the & background-process syntax, but you can open multiple cmd windows and run a different sqlldr command in each to achieve the same effect. Note that the server's I/O capacity has a huge impact on loading: if the read/write bandwidth is already saturated, sqlldr is mostly waiting for I/O to complete, and starting more sqlldr instances will not improve loading performance.
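One way to script this on Windows is the start command, which launches each sqlldr in its own window (the control-file names are assumptions; each instance should point at a different data file):

```text
start sqlldr scott/tiger control=lineitem1.ctl direct=true parallel=true
start sqlldr scott/tiger control=lineitem2.ctl direct=true parallel=true
```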

Use exp/imp

Import (imp) and export (exp) are the two oldest surviving Oracle command-line tools. exp/imp works well as a dump tool, especially for small databases: tablespace migration, table extraction, and detection of logical and physical corruption are among its greatest strengths. It is also a good logical secondary backup to complement a physical backup of a small database. For larger databases, especially TB-scale data warehouses, exp/imp becomes increasingly inadequate, and backup shifts to RMAN and third-party tools.

In short, exp and imp can export data from Oracle databases and tables into files, which can serve as backups and be moved between machines. For example:
exp scott/tiger@orcl file=d:\expfile.dmp tables=(part)

imp scott/tiger@orcl file=d:\expfile.dmp tables=(part)