How to generate massive amounts of test data for SQL tuning


Scenario: there is a slow SQL statement, and the DBA needs to add an index to optimize it. How do you know whether the index actually helps? This has to be tested and adjusted over and over, and you cannot test directly against the online database. The usual approach is to create a test table in a test environment, copy some data from an online replica into it, then add the index and run EXPLAIN.


But there is a catch: export too little data and the execution plan does not show the effect; export too much and you flush the online machine's buffer pool and hammer its I/O. It would be much better to have a tool that generates data directly, producing 1 million or even 10 million rows just like production.
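A generator of this kind never needs to touch production at all: it only has to emit batched INSERT statements against the test database. As a rough illustration of the idea (this is not the tool discussed below; the table and column names are made up), a script could produce multi-row INSERTs like this:

```python
import random
import string

def batched_inserts(table, n_rows, batch_size=1000):
    """Yield multi-row INSERT statements for n_rows of synthetic data.

    Multi-row VALUES lists load much faster than one INSERT per row.
    The table and column names (id, k, c) are illustrative only.
    """
    rows = []
    for i in range(1, n_rows + 1):
        c = "".join(random.choices(string.ascii_lowercase, k=10))
        rows.append("(%d, %d, '%s')" % (i, random.randint(0, 1000), c))
        if len(rows) == batch_size:
            yield "INSERT INTO %s (id, k, c) VALUES %s;" % (table, ",".join(rows))
            rows = []
    if rows:  # emit the final, possibly short, batch
        yield "INSERT INTO %s (id, k, c) VALUES %s;" % (table, ",".join(rows))

stmts = list(batched_inserts("sbtest", 2500))
print(len(stmts))  # 2500 rows in batches of 1000 -> 3 statements
```

Feeding the statements through a MySQL client into the test library gives the same end result as exporting rows from a replica, without touching production.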


sysbench's prepare phase, which runs before the stress test, can already generate data; generating 1 million rows looks like this:

sysbench --test=oltp --mysql-table-engine=myisam --oltp-table-size=1000000 --mysql-socket=/tmp/mysql.sock --mysql-user=test --mysql-host=localhost --mysql-password=test prepare

But the resulting table structure is fixed, the SQL statements used for the stress test are fixed, and the SQL statements from production cannot be tested against it.

CREATE TABLE `sbtest` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `k` int(10) unsigned NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
  KEY `k` (`k`)
);

Is there a tool that takes a user-defined table structure and generates millions of rows for it? There is: a tool called DataGen (download link at the end of the article).

drwxr-xr-x. 2 root mysql     4096 Sep 27  2016 bizsql
drwxr-xr-x. 2 root mysql     4096 May 31 20:51 conf
-rw-r--r--. 1 root mysql 23698092 Sep 27  2016 datagen.jar
-rwxr-xr-x. 1 root mysql      147 Sep 27  2016 datagen.sh
-rw-rw-r--. 1 root mysql    31599 May 31 20:54 envbuilder.log
-rw-r--r--. 1 root mysql     1741 May 31 20:53 example.schema
-rw-r--r--. 1 root mysql     1336 May 31 09:42 example.schema_backup
-rw-r--r--. 1 root mysql     2062 Sep 27  2016 readme

The method takes just two steps. Step 1: write the table structure you want and the number of rows to generate into the example.schema file. For example, to generate 1 million rows, add a comment at the end of the table definition: /*{rc{1000000}}*/

CREATE TABLE `test`.`tbl_test` (
  `post_id` BIGINT(20) DEFAULT '0',
  `star` INTEGER(11) DEFAULT '0',
  `view_count` INTEGER(11) DEFAULT '0',
  `bean` INTEGER(11) DEFAULT '0',
  `nearby` INTEGER(11) DEFAULT '0',
  PRIMARY KEY (post_id),
  INDEX (poster_uid)
) COLLATE=utf8mb4_general_ci ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 /*{rc{1000000}}*/;
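The /*{rc{1000000}}*/ comment is just a machine-readable row count appended to the DDL. As a hypothetical sketch of how such an annotation can be extracted (the function name and regex are illustrative, not DataGen's actual code):

```python
import re

def parse_row_count(ddl):
    """Extract N from a /*{rc{N}}*/ annotation in a DDL string.

    Illustrative only: DataGen's real parser is not published here,
    but the annotation format follows the example above.
    """
    m = re.search(r"/\*\{rc\{(\d+)\}\}\*/", ddl)
    if m is None:
        raise ValueError("no /*{rc{N}}*/ annotation found")
    return int(m.group(1))

ddl = "CREATE TABLE t (id BIGINT PRIMARY KEY) ENGINE=InnoDB /*{rc{1000000}}*/;"
print(parse_row_count(ddl))  # 1000000
```

Because the annotation lives inside a SQL comment, the same schema file remains valid DDL for MySQL itself.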

Step 2: fill in the account and password for connecting to the test database; only one line needs to be changed:


<property name="obURL" value="jdbc:mysql://database IP:database port/database name?user=username&amp;password=password"/>

vi conf/datagen.xml

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
       classpath:org/springframework/beans/factory/xml/spring-beans-2.5.xsd">
    <bean id="dataGen" class="com.alipay.obmeter.tools.DataGen">
        <property name="obURL" value="jdbc:mysql://database IP:database port/database name?user=username&amp;password=password"/>
        <property name="inputDDL" value="example.schema"/>
        <property name="rowCountPerTable" value="1000000"/>
        <property name="maxThreadCountPerTable" value=""/>
        <property name="maxThreadCount" value=""/>
        <property name="dropTableFirst" value="true"/>
        <property name="needFreeze" value="false"/>
        <property name="staticRatio" value="1.0"/>
    </bean>
</beans>

Then run the shell script; it creates the table in the test database and inserts the data:

[root@… datagen]# /bin/bash datagen.sh

[2017-05-31 08:53:15] [WARN] [DataGen:184] - Parsing DDL ...
[2017-05-31 08:53:15] [WARN] [DataGen:187] - Creating table ...
[2017-05-31 08:53:15] [WARN] [MultiThreadPrepareDataComparator:508] - Preparing generators ...
[2017-05-31 08:53:15] [WARN] [MultiThreadPrepareDataComparator:510] - Generating dynamic data ...
[2017-05-31 08:54:34] [WARN] [MultiThreadPrepareDataComparator:526] - Generate done.

The test database now contains 1 million rows of data.

mysql> SELECT COUNT(*) FROM test.tbl_test;
+----------+
| COUNT(*) |
+----------+
|  1000000 |
+----------+
1 row in set (0.16 sec)

Now you can add indexes and run EXPLAIN on the real SQL statements from production.

mysql> explain select post_id from test.tbl_test where post_type <> 1 and check_status = 9 and flag = 1 and post_time < 1496178301 order by post_time asc limit 200;
+----+-------------+----------+-------+---------------+-----------+---------+------+--------+-------------+
| id | select_type | table    | type  | possible_keys | key       | key_len | ref  | rows   | Extra       |
+----+-------------+----------+-------+---------------+-----------+---------+------+--------+-------------+
|  1 | SIMPLE      | tbl_test | range | post_time     | post_time | 9       | NULL | 501491 | Using where |
+----+-------------+----------+-------+---------------+-----------+---------+------+--------+-------------+
1 row in set (0.00 sec)

Add the index:

mysql> ALTER TABLE test.tbl_test ADD INDEX idx_f (check_status, flag, post_type, post_time);
Query OK, 0 rows affected (4.45 sec)
Records: 0  Duplicates: 0  Warnings: 0

Run EXPLAIN one more time: the rows scanned drop from about 500,000 to 2.

mysql> explain select post_id from test.tbl_test where post_type <> 1 and check_status = 9 and flag = 1 and post_time < 1496178301 order by post_time asc limit 200;
+----+-------------+----------+-------+-----------------+-------+---------+------+------+------------------------------------------+
| id | select_type | table    | type  | possible_keys   | key   | key_len | ref  | rows | Extra                                    |
+----+-------------+----------+-------+-----------------+-------+---------+------+------+------------------------------------------+
|  1 | SIMPLE      | tbl_test | range | post_time,idx_f | idx_f | 15      | NULL |    2 | Using where; Using index; Using filesort |
+----+-------------+----------+-------+-----------------+-------+---------+------+------+------------------------------------------+
1 row in set (0.00 sec)

Once you have verified the index and are sure it optimizes the SQL, add the same index in the online environment.




DataGen has some other quite powerful features as well.

For example, a field can be restricted to a fixed set of values, such as a status field taking 0, 1, or 2, with a specified probability for each value.

For example, to simulate production user UIDs, you can limit a field's random numbers to a range, say from 00000001 to 899999999.
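Both features boil down to ordinary weighted and bounded random generation. A minimal sketch of the idea in Python (the 70%/20%/10% weights and the zero-padding are illustrative assumptions, not DataGen's syntax):

```python
import random

random.seed(42)  # reproducible runs

def gen_row():
    """Generate one synthetic row: a weighted status and a bounded UID.

    status is 0, 1 or 2 with assumed probabilities 70%/20%/10%;
    uid is drawn uniformly from 1..899999999 and zero-padded to
    nine digits to mimic production-style user IDs.
    """
    status = random.choices([0, 1, 2], weights=[0.7, 0.2, 0.1])[0]
    uid = "%09d" % random.randint(1, 899999999)
    return status, uid

rows = [gen_row() for _ in range(1000)]
print(rows[:3])
```

Matching the production value distribution matters for index testing: the optimizer's row estimates, and therefore the execution plan, depend on how selective each value actually is.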

See the readme for a detailed introduction.


Baidu link: https://pan.baidu.com/s/1pKGQLkB  Password: 6t4u

This article is from the MySQL DBA dada's blog; please keep the source: http://dadaman.blog.51cto.com/11373912/1931186
