How to insert random string data in MySQL

Source: Internet
Author: User

Application scenarios:
Sometimes a database table needs to be filled with test records, and the following scripts are useful for that.

Create a table:

CREATE TABLE `tables_a` (
  `id` int(10) NOT NULL DEFAULT '0',
  `name` char(50) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Create a function that generates a random string:

set global log_bin_trust_function_creators = 1;
DROP FUNCTION IF EXISTS rand_string;
DELIMITER //
CREATE FUNCTION rand_string(n INT)
RETURNS VARCHAR(255)
BEGIN
    DECLARE chars_str varchar(100) DEFAULT 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';
    DECLARE return_str varchar(255) DEFAULT '';
    DECLARE i INT DEFAULT 0;
    WHILE i < n DO
        SET return_str = concat(return_str, substring(chars_str, FLOOR(1 + RAND()*62), 1));
        SET i = i + 1;
    END WHILE;
    RETURN return_str;
END //
delimiter ;
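When the test strings are produced in application code rather than in the database, the same character-picking loop can be mirrored in Java. This is only a sketch: the class and method names (RandString, randString) are hypothetical and do not come from the original scripts. Note that substring() in MySQL is 1-based, which is why the function above uses FLOOR(1 + RAND()*62), while Java indexes from 0.

```java
import java.util.Random;

// Hypothetical helper mirroring the rand_string() function above.
class RandString {
    // Same 62-character alphabet as chars_str in the MySQL function.
    static final String CHARS =
        "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";

    // Builds a string of n characters, each drawn uniformly from CHARS.
    static String randString(int n, Random rng) {
        StringBuilder sb = new StringBuilder(Math.max(n, 0));
        for (int i = 0; i < n; i++) {
            // nextInt(62) yields 0..61, the 0-based counterpart of FLOOR(1 + RAND()*62).
            sb.append(CHARS.charAt(rng.nextInt(CHARS.length())));
        }
        return sb.toString();
    }
}
```

A call such as RandString.randString(10, new Random()) returns a 10-character string; passing a seeded Random makes the output repeatable between test runs.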

Create a procedure that inserts rows into the table. x is the starting id, y is the id at which insertion stops (exclusive), and z is the length of the random string generated for each row:

delimiter //
create procedure test(x int(10), y int(10), z int(10))
begin
  DECLARE i INT DEFAULT x;
  while i < y do
    insert into tables_a values(i, rand_string(z));
    set i = i + 1;
  end while;
end //
delimiter ;

MySQL random data generation and insertion

The DBLP database contains very little citation information: on average, a paper cites only 0.2 other papers. A paper that used DBLP as its experimental dataset mentions that citation information can be added at random. Inspired by this, I decided to add 20 random citations to each paper, so I wrote the following SQL statement:

String SQL = "insert into citation (pId1, pId2) values ((select pId from papers limit ?, 1), (select pId from papers limit ?, 1))";

The statement is submitted to the database in batches through a PreparedStatement.

The first parameter is the row offset of the citing paper, from 0 to N (N is the total number of rows in papers). The second parameter is one of 20 non-repeating random numbers generated in Java, also in the range 0 to N. The parameters are set inside nested for loops, and the accumulated rows are submitted to the database one batch at a time.
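The 20 non-repeating random numbers per paper could be produced as in the sketch below. The original Java code is not shown, so the class and method names (RandomOffsets, pickDistinct) are hypothetical, and the range is assumed to be 0 to N-1 because LIMIT offsets are zero-based. The sketch does not prevent a paper from drawing its own offset.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

// Hypothetical helper: draws distinct random row offsets for one paper.
class RandomOffsets {
    // Returns `count` distinct integers drawn uniformly from [0, n).
    static List<Integer> pickDistinct(int n, int count, Random rng) {
        if (count < 0 || count > n) {
            throw new IllegalArgumentException("count must be between 0 and n");
        }
        // A set rejects duplicates; retrying is cheap while count is much smaller than n.
        Set<Integer> chosen = new LinkedHashSet<>();
        while (chosen.size() < count) {
            chosen.add(rng.nextInt(n));
        }
        return new ArrayList<>(chosen);
    }
}
```

For each paper, pickDistinct(N, 20, rng) would supply the 20 values bound to the second placeholder. Rejection sampling is chosen here because 20 is tiny compared with the number of papers; for counts close to N, shuffling a list of all offsets would be the safer choice.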

This code uses LIMIT to select a random tuple, a trick I was quite pleased with at first. I assumed that because every SELECT ran inside the database, with no extra JDBC round trips, the job would finish quickly. In fact, inserting 100,000 (10,000 × 10) records took 22 minutes. The final experiment needs to insert millions of records, which at that rate would take about 14 hours.

So I went back and wrote a series of similar programs to find the time bottleneck, and finally traced it to the SELECT with LIMIT, which consumes most of the time. I had chosen LIMIT because the numbers are generated randomly and each number has to be mapped to a tuple, that is, to a row position. The primary key of the papers table is not an incrementing int, so no ready-made row number exists. However, LIMIT offset, 1 has to read and discard every row before the offset, whereas a lookup on an indexed column does not. The fix is to add an auto_increment column named temp to the papers table (MySQL requires an auto_increment column to be a key, so it is indexed) and to drop it again once the citations have been inserted. The SQL statement then becomes:

String SQL = "insert into citation (pId1, pId2) values ((select pId from papers where temp = ?), (select pId from papers where temp = ?))";
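Put together, the temp-column approach might look like the JDBC sketch below. It is not the author's code: the class, method, and constant names are hypothetical, the two ALTER TABLE statements are an assumption about how the temp column would be added and dropped, and the batch size is left to the caller. Because auto_increment values start at 1, the pairs passed in are assumed to hold temp values from 1 to N rather than zero-based offsets.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.List;

// Hypothetical sketch of the batched insert using the temp column.
class CitationBatch {
    // Assumed DDL: existing rows are numbered 1..N and the UNIQUE key provides the index.
    static final String ADD_TEMP =
        "ALTER TABLE papers ADD COLUMN temp INT NOT NULL AUTO_INCREMENT UNIQUE";
    static final String DROP_TEMP = "ALTER TABLE papers DROP COLUMN temp";
    static final String INSERT_SQL =
        "INSERT INTO citation (pId1, pId2) VALUES ("
        + "(SELECT pId FROM papers WHERE temp = ?), "
        + "(SELECT pId FROM papers WHERE temp = ?))";

    // Inserts one citation per {citing, cited} pair and returns the number of rows queued.
    static int insertAll(Connection conn, List<int[]> pairs, int batchSize)
            throws SQLException {
        boolean oldAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false); // commit once per batch instead of once per row
        int queued = 0;
        int total = 0;
        try (Statement ddl = conn.createStatement();
             PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            ddl.execute(ADD_TEMP);
            for (int[] pair : pairs) {
                ps.setInt(1, pair[0]); // temp value of the citing paper
                ps.setInt(2, pair[1]); // temp value of the cited paper
                ps.addBatch();
                total++;
                if (++queued == batchSize) {
                    ps.executeBatch();
                    conn.commit();
                    queued = 0;
                }
            }
            if (queued > 0) { // flush the final partial batch
                ps.executeBatch();
                conn.commit();
            }
            ddl.execute(DROP_TEMP);
        } finally {
            conn.setAutoCommit(oldAutoCommit);
        }
        return total;
    }
}
```

The connection is expected to come from DriverManager.getConnection with a MySQL JDBC driver on the classpath. In MySQL, ALTER TABLE commits implicitly, so the two DDL steps are not part of the batch transactions.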

Inserting the same number of records again takes 38 s. The efficiency is greatly improved, though I do not know whether further optimization is possible.
