How MySQL tens data is optimized

Last Update:2016-06-13 Source: Internet

Author: User

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

MySQL Tens data how to do optimization
This post was last edited by Demofere on 2015-02-11 13:56:31


function Insert () {
Global $m;
if (!isset ($_request[' STRs ')) | |!isset ($_request[' type '])
|| !isset ($_request[' hash ')) {
echo ' param error ';
Return
}

STRs for all strings
$poststr = $_request[' STRs ');
        
$xstrs = Json_decode (Stripslashes ($POSTSTR), true);
        
$type = $_request[' type '];
$hash = $_request[' hash '];

if (count ($xstrs) <= 0) {
$msg = ' str error ';
Dslog::errlog ($msg. $poststr);
Echo $msg;
Return
}

if ($type = = ' 0 ') {
$table = ' white ';
}
else if ($type = = ' 1 ') {
$table = ' black ';
}
else{
$msg = ' type error ';
Dslog::errlog ($msg);
Echo $msg;
Return
}

$strs = Array ();
        
for ($i = 0; $i < count ($xstrs); $i + +) {
$strtmp = $xstrs [$i];
$strtmp = Trim ($strtmp);
$strtmp = Strtolower ($strtmp);
$strtmp = Addslashes ($strtmp);
if (strlen ($strtmp) > 256) {
$strtmp = substr ($strtmp, 0, 256);
}
if (strlen ($strtmp) >= 7)
{
Array_push ($strs, $strtmp);
}
}
        
        
Stitching all strings
$tmp = ' ('. Implode (', ', ' ', $strs). ') ';

Get string that already exists
$sql = "SELECT * from $table the where str in $tmp";
$ret = mysql_query ($sql, $m);
if (! $ret) {
$msg = ' EXEC error: '. Mysql_error ($m). ', '. $sql;
Dslog::errlog ($msg);
Echo $msg;
Return
}

$exists = Array ();
$notexists = Array ();
$count = mysql_num_rows ($ret);
for ($i = 0; $i < $count; $i + +)
{
$item = Mysql_fetch_assoc ($ret);
if (! $item) {
Break
}
Array_push ($exists, $item [' str ']);
}

for ($i = 0; $i < count ($strs); $i + +) {
if (In_array ($strs [$i], $exists)) {
Continue
}

Array_push ($notexists, $strs [$i]);
}

for ($i = 0; $i < count ($exists); $i + +) {
$exists [$i] = Addslashes ($exists [$i]);
}
for ($i = 0; $i < count ($notexists); $i + +) {
$notexists [$i] = Addslashes ($notexists [$i]);
}

if (count ($exists) > 0) {
Update the Count field for a string that already exists
$tmp = ' ('. Implode (', ', ' ', $exists). ') ';
$time = Date (' Ymdhi ');
$sql = "Update $table set count=count+1 where str in $tmp";
$ret = mysql_query ($sql, $m);
if (! $ret) {
$msg = ' EXEC error: '. Mysql_error ($m). ', '. $sql;
Dslog::errlog ($msg);
Echo $msg;
Return
}

Update the UPD field for a string that already exists
$sql = "Update $table set upd= ' $time ' where str in $tmp";
$ret = mysql_query ($sql, $m);
if (! $ret) {
$msg = ' EXEC error: '. Mysql_error ($m). ', '. $sql;
Dslog::errlog ($msg);
Echo $msg;
Return
}
}


Insert new information
if (count ($notexists) > 0) {
$time = Date (' Ymdhi ');
$sql = "Insert ignore into $table (Str,hash,count, upd) values";
for ($i = 0; $i < count ($notexists); $i + +) {
$str = $notexists [$i];
$CRC = sprintf ("%u", CRC32 ($STR));
$sql. = "(' $str ', ' $CRC ', ' 1 ', ' $time '),";
}

$sql = substr ($sql, 0, strlen ($sql)-1);
$ret = mysql_query ($sql, $m);
if (! $ret) {
$msg = ' Insert error: '. Mysql_error ($m). ', '. $sql;
Dslog::errlog ($msg);
Echo $msg;
Return
}
}

Echo!! $ret;
}

I'm now going to count the strings, count the number of each string, just like a map, the string: The number, of course, the database also saved a bit of the CRC and other information in the character
The Insert function receives the post-STRs, which is an array of strings, with an average of 1500 strings per post.
Now the records in the database 1200W, each time the insert takes an average of 20 seconds (often 30 seconds timeout)
I would like to ask you, how to do optimization, now only a small part of the statistics are expected to complete the data volume of about 11 billion
------to solve the idea----------------------
Can you make sure that the bottleneck is on the 3 statements that the database makes for querying and modifying?
You said that the average number of strings that came in each post was 1500.
Does that mean 1500 of the words? If not (string length) then the average word is 20 characters, and there are 75 words
Your filter condition is str in $tmp (' xxx ', ' xxx ' ....) This is the 1500 (75) string comparison of each record in the table, and the hit rate is at most 1/1500 (1/75) Do you think it is appropriate to do so?
The strength of the database is the comparison between records, and you're doing a weaker column comparison
You can use the incoming data to construct a temporary table (one word per line) and then associate the operation with the primary table. So you can take advantage of the database.

Again, your 2 consecutive update statements cannot be merged together?
------to solve the idea----------------------
1, MySQL has memory table, you do not use
2, the two update merge, should be able to reduce 10 seconds
------to solve the idea----------------------
Mysql_error the efficiency of this library function is not high bar and the official is not recommended, I have almost no use of this, I learned PHP, directly started PDO.

Also, you use the keyword in this, as if it is not going to index it.
------to solve the idea----------------------
Citation:

feels almost the same

It should be your test environment problem, the table query in most cases is faster than in
The results you've measured are probably the following reasons.
1. The data of the query table does not reach a certain amount
2. Two different query methods isolate the result set
3. Query fields are not indexed or not added
4. Affected by too many other indexes
5. Other reasons
I think the obvious is the 2nd reason.
You did not specify a query field when you linked the table, the field must be more than the table is not connected
There will be one-to-many queries on the table, so that a record will become multiple, plus GROUP by



This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More

How MySQL tens data is optimized

Contact Us

What's Trending

Top 10 Tags

Top 10 Keywords

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support

How MySQL tens data is optimized

Contact Us

What's Trending

Top 10 Tags

Top 10 Keywords

Trending Topic

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support