MySQL Cookbook 8th Chapter reading notes

Source: Internet
Author: User

1, using the Count function to generate the summary

The function function is easy to use, and the COUNT (*) statement is quick for the MyISAM table, but for BDB or InnoDB tables, avoid using it as much as possible, because the statement requires a full scan and is slow, and the solution is from Information_ Number of table_rows extracted in schema database

Use with the IF statement:

Create a view to simplify the usage summary (when using summaries frequently):

Working with Views:

With min (), Max () is similar:

Similar to the sum () and AVG () functions generate a summary:

2. Use the distinct function to eliminate duplicates

3, find the maximum child and minimum values related to the value

The cluster functions such as min () and Max () cannot be used in the WHERE clause, but we will have a numeric value similar to the maximum population, and the corresponding city name can be stored in a user-defined variable by the maximum population value, and then the data row will be compared to the value of the variable to solve the problem:

There is another way to pick a different data column from a data row that contains a minimum or maximum value: Use a connection, place the selected data in another table, and then match the table with the original table to the data row of that numeric value.

4, dividing the digest into subgroups

To more accurately find out how much information each sender sends from each host, you should use two columns of data to divide the subgroups.

The count here is the number of words each sender (SrcUser) sends from each host (Srchost).

Sub-groups The data rows in the mail table against SrcUser, and then displays the size of the largest message sent in each subgroup and the size of the smallest message

A recurring problem: When group by is used to divide subgroups, the only clauses that can be selected are the clauses that divide the subgroups and the values of the summaries computed from the subgroups.

Red is marked as an error because the subgroup is divided into name, and the presence of trav_date is not a clause in the subgroup and a digest value computed from the subgroup, and the data rows that are associated with the minimum maximum subgroup values are typically used to introduce the connected technology:

5, select the group using the identified attributes

The difference between where and having is that the having operation is able to carry out additional conditional constraints on the cluster subgroups on the set of data that have been selected and partitioned into subgroups.

Of course, you can also use aliases:

Use having and count to determine the uniqueness of a record within a subgroup:

can also be used with union values. For example, in order to find a Send/receive pair that only sends a message, you can look for a union that only occurred once in the mail table:

You can also use expression results to group:

6, category no category data

For example, in the absence of duplicate data, such as:

There is no duplicate number, so it is not good to use group by to divide the subgroup.

Divide the population into 5个百万 levels:?????

7, Control summary display order

To sort by the name of the molecule group, in days to travel:

8, find the minimum or Maximum summary value

Min (), Max () cannot be used as a parameter to other clustering functions. For example: You can easily find the total mileage of each driver:

does not work properly,

However, you can sort the data rows first, and then use limit to select the first row of data:

9, generate reports that include summaries and lists

Use Pthon to generate summary and list reports:

+ View Code?
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 import Cookbook import sys import MySQLdb name_map={} conn=Cookbook.connect() cursor=conn.cursor() cursor.execute("""             select name, count(name),sum(miles)             from driver_log group by name             """) for(name,days, miles) in cursor.fetchall():     name_map[name]=(days,miles) # 选择每个司机的行程数并打印报告,显示行程列表中每个司机的摘要汇总项 cursor.execute("""                 select name, trav_date, miles                 from driver_log group by name, trav_date                 """) cur_name="" for (name, trav_date, miles) in cursor.fetchall():     if cur_name !=name:         print "Name: %s; days on road: %d; miles drien:%d "\                     %(name, name_map[name][0],name_map[name][1])         cur_name=name     print "date:%s, trip length:%d "%(trav_date,miles)     cursor.close()

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.