Simplifying SQL for multi-layer Grouping association calculations

Source: Internet
Author: User

in the development of database application, we often need to face complex SQL computation, such as the correlation computation in multi-layer grouping. Because SQL grouping must be aggregated at the same time and object-related access is not possible, dealing with such problems can be complex and can only be accomplished with advanced techniques for nesting multi-layered subqueries with window functions. The collector supports true grouping, intuitive object-related access, and easier resolution of such problems.
The Grouping Association encounters more in the actual business, for example http://forums.bit-tech.net/showthread.php?t=207052. The following is a more general example of the actual business in the link to illustrate the specific process by which the collector implements the Grouping association.
Calculation Objective: To find out the missing DVD store, that is, the existing DVD copy less than 4 categories of stores.
Data structure: Branch table, store DVD branch information. DVD table, storage DVD title and classification information, DVD is virtual data, such as "Transformers 4" is a DVD, but it is not a visible disc. The Dvdcopy table, which stores multiple copies of the DVD, is a true CD-ROM and is stored as an entity in each branch. Note: The Dvdcopy table is associated with the Branchid field and the branch table, with the Dvdid field and the DVD table. Here's some data:

Branch table:


DVD Table:


dvdcopy Table:


Description:
(1) The result of the calculation should be some records in the branch table.

(2) If the Status field in the Dvdcopy table is "Miss", the disc is missing. If the lastdatereturned field is empty, the disc loan has not been returned. Obviously, lost or not returned discs are not within the scope of calculation and should be filtered out.

(3) It should be considered that some stores may not have records in the Dvdcopy table, although this situation is relatively rare.

Ideas :
(1) From the Dvdcopy table to filter out a valid, store existing DVD copy.

(2) The Dvdcopy table is grouped according to bid, and each group is a copy of all DVDs in a store.

(3) Find the DVD copy corresponding to each store, and then calculate the number of these DVDs.

(4) Find out the existing DVD classification of less than 4 stores, such stores meet the requirements.

(5) Find the stores that have not appeared in the Dvdcopy table, such stores also meet the requirements.

(6) Merge the two types of stores that meet the requirements.

The Collector code :

A1-A3: Retrieves data from the database, named Variable branch, DVD, Dvdcopy, respectively. The results of the calculation are as follows:


A4: Switch the Dvdid field in the Dvdcopy table to the corresponding record in the DVD table and switch the bid field to the corresponding record in the branch table. Note: This step is the basis for object-based association access and requires the use of the switch function. After calculation, the dvdcopy is as follows:


    A blue font indicates that the field corresponds to a record, which can be viewed after a click, such as:


At this point, just use the operator "." You can do object-related access, such as dvdcopy. (Dvdid). (category) indicates the category of DVD corresponding to each DVD copy. Dvdcopy. (BID), you can get the store details for each DVD copy (full record).

A5:=dvdcopy.select (status!= "Miss" && lastdatereturned!=null)

This code is used to filter the data, namely: Lost, non-return DVD copy is not in the calculation range, filtered after the value of A5 as follows:

A6:=a5.group (BID)

The code above is used to group the data in A5 by bid, and each row represents all DVD copies of a store, as follows:

    Click the blue font to see the members in the group:


As you can see, the function group only groups the data and does not need to summarize the calculations at the same time, unlike the grouping functions in SQL. Sometimes we need to do more in-depth processing of the grouped data, rather than simply summarizing it, then it is more convenient to use the group function of the collector, such as the code in A7.

A7:=a6.new (~. Bid:bonlist, ~. (DVDID). ID (Category). Count (): Catcount)

The code above is used to calculate the corresponding DVD copies of each store in several categories. Function new can generate a new object based on the data in A6 A7,a7 has two columns: Bonlist and catcount,bonlist are directly from the bid column of the data in the A6, catcount the Dvdid column from the data in the group. Catcount's algorithm is divided into three parts: ~. (Dvdid) Find DVD records for all DVD copies in each store; ID (Category) removes duplicate category;count () from these DVD records to calculate the number of Category. The calculation results are as follows:

that is: B002 store has 3 types of DVD copies, B003 stores have 3 categories, B001 stores have 4 categories.

A8:A7.select (CATCOUNT<4)

The above code executes the query, find Catcount less than 4 stores, the result is as follows:

The above-listed stores are calculated according to the dvdcopy table. However, some stores that are badly out of stock may not appear in the Dvdcopy table, for example, all DVD copies of the store are borrowed, or the store does not have a DVD copy at all, so merge this part of the store in the following code:

A9:=a8. (bonlist) | (Branch \ A7. (bonlist))

In the preceding code, the operator ' | ' Represents a union calculation of two datasets (which can be substituted with the unions function), and the operator "\" for the differential set calculation (available function diff instead). A8. (bonlist), Branch, A7. (Bonlist) Represents: Dvdcopy table in stores, all stores, dvdcopy table in the presence of stores, the values are:

A9 is the final result of this case, the value of which is:


You can also use the indirect method to achieve this case, such as: First calculate the "store not out of stock", and then make a difference with the branch table, the calculation results should be the same as A9.

It is important to note that SQL lacks explicit collections and cannot represent datasets with variables such as A8 or branch, so the short code above must be implemented with a few lengthy SQL.

In addition, the collector can be called by the Java program, the call method is similar to the normal database, using the JDBC interface it provides to the Java main program to return the ResultSet form of calculation results, specific methods can refer to the relevant documents.



Copyright NOTICE: This article for Bo Master original article, without Bo Master permission not reproduced.

Simplifying SQL for multi-layer Grouping association calculations

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.