How the Collector optimizes SQL Computing (3) sequence operations

Source: Internet
Author: User

Cross-row references

Early SQL does not directly support cross-line references, and it is extremely cumbersome to generate serial numbers and then join. SQL after the introduction of the window function makes it easier to reference other row data, but the notation is still not concise, and the code is very long when there are multiple cross-line reference items. Moreover, as mentioned above, the window function is implemented on the basis of other operation result sets, and the re-reference of the computed value of window function is the form of sub-query, which is still cumbersome.

MySQL does not support window functions, but it supports the use of variables in SQL, which can be referenced to the preceding rows, but not to the subsequent rows.

The collector provides a convenient and natural cross-line reference syntax.

Each product monthly sales table structure is: product, month, sales; now we need to find a record of 10% more sales than last month.

A

1

=db.query ("Select * from sales Form order by product, month")

2

=a1.select (if (Product = = Products [1], Sales/Sales [-1]) >1.1)

After sorting, you can simply use [-1] to refer to the previous January data and can be filtered directly based on the cross-row computed value. The SQL window function uses a subquery, and MySQL defines two temporary variables.

Then calculate the moving average of sales for one months before and after each month in the table:

A

1

=db.query ("Select * from sales Form order by product, month")

2

=a1.derive (if (Product = = Product [-1]&& Product = = product [1], Sales { -1:1}.avg ()): moving average)

The calculated moving average involves a backward reference and a collection reference, with [1] referencing the next row of data, and { -1:1} can refer to a collection of field values from the previous row to the next row. Similarly, SQL window functions require subqueries to calculate the corresponding rows before moving averages, while MySQL's variables cannot be referenced in the back, it is difficult to calculate directly.

One more example, the simplified event table structure is: Serial number, moment, ... The time should be incremented with the serial number, but there may be errors, and you need to find the record where the time is not incremented with the ordinal.

A

1

=db.query ("SELECT * from event table order by ordinal")

2

=a1.select (Time!=max (Time {: 0}) | | Moment!=min (Time {0:}))

Comparison of all records before and after

The collection can also be taken from scratch or taken to the tail. SQL window functions also support a similar notation, but two comparisons should be done in two different directions of the sort, of course, the subquery must be used.

Ordered grouping

SQL provides only the order-independent equivalence groupings, but sometimes the grouped key values are not found in each record, but are related to the order of the records, in which case the SQL needs to use the window function (or other more cumbersome means) to produce the serial number.

The collector provides a grouping mechanism associated with the order, which is convenient for calculations related to continuous intervals.

The structure of the income and expenditure table is: month, income, expenditure; Find records of those months with consecutive losses up to March or more.

A

1

=db.query ("SELECT * from balance sheet order by month")

2

[Email protected] (Revenue > Expenditure). Select (~. Revenue <~. Expenses && ~.len () >=3). Conj ()

[Email protected] indicates that only adjacent records are compared when grouped, and a new group is separated if the neighboring values change. In this way, we can divide the records into profit, loss and profit according to the comparison of income expenditure 、... Such a group, and then take out the group with a loss and a member of not less than 3 to merge together.

Or this watch, hoping to calculate the longest continuous increase in revenue for several months. You can design such a grouping mechanism: When revenue increases and the month is divided into a group, when the income decreases, a new group is separated, and the maximum value of the group members is counted.

A

1

=db.query ("SELECT * from balance sheet order by month")

2

[Email protected] (Revenue < income [-1]). Max (~.len ())

[email protected] will separate a new group when the condition changes, i.e. when the revenue is reduced.

With the support of the window function, SQL can also implement this example and the above example, but it is very difficult to understand the wording.

Interval merging is also a common sequential grouping operation. There are fields in the interval table T with events: S (Start time), E (end time), and now to remove the overlapping portions of these intervals and then calculate the total length of time that the event actually occurred.

A

1

$select s,e from T order by S

2

=a1.select (E>max (e{:-1}))

Remove the included entries

3

=a2.run (Max (s,e[-1]): S)

Removing overlapping time periods

4

=a2.sum ([email protected] (max (s,e[-1]), E))

Calculate the total length of time

5

=a2.run (if (s<e[-1],s[-1],s): S) [email protected] (S;~.M (-1). E:E)

Merge time periods with overlapping

This gives a variety of target processing methods, taking full advantage of the characteristics of the cross-row operation and ordered grouping. SQL to implement this kind of operation simple with the window function has not been done, need to use a difficult to understand recursive query.

Location Access

For ordered collections, sometimes we need to access the members directly with an ordinal number. SQL uses a mathematically unordered collection concept to generate sequence numbers and then use conditional filtering to access members at a specified location, which can cause a lot of trouble for many operations.

The collector uses an ordered set mechanism that allows access to members directly with an ordinal, which is much more convenient.

For example, in the economic statistics commonly used to find the median in a number of prices:

A

1

[Email protected] ("Select price from the T order by Price")

2

=A1 ([(A1.len () +1) \2,a1.len () \2+1]). AVG ()

Locations can also be used for grouping. The event table structure is: Sequence number, moment, action, the action has the beginning, the end two kinds, now to count the total duration of the event, that is, the sum of the time between each pair of start and end.

A

1

[Email protected] ("Select Moment from event table order by moment")

2

=a1.group ((#-1) \2). sum ([email protected] (~ (1), ~ (2))

#表示记录序号, Group (#-1) \2 the data into a group of two, and then totals for each group of compute hours.

Adjacent cross-row references can also be made based on location. The structure of the stock price table is as follows: Trading day, closing price, and now the trading day and the daily increase of the stock price over $100.

A

1

=db.query ("SELECT * from stock list ORDER by trading day")

2

[Email protected] (closing price >100). Select (~>1)

3

=a2.new (A1 (~). Trading Day: Trading day, A1 (~). Closing price-a1 (~-1). Close Price: Gain)

The Pselect function returns the position of the member that satisfies the condition, which makes it easy to calculate the gain without having to calculate all the gains and filters in advance, as with the window function.

---restore content ends---

How the Collector optimizes SQL Computing (3) sequence operations

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.