Generally speaking, SQL is a set-oriented language with no inherent order. If you want to do order-dependent processing, such as comparing each element of a sequence with the ones before and after it, you normally have to use a cursor. In some cases, though, ordered information can be handled without cursors by another method: the unequal join (also called a non-equi join, a join whose condition is an inequality). Let's look at an example.
A while ago, CSDN user BuildIT wrote to me and we discussed the following question. Take this table, HISTORY:
CREATE TABLE [HISTORY] (
[Thedate] [DATETIME] NULL,
[Quantity] [INT] NULL
) ON [PRIMARY]
It stores a series of historical data, such as:
INSERT HISTORY VALUES ('2002-01-01 00:00:00.0', 11)
GO
INSERT HISTORY VALUES ('2002-01-02 00:00:00.0', 34)
GO
INSERT HISTORY VALUES ('2002-01-03 00:00:00.0', 27)
GO
INSERT HISTORY VALUES ('2002-01-04 00:00:00.0', 43)
GO
Now we want to query the running total from the start date up to each date. In other words, we want a result set like this:
Thedate Quantity Q_sum
2002-01-01 00:00:00.0 11 11
2002-01-02 00:00:00.0 34 45
2002-01-03 00:00:00.0 27 72
2002-01-04 00:00:00.0 43 115
Intuitively, we could open a cursor on SELECT * FROM HISTORY ORDER BY Thedate and, starting from the first row, add each quantity to a running sum. What about a different idea? Suppose we build a result set in which each date is paired with its own record and with every record before it. Then we can group by that date and sum the quantities. That is exactly an unequal join. My original version was wrong; the following is the final statement as corrected by BuildIT:
SELECT L.Thedate,
L.Quantity,
SUM(R.Quantity) AS Q_sum
FROM HISTORY L
JOIN HISTORY R
ON L.Thedate >= R.Thedate
GROUP BY L.Thedate, L.Quantity
ORDER BY L.Thedate
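To check the statement against the expected result set, here is a minimal sketch that runs it on an in-memory SQLite database through Python's standard sqlite3 module. SQLite stands in for SQL Server here, and storing the dates as TEXT is an assumption made only for this sketch; the join itself is the same.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the SQL Server table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE HISTORY (Thedate TEXT, Quantity INTEGER)")
conn.executemany(
    "INSERT INTO HISTORY VALUES (?, ?)",
    [
        ("2002-01-01", 11),
        ("2002-01-02", 34),
        ("2002-01-03", 27),
        ("2002-01-04", 43),
    ],
)

# Each row of L is paired with itself and every earlier row of R,
# then the paired quantities are summed per L row.
running_totals = conn.execute(
    """
    SELECT L.Thedate, L.Quantity, SUM(R.Quantity) AS Q_sum
    FROM HISTORY L
    JOIN HISTORY R ON L.Thedate >= R.Thedate
    GROUP BY L.Thedate, L.Quantity
    ORDER BY L.Thedate
    """
).fetchall()

for row in running_totals:
    print(row)
# ('2002-01-01', 11, 11)
# ('2002-01-02', 34, 45)
# ('2002-01-03', 27, 72)
# ('2002-01-04', 43, 115)
```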
An unequal join is not a one-to-one correspondence; which rows pair up depends directly on their order. That is why we can use it for order-dependent operations. Here is a more natural example:
SELECT L.I, SUM(R.I)
FROM N L
JOIN N R
ON L.I >= R.I
GROUP BY L.I
Table N has a single integer column I, which holds the natural numbers in sequence. So there is no mystery: this is the summation of the natural numbers. Here SUM(R.I) is the cumulative sum of the sequence from 0 to I, which is simpler than the previous problem. But this is obviously not where unequal joins shine, because the join builds a huge triangular data set, like the following:
1 1
2 1
2 2
3 1
3 2
3 3
...
When I ran this query on a table of 16-bit integers, my Athlon XP 1700+ machine with 256 MB of DDR memory worked for nearly 30 minutes, and as I was writing this text it returned a data overflow error. Obviously, even a table of 16-bit integers is too large for this query. My advice is to use an unequal join only when the result cannot be expressed by a formula. For a cumulative sum like this one we already have a well-known formula, so why make the computer grind through it the stupid way? Use the following statement:
SELECT I, ((1 + I) * I) / 2
FROM N
Compared with the honest accumulation, this is extraordinarily fast: it hit the data overflow in less than a second. But the computer could never have thought of this method by itself, alas...
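The size of the "triangle" can be made concrete with a small sketch. It assumes that N holds the integers 1 through n, and it again uses SQLite through Python's sqlite3 module as a stand-in for SQL Server; the helper triangle_rows is a name invented for this illustration. Under that assumption the join produces n(n+1)/2 rows, and the grouped join and the closed formula return the same sums.

```python
import sqlite3


def triangle_rows(n):
    """Rows produced by joining 1..n to itself on L.I >= R.I."""
    return n * (n + 1) // 2


n = 100  # small on purpose; the join grows quadratically
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE N (I INTEGER)")
conn.executemany("INSERT INTO N VALUES (?)", [(i,) for i in range(1, n + 1)])

# Number of rows the unequal join materializes before grouping.
joined_rows = conn.execute(
    "SELECT COUNT(*) FROM N L JOIN N R ON L.I >= R.I"
).fetchone()[0]

# Cumulative sums computed the slow way (join) and the fast way (formula).
by_join = conn.execute(
    "SELECT L.I, SUM(R.I) FROM N L JOIN N R ON L.I >= R.I "
    "GROUP BY L.I ORDER BY L.I"
).fetchall()
by_formula = conn.execute(
    "SELECT I, ((1 + I) * I) / 2 FROM N ORDER BY I"
).fetchall()

print(joined_rows, triangle_rows(n))  # 5050 5050
print(by_join == by_formula)          # True
```

By the same arithmetic, triangle_rows(32767) is 536854528, so a table holding just the positive 16-bit integers would already force the join to build more than half a billion rows.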
Legend has it that when Gauss, the great mathematician, was in elementary school, his teacher tested him with exactly this problem. As a result, almost every Chinese schoolchild has been tormented by a teacher with it. It seems the teachers' purpose was to tell us that our IQ is no match for Gauss's. But I never wanted to compare myself with anyone...
In college, the teacher of our first course, Mathematical Analysis, said that the computer is a fool. At the time I just thought it was funny. Today I see that, when it comes to summation, the computer is at my primary-school level and will never catch up with Gauss in primary school.
However, this technique is useful where a formula is hard to write down. For example, a friend of mine wrote a prime sieve with an unequal join, which is interesting. It will not be more efficient than procedural code, but it expresses the essence of the sieve method; perhaps when we study number theory we could use this kind of SQL-style notation. That friend taught me a lot about computer science, and out of respect for him I will not copy his code here. The statement itself is not complex, though: I believe that once you think of using a join you will be able to write it out, so give it a try if you are interested. The technique can also produce some other sequences, a few of which we will discuss later.
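For readers who want a starting point, here is one possible way to phrase the idea. This is my own illustrative sketch, not the friend's statement mentioned above, and it is closer to trial division than to a true sieve: a number is kept when no smaller R.I greater than 1 divides it, and the inequality R.I * R.I <= L.I keeps the triangle small. The function name primes_up_to and the use of SQLite through Python's sqlite3 module are assumptions of the sketch.

```python
import sqlite3


def primes_up_to(limit):
    """Return the primes in 1..limit using a LEFT JOIN with an inequality."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE N (I INTEGER)")
    conn.executemany(
        "INSERT INTO N VALUES (?)", [(i,) for i in range(1, limit + 1)]
    )
    rows = conn.execute(
        """
        SELECT L.I
        FROM N L
        LEFT JOIN N R
          ON R.I > 1                -- skip the trivial divisor 1
         AND R.I * R.I <= L.I       -- only candidates up to sqrt(L.I)
         AND L.I % R.I = 0          -- R.I divides L.I
        WHERE L.I > 1
          AND R.I IS NULL           -- no divisor found: L.I is prime
        ORDER BY L.I
        """
    ).fetchall()
    conn.close()
    return [i for (i,) in rows]


print(primes_up_to(30))
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```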
Another use of an unequal join is to generate an ordinal column, for example:
SELECT COUNT(L.afield) AS ID,
L.afield
FROM MYTABLE L
JOIN MYTABLE R
-- >= pairs each row with itself, so no row is dropped and numbering starts at 1
ON L.afield >= R.afield
GROUP BY L.afield
The afield column can be a string, a date, or of course a number; anything that can be sorted will do. This trick is more of a curiosity, though: once the amount of data gets large it stops being fun, and in general the physical row number is the better choice. That is not standard SQL, but it is practical. I have seen this example in MCDBA review material (it is said to have appeared on the exam), but my friend worked it out by himself, and you may well have arrived at it independently too.
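A small sketch of the numbering trick on string values, again using SQLite through Python's sqlite3 module as a stand-in and three made-up rows. It runs the join both with >= and with a strict >, to show that only the non-strict comparison keeps the smallest value and starts the numbering at 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MYTABLE (afield TEXT)")
conn.executemany(
    "INSERT INTO MYTABLE VALUES (?)", [("pear",), ("apple",), ("fig",)]
)

# Non-strict comparison: every row matches itself plus all smaller rows,
# so the count is the row's 1-based position in sort order.
numbered = conn.execute(
    """
    SELECT COUNT(R.afield) AS ID, L.afield
    FROM MYTABLE L
    JOIN MYTABLE R ON L.afield >= R.afield
    GROUP BY L.afield
    ORDER BY ID
    """
).fetchall()

# Strict comparison: the smallest row matches nothing and drops out
# of the inner join, and the remaining IDs are shifted down by one.
numbered_strict = conn.execute(
    """
    SELECT COUNT(R.afield) AS ID, L.afield
    FROM MYTABLE L
    JOIN MYTABLE R ON L.afield > R.afield
    GROUP BY L.afield
    ORDER BY ID
    """
).fetchall()

print(numbered)         # [(1, 'apple'), (2, 'fig'), (3, 'pear')]
print(numbered_strict)  # [(1, 'fig'), (2, 'pear')]
```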
The order-dependent behavior of an unequal join clearly comes from the joined column being sortable and its values being mutually distinct. So it is best not to do unequal joins on columns with duplicate values (in fact, it is best not to join on columns with duplicate values at all unless you are quite sure what you are doing). The data explosion of an equi-join is frightening enough; if an unequal join blows up... heh heh...
Imagine a pair of duplicate values in an equi-join: they produce two pairs of matches instead of one. With an unequal join, the effect also depends on where the duplicates sit. Because the result is a triangle, duplicates that appear near the top do little harm, but if they appear in the lower part of the triangle...
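To see what a single duplicate does, here is a small sketch comparing the cumulative-sum join on the values 1, 2, 3 with the same join on 1, 2, 2, 3. The helper cumulative_join is a name made up for this illustration, and SQLite through Python's sqlite3 module is again only a stand-in.

```python
import sqlite3


def cumulative_join(values):
    """Run the L.I >= R.I join on the given values.

    Returns (number of joined rows, grouped sums per L.I).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE N (I INTEGER)")
    conn.executemany("INSERT INTO N VALUES (?)", [(v,) for v in values])
    row_count = conn.execute(
        "SELECT COUNT(*) FROM N L JOIN N R ON L.I >= R.I"
    ).fetchone()[0]
    sums = conn.execute(
        "SELECT L.I, SUM(R.I) FROM N L JOIN N R ON L.I >= R.I "
        "GROUP BY L.I ORDER BY L.I"
    ).fetchall()
    conn.close()
    return row_count, sums


distinct_count, distinct_sums = cumulative_join([1, 2, 3])
dup_count, dup_sums = cumulative_join([1, 2, 2, 3])

print(distinct_count, distinct_sums)  # 6 [(1, 1), (2, 3), (3, 6)]
print(dup_count, dup_sums)            # 11 [(1, 1), (2, 10), (3, 8)]
```

One extra row nearly doubles the joined rows in this tiny case, and the group for the duplicated value is counted twice over: both copies of 2 on the left match both copies on the right, so its "cumulative sum" comes out as 10.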
An unequal join is clearly a powerful tool, but it is also one of the shortcuts into trouble. Here are a few suggestions from my own experience:
If the join generates a large "triangle", don't use it; try a subquery or even a cursor;
If the result set you need is smaller than the original table, filter out the useless data first as far as possible;
Using an unequal join for sequence calculations expresses the intent very clearly (because it is not procedural), but it usually has no advantage in efficiency, so it is fine for playing around, but think carefully before using it for real work;
Also, do not casually use unequal joins inside multi-table joins, or they may have a leverage effect on the number of rows.
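As a sketch of the first suggestion, the running total from the HISTORY example can also be written as a correlated subquery, or as a single ordered pass that accumulates as it goes, which is essentially what a cursor does. The code below uses SQLite through Python's sqlite3 module as a stand-in for SQL Server, with dates stored as TEXT (both assumptions of this sketch). The subquery is not claimed to be faster in every engine; the single pass is the variant that touches each row only once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE HISTORY (Thedate TEXT, Quantity INTEGER)")
conn.executemany(
    "INSERT INTO HISTORY VALUES (?, ?)",
    [
        ("2002-01-01", 11),
        ("2002-01-02", 34),
        ("2002-01-03", 27),
        ("2002-01-04", 43),
    ],
)

# Variant 1: correlated subquery instead of a self-join with GROUP BY.
by_subquery = conn.execute(
    """
    SELECT H.Thedate,
           H.Quantity,
           (SELECT SUM(P.Quantity)
              FROM HISTORY P
             WHERE P.Thedate <= H.Thedate) AS Q_sum
    FROM HISTORY H
    ORDER BY H.Thedate
    """
).fetchall()

# Variant 2: cursor-style single pass over the ordered rows.
by_single_pass = []
total = 0
for thedate, quantity in conn.execute(
    "SELECT Thedate, Quantity FROM HISTORY ORDER BY Thedate"
):
    total += quantity
    by_single_pass.append((thedate, quantity, total))

print(by_subquery == by_single_pass)  # True
print(by_single_pass[-1])             # ('2002-01-04', 43, 115)
```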
I wish you all a pleasant journey in this magical world!