Although the person who is proficient in SQL will not have any questions about NULL, but the whole article is still very difficult to find, see an English version, feel good.
Tony Hoare invented the null reference in 1965 and considered it a "billions of dollar mistake" he had made. Even today, 50 years later, a null value in SQL is the culprit for many common mistakes.
Let's look at the most shocking cases together.
Null does not support size/equality judgment
The following 2 queries, regardless of how many records are in the table users, return a record of 0 rows:
1234 |
select * from users where deleted_at = null ; – result: 0 rows select * from users where deleted_at != null ; – result: 0 rows |
How could it be like this? Everything just because null is a type that represents an "unknown". In other words, it is meaningless to compare null with other values using the regular comparison operator (normal conditional operators). Null is also not equal to NULL (approximate understanding: The unknown value cannot equal the unknown value, the relationship between the two is also unknown, otherwise the math and logic will be messed up).
– Note : The following SQL is suitable for MySQL, if it is Oracle, you need to add ... from dual;
12345678910 |
select null > 0;
– result:
null
select null < 0;
– result:
null
select null = 0;
– result:
null
select null =
null
;
– result:
null
select null !=
null
;
– result:
null
|
The correct way to compare a value to NULL is to use the IS keyword, and the is not operator:
123 |
select * from users where deleted_at is null ; – result: 所有被标记为删除的 users |
If you want to determine whether the values of the two columns are different, you can use the is distinct from:
123 |
select * from users where has_address is distinct from has_photo – result: 地址(address)或照片(photo)两者只有其一的用户 |
Not in and Null
Sub-query (subselect) is a convenient way to filter data. For example, if you want to query a user who does not have any packages, you can write a query such as the following:
12 |
select * from users where id not in ( select user_id from packages) |
But at this point, if the user_id of a row in the packages table is NULL, the problem is: The return result is empty! To understand why this weird thing happens, we need to understand what the SQL compiler is doing. The following is a simpler example:
12 |
select * from users where id not in (1, 2, null ) |
This SQL statement will be converted to:
12 |
select * from users where id != 1 and id != 2 and id != null |
We know that id! = NULL result is an unknown value, null. The result of an and operation of any value and null is NULL, so it is equivalent to no other condition. The reason for this result is that the logical value of NULL is not true.
If the condition is reversed, the query results are no problem. Now we query the users who have the package.
12 |
select * from users where id in ( select user_id from packages) |
Again we can use a simple example:
12 |
select * from users where id in (1, 2, null ) |
This SQL is converted to:
12 |
select * from users where id = 1 or id = 2 or id = null |
Because the WHERE clause is a string of or conditions, one of the results is null and irrelevant. The non-True (non-true) value does not affect the calculation of other parts of the clause, which is equivalent to being ignored.
Null and sort
When sorting, null values are considered to be the largest. When sorting in descending order (descending) This will make you very big, because the null value is first.
The following query is to show the rank of the user based on the score, but it ranks the user without scoring to the front!
1234 |
select name , points from users order by 2 desc ; – points 为 null 的记录排在所有记录之前! |
There are two ways to solve this kind of problem. The simplest is to use coalesce to eliminate the effect of NULL:
123456 |
– 在输出时将
null 转换为 0 :
select name
,
coalesce
(points, 0)
from users
order by 2
desc
;
– 输出时保留
null
, 但排序时转换为 0 :
select name
, points
from users
order by coalesce
(points, 0)
desc
;
|
There is also a way to support the database, specifying whether to place a null value on the front or the last face when sorting:
123 |
select name , coalesce (points, 0) from users order by 2 desc nulls last ; |
Of course, NULL can also be used to prevent errors, such as handling mathematical errors with a divisor of 0.
be removed by 0
A divisor of 0 is a very egg-painfull error. Yesterday also run a good SQL, suddenly by 0 except a sudden error. A common workaround is to use a case statement to determine if the denominator (denominator) is 0, and then divide the operation.
12 |
select case when num_users = 0 then 0 else total_sales/num_users end ; |
The way ASE statements are actually very difficult to see, and the denominator is reused. If this is a simple situation, if the denominator is a complex expression, then the tragedy comes: it's hard to read, it's hard to maintain and modify, and it's a bunch of bugs.
At this point we can look at the benefits of NULL. Use Nullif to make the denominator 0 o'clock null. This will no longer be an error, Num_users = 0 o'clock returns the result to null.
12 |
select total_sales/ nullif (num_users, 0); nullif 是将其他值转为 null , 而Oracle的 nvl 是将 null 转换为其他值。 |
If you do not want NULL, but want to convert to 0 or another number, you can use the COALESCE function on the basis of the previous SQL:
12 |
select coalesce (total_sales/ nullif (num_users, 0), 0); null 再转换回0 |
Conclusion
Tony Hoare may regret his mistake, but at least a null problem can easily be solved. Then go to practice the new big strokes, from the null dug out of the invalid Big Pit (nullifying)!
In-depth explanation of NULL in SQL