10 Common Mistakes Java Developers Make When Writing SQL

Last Update:2017-01-18 Source: Internet

Author: User

How well Java developers blend object-oriented thinking with imperative thinking depends on their level of:

Skill (anyone can write imperative code)
Dogma (some people follow the "pattern pattern", that is, they apply patterns everywhere and give each one a name)
Mood (in the early days, truly object-oriented code is harder to understand than imperative code)

However, when Java developers write SQL, everything changes. SQL is a declarative language that has nothing to do with either object-oriented or imperative thinking. A query is very easy to express in SQL, but it is not so easy to express it in the best or most correct way. Developers not only need to rethink their programming paradigm, they also need to think in terms of set theory.

Here are ten common mistakes that Java developers make when they write SQL through JDBC or jOOQ:

1. Forgetting about NULL

Misunderstanding NULL is probably the most common mistake Java developers make when writing SQL. One reason, though not the only one, is that NULL is also called UNKNOWN. It would be easier to understand if it were only called UNKNOWN. Another reason is that JDBC maps SQL NULL to Java null when it fetches data or binds variables. This can lead to the belief that NULL = NULL in SQL behaves the same way as null == null in Java, when in fact the SQL comparison yields UNKNOWN rather than TRUE.

One of the stranger cases of misunderstanding NULL arises when NULL predicates are applied to row value expressions.

Another, more subtle problem comes from misunderstanding the effect of NULL in NOT IN anti-joins: if the list or subquery contains a single NULL, the predicate can never be TRUE.

Solutions

Train yourself. To stay clear about the meaning of NULL, ask yourself every time you write SQL:

Is this predicate correct with respect to NULL?
Does NULL affect the result of this function?
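
The two questions above can be made concrete with a small model of SQL's three-valued logic. The following is only a sketch: the class and method names are hypothetical, and a Java null is used to stand for UNKNOWN. It shows why NULL = NULL is not TRUE and why a single NULL in a NOT IN list prevents the predicate from ever being TRUE.

```java
import java.util.List;

/** Hypothetical helper that models SQL's three-valued logic; null stands for UNKNOWN. */
class SqlNullLogic {

    /** Models the SQL predicate "a = b": UNKNOWN (null) if either side is NULL. */
    static Boolean sqlEquals(Integer a, Integer b) {
        if (a == null || b == null) {
            return null; // UNKNOWN, even for NULL = NULL
        }
        return a.equals(b);
    }

    /** Models "value NOT IN (candidates)" as NOT (value = c1 OR value = c2 OR ...). */
    static Boolean sqlNotIn(Integer value, List<Integer> candidates) {
        boolean sawUnknown = false;
        for (Integer candidate : candidates) {
            Boolean eq = sqlEquals(value, candidate);
            if (eq == null) {
                sawUnknown = true;
            } else if (eq) {
                return Boolean.FALSE; // a definite match makes NOT IN false
            }
        }
        // No match but at least one UNKNOWN comparison: the predicate is UNKNOWN,
        // and a WHERE clause treats UNKNOWN like FALSE and drops the row.
        return sawUnknown ? null : Boolean.TRUE;
    }
}
```

In this model, sqlNotIn(3, [1, 2, NULL]) is UNKNOWN rather than TRUE, which is why a NOT IN subquery that returns even one NULL silently produces an empty result.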

2. Processing data in Java memory

Many Java developers know only the basics of SQL. An occasional JOIN, the odd UNION, no problem. But what about window functions or grouping sets? Many Java developers load SQL data into memory, convert it into some suitable collection type, and run tedious calculations on that collection with verbose loops (at least before Java 8 improved the collections API).

Some SQL databases, however, support advanced OLAP features, many of them part of the SQL standard, that tend to perform much better and are much easier to write. A non-standard example is Oracle's MODEL clause. Simply let the database do the processing and fetch only the results into Java memory. After all, some very smart people have optimized these expensive products. By moving OLAP work into the database, you gain two things:

Simplicity. It is probably easier to write correct code in SQL than in Java.
Performance. The database will probably be faster than your algorithm. More importantly, you don't have to transfer millions of records over the network.

Solutions

Every time you implement a data-centric algorithm in Java, try asking yourself: Is there a way to get the database to do the work and just deliver the results to me?

3. Using UNION instead of UNION ALL

It is a pity that UNION ALL needs an extra keyword compared with UNION. It would have been much better if the SQL standard had defined:

UNION (allowing duplicates)
UNION DISTINCT (removing duplicates)

Removing duplicates is rarely needed (and is sometimes even wrong). It is also often slow for large result sets with many columns, because the two subqueries need to be sorted and each tuple needs to be compared with the tuples that follow it.

Note that although the SQL standard also specifies INTERSECT ALL and EXCEPT ALL, few databases implement these set operations.

Solutions

Every time you write a UNION, consider whether you actually meant to write UNION ALL.
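
To make the difference tangible, here is a small in-memory model of the two operators. It is only an illustration of the semantics, not a suggestion to combine result sets in Java: the class and method names are hypothetical, and a real database makes no promise about row order without ORDER BY.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical model of UNION ALL versus UNION on single-column results. */
class UnionSemantics {

    /** UNION ALL: simply append the second result to the first, keeping duplicates. */
    static <T> List<T> unionAll(List<T> left, List<T> right) {
        List<T> result = new ArrayList<>(left);
        result.addAll(right);
        return result;
    }

    /** UNION: the same rows, plus an extra pass that removes duplicates. */
    static <T> List<T> union(List<T> left, List<T> right) {
        Set<T> distinct = new LinkedHashSet<>(unionAll(left, right));
        return new ArrayList<>(distinct);
    }
}
```

The extra deduplication pass in union is the work the database has to do for every UNION, whether or not duplicates can occur at all.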

4. Using JDBC paging to paginate large result sets

Most databases support some way of paginating ordered results through LIMIT .. OFFSET, TOP .. START AT, or OFFSET .. FETCH clauses. Where these clauses are not supported, you can still filter on ROWNUM (Oracle) or ROW_NUMBER() OVER() (DB2, SQL Server 2008 and earlier), which is much faster than paging in memory. The difference is even more noticeable for large data sets.

Solutions

Simply use those clauses, or use a tool (such as jOOQ) that can emulate them for you.
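
A minimal JDBC sketch of database-side paging might look like the following. The table book and its title column are made up for illustration, and the OFFSET .. FETCH form is an assumption about the target database; other dialects would need LIMIT .. OFFSET or one of the alternatives mentioned above.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical example: let the database return only the requested page. */
class PagedQuery {

    // Assumed table and column names. Paging needs a stable ORDER BY.
    static final String PAGE_SQL =
        "SELECT title FROM book ORDER BY title "
      + "OFFSET ? ROWS FETCH NEXT ? ROWS ONLY";

    /** Number of rows to skip for a 1-based page number. */
    static long offsetFor(int pageNumber, int pageSize) {
        if (pageNumber < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNumber and pageSize must be positive");
        }
        return (long) (pageNumber - 1) * pageSize;
    }

    /** Fetches one page of titles; only pageSize rows cross the network. */
    static List<String> fetchPage(Connection connection, int pageNumber, int pageSize)
            throws SQLException {
        List<String> titles = new ArrayList<>();
        try (PreparedStatement stmt = connection.prepareStatement(PAGE_SQL)) {
            stmt.setLong(1, offsetFor(pageNumber, pageSize));
            stmt.setInt(2, pageSize);
            try (ResultSet rs = stmt.executeQuery()) {
                while (rs.next()) {
                    titles.add(rs.getString("title"));
                }
            }
        }
        return titles;
    }
}
```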

5. Joining data in Java memory

Ever since the early days of SQL, some developers have felt uneasy about expressing JOINs in SQL. There is an inherent fear that JOINs are slow. That can be true if a cost-based optimizer chooses a nested loop and possibly loads complete tables into database memory before creating the joined table source. But that rarely happens. With appropriate predicates, constraints, and indexes, MERGE JOIN and HASH JOIN operations are very fast. It all comes down to correct metadata (Tom Kyte cannot be cited often enough on this). Nevertheless, there are probably still quite a few Java developers who load two tables into maps through separate queries and join them in Java memory in one way or another.

Solutions

If you are selecting from several tables in several steps, think carefully about whether you can express the query in a single statement.

6. Using DISTINCT or UNION to remove duplicates from an accidental Cartesian product

With heavy joining, it is easy to lose track of all the relationships involved in a SQL statement. In particular, when multi-column foreign keys are involved, it is easy to forget one of the predicates in a JOIN .. ON clause. This may produce duplicate records, but perhaps only in exceptional cases. Some developers then choose to use DISTINCT to remove those duplicates again. This is wrong in three ways:

It may treat the symptoms, but it does not fix the cause. In some edge cases it does not even treat the symptoms.
It is slow for large result sets with many columns. DISTINCT typically needs a sort (or hash) operation to remove duplicates.
It is slow for large Cartesian products, because a lot of data still has to be loaded into memory.

Solutions

As a rule of thumb, when you get unwanted duplicates, check your JOIN predicates first. There is probably a subtle Cartesian product hiding somewhere.

7. Not using the MERGE statement

Strictly speaking, this is not a real mistake. It is more likely a lack of awareness, or some fear, of the powerful MERGE statement. Some databases offer other forms of UPSERT statements, such as MySQL's ON DUPLICATE KEY UPDATE clause. But MERGE is really powerful, most of all in databases that extend the SQL standard heavily, such as SQL Server.

Solutions

If you implement an UPSERT by chaining INSERT and UPDATE, or by chaining SELECT .. FOR UPDATE followed by INSERT or UPDATE, think again. Apart from the risk of race conditions, you may be able to achieve the same goal with a single, simpler MERGE statement.
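
As a rough sketch, a single-row UPSERT through MERGE could be issued from JDBC as shown below. The customer table and its columns are invented for the example, and the statement follows SQL Server's MERGE dialect as an assumption; the USING clause in particular is written differently in Oracle, DB2, and other databases, so check your vendor's documentation before adapting it.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

/** Hypothetical example: one MERGE statement instead of SELECT + INSERT/UPDATE. */
class MergeUpsert {

    // Assumed table customer(id, name); SQL Server-style syntax,
    // which requires the terminating semicolon.
    static final String MERGE_SQL =
        "MERGE INTO customer AS t "
      + "USING (VALUES (?, ?)) AS s (id, name) "
      + "ON t.id = s.id "
      + "WHEN MATCHED THEN UPDATE SET t.name = s.name "
      + "WHEN NOT MATCHED THEN INSERT (id, name) VALUES (s.id, s.name);";

    /** Inserts the customer or updates its name; returns the affected row count. */
    static int upsert(Connection connection, int id, String name) throws SQLException {
        try (PreparedStatement stmt = connection.prepareStatement(MERGE_SQL)) {
            stmt.setInt(1, id);
            stmt.setString(2, name);
            return stmt.executeUpdate();
        }
    }
}
```

Because the existence check and the write happen in one statement, there is no window between a SELECT and a later INSERT in which another transaction could slip in.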

8. Using aggregate functions instead of window functions

Before window functions were introduced, the only way to aggregate data in SQL was a GROUP BY clause combined with aggregate functions in the projection. This works well in many cases, and if the aggregated data needs to be enriched with regular data, the grouped query can be pushed down into a joined subquery.

However, SQL:2003 defined window functions, and many mainstream database vendors now implement them. Window functions can aggregate data in result sets that are not grouped. In fact, each window function supports its own independent PARTITION BY clause, which makes window functions a very useful tool for reporting applications.

Using window functions will:

Lead to more readable SQL (fewer dedicated GROUP BY clauses in subqueries)
Improve performance, because an RDBMS is likely to optimize window functions more easily

Solutions

When you write a GROUP BY clause in a subquery, think carefully about whether a window function could do the job instead.
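
For illustration, the sketch below reads each payment together with the total for its account in a single pass, using SUM(..) OVER (PARTITION BY ..) instead of joining against a grouped subquery. The payment table and its columns are assumptions made up for this example.

```java
import java.math.BigDecimal;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical example: per-row detail plus a per-account total, no GROUP BY subquery. */
class WindowTotals {

    // Assumed table payment(account_id, amount).
    static final String WINDOW_SQL =
        "SELECT account_id, amount, "
      + "SUM(amount) OVER (PARTITION BY account_id) AS account_total "
      + "FROM payment ORDER BY account_id";

    /** Returns one line per payment, e.g. "42: 10.00 of 35.00". */
    static List<String> describePayments(Connection connection) throws SQLException {
        List<String> lines = new ArrayList<>();
        try (PreparedStatement stmt = connection.prepareStatement(WINDOW_SQL);
             ResultSet rs = stmt.executeQuery()) {
            while (rs.next()) {
                BigDecimal amount = rs.getBigDecimal("amount");
                BigDecimal total = rs.getBigDecimal("account_total");
                lines.add(rs.getLong("account_id") + ": " + amount + " of " + total);
            }
        }
        return lines;
    }
}
```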

9. Using in-memory sorting for indirect ordering

The SQL ORDER BY clause supports many kinds of expressions, including CASE expressions, which are very useful for indirect ordering. You should rarely need to sort data in Java memory just because you think that:

SQL sorting is too slow
SQL sorting cannot do it

Solutions

If you are sorting any SQL data in memory, think about whether you can push the sort into the database. The reasoning is the same as for moving pagination into the database.
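
As a sketch of indirect ordering, the query below ranks rows by a business priority that does not match the alphabetical order of the status values. The ticket table, its columns, and the status names are all hypothetical.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical example: custom sort order expressed with CASE in ORDER BY. */
class IndirectOrdering {

    // Assumed table ticket(id, status). URGENT first, then OPEN, then the rest.
    static final String ORDERED_SQL =
        "SELECT id FROM ticket "
      + "ORDER BY CASE status "
      + "WHEN 'URGENT' THEN 0 "
      + "WHEN 'OPEN' THEN 1 "
      + "ELSE 2 END, id";

    /** Returns ticket ids already sorted by the database. */
    static List<Long> orderedTicketIds(Connection connection) throws SQLException {
        List<Long> ids = new ArrayList<>();
        try (PreparedStatement stmt = connection.prepareStatement(ORDERED_SQL);
             ResultSet rs = stmt.executeQuery()) {
            while (rs.next()) {
                ids.add(rs.getLong("id"));
            }
        }
        return ids;
    }
}
```

Sorting in the database also keeps the order compatible with database-side paging, which an in-memory Comparator cannot offer.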

10. Inserting large numbers of records one by one

JDBC supports batching, and you should use it. When you face thousands of records, do not insert them one by one, creating a new PreparedStatement each time. If all the records go into the same table, create a batch INSERT from a single SQL statement and multiple sets of bind values. Depending on your database and its configuration, you may need to commit after a certain number of inserted records to keep the undo log from growing too large.

Solutions

Always batch-insert large data sets.
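
A minimal sketch of such a batch insert with plain JDBC follows. The book table is hypothetical, the batch size is left to the caller, and the code assumes that auto-commit has already been switched off on the connection so that the intermediate commits are valid.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

/** Hypothetical example: one PreparedStatement, many bind value sets. */
class BatchInsert {

    // Assumed table book(title).
    static final String INSERT_SQL = "INSERT INTO book (title) VALUES (?)";

    /**
     * Inserts all titles in chunks of batchSize and commits after each chunk.
     * Assumes connection.setAutoCommit(false) was called by the caller.
     * Returns the number of batches sent to the database.
     */
    static int insertAll(Connection connection, List<String> titles, int batchSize)
            throws SQLException {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batchesSent = 0;
        int pending = 0;
        try (PreparedStatement stmt = connection.prepareStatement(INSERT_SQL)) {
            for (String title : titles) {
                stmt.setString(1, title);
                stmt.addBatch();
                pending++;
                if (pending == batchSize) {
                    stmt.executeBatch();
                    connection.commit(); // keeps the undo log small
                    batchesSent++;
                    pending = 0;
                }
            }
            if (pending > 0) { // flush the final, partially filled batch
                stmt.executeBatch();
                connection.commit();
                batchesSent++;
            }
        }
        return batchesSent;
    }
}
```

With five titles and a batch size of two, this sends three batches instead of five separate statements. A suitable batch size depends on the driver and the database, so it is worth measuring.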

Those are ten common mistakes Java developers make when writing SQL. Now that you have a general picture of them, watch out for them in your own code.
