T-SQL CHECKSUM: comparing two tables for the same data

Source: Internet
Author: User

The CHECKSUM function computes a checksum, an int value, over a list of expressions. The same list of expressions always yields the same checksum, and in rare cases different expressions yield the same checksum as well. This property can be used to compare whether two tables hold the same data. If the checksums computed over two sets of columns differ, the column values differ; if the checksums are equal, the column values are almost always equal. In practice the checksum and the expression list can therefore be treated as a nearly one-to-one relationship, with occasional collisions.

    -- Drop the sample tables if they already exist
    IF OBJECT_ID('dbo.ta') IS NOT NULL
        DROP TABLE dbo.ta;
    IF OBJECT_ID('dbo.tb') IS NOT NULL
        DROP TABLE dbo.tb;

    CREATE TABLE dbo.ta (c1 int, c2 varchar(10));
    CREATE TABLE dbo.tb (b1 int, b2 varchar(10));

    -- Sample rows; dbo.tb has one extra row.
    -- The keys 11 to 15 are garbled in the source and are assumed
    -- here to follow the 'A1'..'A5' labels.
    INSERT INTO dbo.ta
    VALUES (1, 'a'), (10, 'A0'), (11, 'A1'), (12, 'A2'), (13, 'A3'), (14, 'A4');
    INSERT INTO dbo.tb
    VALUES (1, 'a'), (10, 'A0'), (11, 'A1'), (12, 'A2'), (13, 'A3'), (14, 'A4'), (15, 'A5');

    -- Use a JOIN clause to compare
    SELECT *
    FROM dbo.ta a
    LEFT JOIN dbo.tb b ON CHECKSUM(c1, c2) = CHECKSUM(b1, b2)
    WHERE b.b1 IS NULL;

    SELECT *
    FROM dbo.ta a
    RIGHT JOIN dbo.tb b ON CHECKSUM(c1, c2) = CHECKSUM(b1, b2)
    WHERE a.c1 IS NULL;

    -- Use an EXCEPT clause to compare
    SELECT CHECKSUM(c1, c2) FROM dbo.ta
    EXCEPT
    SELECT CHECKSUM(b1, b2) FROM dbo.tb;

    -- The source reports 1779172094 as the result of this query
    SELECT CHECKSUM(b1, b2) FROM dbo.tb
    EXCEPT
    SELECT CHECKSUM(c1, c2) FROM dbo.ta;

    -- Use the value to query
    SELECT *
    FROM dbo.tb
    WHERE CHECKSUM(b1, b2) = 1779172094;
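Because two different rows can occasionally share a checksum, a comparison based on CHECKSUM alone can miss a difference. The following is a minimal sketch of one way to guard against that, not part of the original article: it uses hypothetical table variables @ta and @tb with made-up rows in place of the real tables, and confirms every checksum match by also comparing the column values.

```sql
-- Hypothetical stand-ins for the two tables being compared.
DECLARE @ta TABLE (c1 int, c2 varchar(10));
DECLARE @tb TABLE (b1 int, b2 varchar(10));
INSERT INTO @ta VALUES (1, 'a'), (2, 'b');
INSERT INTO @tb VALUES (1, 'a'), (2, 'b'), (3, 'c');

-- Rows of @tb with no exact counterpart in @ta.
-- The checksum predicate narrows the candidates; the column predicates
-- reject false matches caused by checksum collisions.
-- Note: "=" never matches NULL to NULL, so rows containing NULLs
-- are reported as different by this query.
SELECT b.b1, b.b2
FROM @tb AS b
WHERE NOT EXISTS (
    SELECT 1
    FROM @ta AS a
    WHERE CHECKSUM(a.c1, a.c2) = CHECKSUM(b.b1, b.b2)
      AND a.c1 = b.b1
      AND a.c2 = b.b2
);

-- The same question without checksums: EXCEPT compares the column
-- values directly and treats NULLs as equal.
SELECT b1, b2 FROM @tb
EXCEPT
SELECT c1, c2 FROM @ta;
```

With the sample rows above, both queries should return the single row (3, 'c'). The checksum predicate mainly pays off when the checksum is stored and indexed; for small tables the plain EXCEPT form is simpler.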

MSDN Comments on Checksum

1. The syntax of the CHECKSUM function is: CHECKSUM ( * | expression [ ,...n ] )

2. Remarks

CHECKSUM computes a hash value, called the checksum, over its argument list. This hash value is intended for building hash indexes. If the arguments to CHECKSUM are columns and an index is built on the computed CHECKSUM value, the result is a hash index, which can be used for equality searches on those columns.

CHECKSUM satisfies the following property of a hash function: CHECKSUM applied to any two lists of expressions returns the same value if the corresponding elements of the two lists have the same type and compare as equal with the equals (=) operator. For this definition, NULL values of a given type are considered equal. If one of the values in the expression list changes, the checksum of the list usually changes too. In rare cases, however, the checksum stays the same. For this reason, using CHECKSUM to detect whether values have changed is not recommended unless the application can tolerate occasionally missing a change. Consider using HASHBYTES instead. With the MD5 hashing algorithm, the probability that HASHBYTES returns the same result for two different inputs is much lower than for CHECKSUM.
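As an illustration of the HASHBYTES alternative, here is a minimal sketch that computes both a CHECKSUM and an MD5 hash per row. The table variable @t, its rows, and the '|' delimiter are assumptions made for this example; CONCAT requires SQL Server 2012 or later, and on recent versions 'SHA2_256' can be passed in place of 'MD5'.

```sql
-- Hypothetical sample rows.
DECLARE @t TABLE (c1 int, c2 varchar(10));
INSERT INTO @t VALUES (1, 'a'), (10, 'A0');

SELECT c1,
       c2,
       CHECKSUM(c1, c2) AS row_checksum,              -- int, collisions possible
       -- HASHBYTES takes a single input, so the columns are concatenated.
       -- The delimiter keeps (1, '23') and (12, '3') from hashing alike.
       HASHBYTES('MD5', CONCAT(c1, '|', c2)) AS row_hash  -- varbinary, 16 bytes for MD5
FROM @t;
```

One caveat of this sketch: CONCAT turns NULL into an empty string, so a NULL and an empty string in the same column produce the same hash unless they are encoded differently before hashing.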

The order of the expressions affects the CHECKSUM result. The column order used for CHECKSUM(*) is the column order specified in the table or view definition, including computed columns.
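The effect of expression order can be observed with a short query. This is a sketch over a hypothetical table variable @t; it shows the same two columns passed in both orders, alongside CHECKSUM(*).

```sql
-- Hypothetical one-row table.
DECLARE @t TABLE (c1 int, c2 int);
INSERT INTO @t VALUES (1, 2);

SELECT CHECKSUM(c1, c2) AS c1_then_c2,
       CHECKSUM(c2, c1) AS c2_then_c1,   -- same values, different order
       CHECKSUM(*)      AS all_columns   -- uses the definition order: c1, c2
FROM @t;
```

The first two values will typically differ, while all_columns should match c1_then_c2 because CHECKSUM(*) follows the column order of the table definition.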

The CHECKSUM value depends on the collation: the same value stored under different collations returns a different CHECKSUM value.
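The collation dependence can be probed with literals, as in the sketch below. It assumes the collations Latin1_General_CI_AS (case-insensitive) and Latin1_General_CS_AS (case-sensitive) are available on the server.

```sql
SELECT CHECKSUM(N'abc' COLLATE Latin1_General_CI_AS) AS ci_lower,
       CHECKSUM(N'ABC' COLLATE Latin1_General_CI_AS) AS ci_upper,
       CHECKSUM(N'abc' COLLATE Latin1_General_CS_AS) AS cs_lower,
       CHECKSUM(N'ABC' COLLATE Latin1_General_CS_AS) AS cs_upper;
```

Under the case-insensitive collation the two strings compare as equal, so ci_lower and ci_upper should be identical. Under the case-sensitive collation the strings are different values, so cs_lower and cs_upper will usually differ.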

3. Using CHECKSUM to build a hash index: add a computed checksum column to the table being indexed, then create an index on that checksum column.

    -- Create a checksum index
    ALTER TABLE Production.Product
    ADD cs_Pname AS CHECKSUM(Name);
    GO
    CREATE INDEX Pname_index ON Production.Product (cs_Pname);
    GO

The checksum index can be used as a hash index, and it speeds up indexing in particular when the column to be indexed is a long character column. The checksum index can be used for equality searches.

    /* Use the index in a SELECT query. Add a second search condition
       to catch stray cases where checksums match, but the values are
       not the same. */
    SELECT *
    FROM Production.Product
    WHERE CHECKSUM(N'Bearing Ball') = cs_Pname
      AND Name = N'Bearing Ball';
    GO

Creating the index on the computed column materializes the checksum column, and any change to the product name (the Name column) is propagated to the checksum column. Alternatively, an index could be built directly on the column itself. However, if the key values are long, a regular index is not likely to perform as well as a checksum index.
