[Original] Proof of correctness of the scanning algorithm for finding the intersection of two sorted arrays

Source: Internet
Author: User

While reading "Introduction to Information Retrieval", I came across an implementation of this algorithm. The book uses it to show how to compute the intersection of the inverted lists (postings lists) of two terms. The pseudocode is as follows:

INTERSECT(p1, p2)
    answer ← ⟨ ⟩
    while p1 ≠ NIL and p2 ≠ NIL do
        if docID(p1) = docID(p2) then
            ADD(answer, docID(p1))
            p1 ← next(p1)
            p2 ← next(p2)
        else if docID(p1) < docID(p2) then
            p1 ← next(p1)
        else
            p2 ← next(p2)
    return answer
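
The pseudocode can be turned into runnable code fairly directly. Below is a minimal Python sketch, assuming each postings list is a plain Python list of integer docIDs sorted in ascending order without duplicates; the function name `intersect` and the sample docIDs are chosen here for illustration only.

```python
from typing import List


def intersect(p1: List[int], p2: List[int]) -> List[int]:
    """Intersect two ascending, duplicate-free lists of docIDs."""
    answer = []
    i, j = 0, 0  # indices play the role of the pointers p1 and p2
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            # Common element: record it and advance both sides.
            answer.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            # p1[i] cannot occur in p2 any more, so skip it.
            i += 1
        else:
            # p2[j] cannot occur in p1 any more, so skip it.
            j += 1
    return answer


print(intersect([1, 2, 4, 11, 31, 45, 173, 174], [2, 31, 54, 101]))  # [2, 31]
```

Each iteration advances at least one index, so the loop runs at most len(p1) + len(p2) times.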

At first glance, this code resembles the merge step of merge sort, except that it keeps only the elements common to both arrays. Here the arrays are assumed to be sorted in ascending order with no duplicates. The inputs could equally be linked lists; this article uses arrays as the example, and the same argument applies to linked lists.

Its correctness looks obvious, but describing it rigorously can leave you scratching your head. There is no intuitive one-sentence explanation of why it works. After thinking about it for a while, I found that some definitions are needed.

The first definition is the loop invariant:

At the beginning of each iteration of the loop, the current element A[i] of array A and the current element B[j] of array B satisfy the following conditions:

A[i] > B[1], ..., B[j-1]

B[j] > A[1], ..., A[i-1]

To make the condition well-defined at the very start, you can add a negative-infinity sentinel element to the front of each array (without sentinels, the condition holds vacuously when i = 1 or j = 1, because the ranges on the right are empty). The loop invariant can be proved by mathematical induction; the details are omitted here.
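
The invariant can also be checked mechanically. The following Python sketch is not a proof; it only asserts the two conditions at the top of every iteration and compares the result with Python's built-in set intersection on random inputs. It uses 0-based indices instead of the 1-based indices above, and the name `intersect_checked` is hypothetical.

```python
import random


def intersect_checked(a, b):
    """Two-pointer intersection that asserts the loop invariant (0-based)."""
    answer = []
    i = j = 0
    while i < len(a) and j < len(b):
        # Invariant: a[i] exceeds every element of b already passed,
        # and b[j] exceeds every element of a already passed.
        assert all(a[i] > x for x in b[:j])
        assert all(b[j] > x for x in a[:i])
        if a[i] == b[j]:
            answer.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return answer


# Randomized check on sorted, duplicate-free inputs (fixed seed for repeatability).
rng = random.Random(0)
for _ in range(200):
    a = sorted(rng.sample(range(50), rng.randint(0, 20)))
    b = sorted(rng.sample(range(50), rng.randint(0, 20)))
    assert intersect_checked(a, b) == sorted(set(a) & set(b))
```

The inductive step behind these assertions is short. If A[i] < B[j], then A[i+1] > A[i] is still larger than B[1], ..., B[j-1], and B[j] is larger than A[i] as well as A[1], ..., A[i-1]. The case A[i] > B[j] is symmetric, and the case A[i] = B[j] follows the same way because both arrays are strictly increasing.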

Next, we describe the goal of the algorithm more precisely:

For an element K of either array, the algorithm needs to find the smallest element greater than or equal to K in the other sorted array. Because the arrays are sorted, this is simply the first element greater than or equal to K encountered when traversing the other array from its first element.

Once we have found the first element of the other array that is greater than or equal to K, we know whether the other array contains an element equal to K. If that element is greater than K, the other array contains no element equal to K (this follows from the array being sorted), so K can be skipped. If the two are equal, a match has been found, and both arrays advance to their next elements to continue the search.
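
The "first element greater than or equal to K" test can be written down on its own. The sketch below uses Python's standard `bisect_left` (a binary search) in place of the linear scan, purely to isolate the membership argument; the helper names `first_at_least` and `has_match` are hypothetical.

```python
from bisect import bisect_left


def first_at_least(arr, k):
    """Index of the first element of sorted arr that is >= k, or len(arr) if none."""
    return bisect_left(arr, k)


def has_match(arr, k):
    """True iff sorted arr contains k, decided only by looking at that first element."""
    pos = first_at_least(arr, k)
    # If the first element >= k is strictly greater than k (or does not exist),
    # sortedness guarantees that k does not occur anywhere in arr.
    return pos < len(arr) and arr[pos] == k


a = [1, 2, 4, 11, 31]
b = [2, 31, 54, 101]
print([k for k in a if has_match(b, k)])  # [2, 31]
```

The scanning algorithm reaches the same decision for each K without searching from scratch, because its pointer into the other array never moves backwards.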

With this idea in mind, the backbone of the algorithm is easy to understand. The element K is not fixed: because the two arrays are completely symmetric, the relationship is mutual, so the K skipped at each step is the smaller of the two current elements. For the smaller one, the smallest element of the other array that is greater than or equal to it has already been found (it is the current element of the other array, which you can prove yourself from the loop invariant). The same cannot be said of the larger one, because later elements of the other array may still be smaller than it.
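
To see which element plays the role of K at each step, the scan can be instrumented to record its decisions. This is only an illustrative sketch; the function name `trace_intersect` and the labels "skip A", "skip B" and "match" are made up for this example.

```python
def trace_intersect(a, b):
    """Return (a[i], b[j], action) for every iteration of the scan."""
    steps = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            steps.append((a[i], b[j], "match"))
            i += 1
            j += 1
        elif a[i] < b[j]:
            # a[i] is the smaller current element, so it is the K that gets skipped.
            steps.append((a[i], b[j], "skip A"))
            i += 1
        else:
            # b[j] is the smaller current element, so it is the K that gets skipped.
            steps.append((a[i], b[j], "skip B"))
            j += 1
    return steps


for step in trace_intersect([1, 2, 4], [2, 3, 4]):
    print(step)
# (1, 2, 'skip A')
# (2, 2, 'match')
# (4, 3, 'skip B')
# (4, 4, 'match')
```

In every "skip" row the discarded value is the smaller of the pair, and the larger value stays in place to be compared against the next element of the other array.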

With that, the correctness of the algorithm is basically clear.

(The description above is kept informal for convenience; please bear with it.)
