Uniq method of JavaScript Array

Source: Internet
Author: User
Tags: javascript, array

Add a prototype method, uniq, to the native Array object. Its purpose is to remove duplicate entries from the array (a value may occur more than once). The return value is a new array containing the removed duplicates.

Formal description:
Input
Array (size = N)
Output
Array1 = the subset of Array with duplicates removed, order preserved.
Non-repetition means that for any two elements A and B of Array1, A != B.
Order preservation means that if the index of A in Array is smaller than the index of B in Array, then the index of A in Array1 is smaller than the index of B in Array1.
Array2 = Array - Array1, order preserved.
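As a concrete instance of the description above, the following sketch checks the stated properties on one small input. The helper names `isDuplicateFree` and `isSubsequence` are hypothetical and exist only for this illustration.

```javascript
// Worked instance of the formal description (helper names are illustrative only).
const array = [3, 1, 3, 2, 1];
const array1 = [3, 1, 2]; // first occurrence of each value, in original order
const array2 = [3, 1];    // the removed repeats, in original order

// Non-repetition: no two positions of arr hold the same value.
function isDuplicateFree(arr) {
  return arr.every((v, i) => arr.indexOf(v) === i);
}

// Order preservation: sub can be obtained from arr by deleting elements.
function isSubsequence(sub, arr) {
  let k = 0;
  for (const v of arr) {
    if (k < sub.length && sub[k] === v) k++;
  }
  return k === sub.length;
}

console.log(isDuplicateFree(array1));                        // true
console.log(isSubsequence(array1, array));                   // true
console.log(isSubsequence(array2, array));                   // true
console.log(array1.length + array2.length === array.length); // true
```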
Realazy provides a new solution built on a clear idea: traverse the elements in order. If the value of the current element has already been seen, add it to Array2; otherwise, add it to Array1. Whether the value has been seen is determined by scanning, in order, all the elements visited so far.
It is easy to see that the complexity of this algorithm is about O(N^2).
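A minimal sketch of this sequential approach, written as a standalone function rather than a prototype method. It is not Realazy's actual code; the name `uniqQuadratic` and the shape of the returned object are assumptions made for illustration.

```javascript
// Quadratic uniq sketch: every element is compared with the values kept so far.
function uniqQuadratic(arr) {
  const array1 = []; // first occurrences, in order
  const array2 = []; // repeats, in order
  for (let i = 0; i < arr.length; i++) {
    let seen = false;
    // This linear scan inside the outer loop is what makes the method O(N^2).
    for (let j = 0; j < array1.length; j++) {
      if (array1[j] === arr[i]) {
        seen = true;
        break;
      }
    }
    (seen ? array2 : array1).push(arr[i]);
  }
  return { array1, array2 };
}

const result = uniqQuadratic([3, 1, 3, 2, 1]);
console.log(result.array1); // [3, 1, 2]
console.log(result.array2); // [3, 1]
```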

I have made some improvements within his algorithmic framework. The key question is how to determine, during the traversal, whether the value of the current element has already been seen. When the values of the original array are positive integers and range = max value - min value is not very large, a simple "bucket" algorithm can be used.
Prepare a Boolean array B large enough to cover the range (range + 1 slots), with every entry initialized to false. For each value in the original array, if B[value] == true, the value has already been seen and goes into Array2; otherwise it goes into Array1 and B[value] is set to true.
This is obviously an O(N) algorithm. The cost is extra space proportional to range, and the values of the original array must be positive integers.
It is not difficult to generalize to the case where the values are arbitrary integers: use value - min(Array) as the bucket index, which maps every value to a non-negative integer.

To avoid the space wasted when the range is large, the "bucket" algorithm can be refined into a hash algorithm. Specifically, linear congruential hashing is used. The goal is to compress the value range into a small, controllable set of consecutive non-negative integers, while keeping the probability that different inputs map to the same key as small as possible; in other words, the load should be balanced across the buckets as evenly as possible.
For example, this is a hash function for real numbers:
key = hashfun(value) = Math.floor(value) * 37 % 91
This is still an O(N) algorithm (obviously O(N) is a lower bound on the complexity of any uniq algorithm). The advantage is that the space overhead can be controlled and that the method adapts to non-integer values: you only need to design a suitable hash function.
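The example hash function can be tried directly. The sketch below only evaluates it on a few sample inputs, chosen here for illustration, to show how distinct values can share a key.

```javascript
// The example hash from the text: floor, multiply by 37, reduce modulo 91.
function hashfun(value) {
  return (Math.floor(value) * 37) % 91;
}

console.log(hashfun(3.7)); // 20  (floor gives 3, 3 * 37 = 111, 111 % 91 = 20)
console.log(hashfun(3.2)); // 20  (same integer part, same bucket)
console.log(hashfun(94));  // 20  (94 = 3 + 91, so it collides with 3)
console.log(hashfun(4));   // 57  (4 * 37 = 148, 148 % 91 = 57)
```

Because different values can land in the same bucket, the key alone cannot decide whether a value is a repeat; the actual values stored in the bucket still have to be compared. Note also that in JavaScript the % operator returns a negative result for a negative left operand, so this function as written assumes non-negative input.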

The bucket algorithm is implemented as follows:
Array.prototype.uniq = function () {
  var resultarr = [],   // first occurrences (Array1)
      returnarr = [],   // removed repeats (Array2)
      origlen = this.length,
      resultlen;
  if (origlen === 0) return returnarr; // nothing to do for an empty array
  // Find the value range so that values can be mapped to bucket indices.
  var maxv = this[0], minv = this[0];
  for (var i = 1; i < origlen; ++i) {
    if (this[i] > maxv) maxv = this[i];
    else if (this[i] < minv) minv = this[i];
  }
  var blen = maxv - minv + 1;
  var b = new Array(blen);
  for (var i = 0; i < blen; ++i) b[i] = false;
  for (var i = 0; i < origlen; ++i) {
    if (b[this[i] - minv]) {
      returnarr.push(this[i]);
    } else {
      resultarr.push(this[i]);
      b[this[i] - minv] = true;
    }
  }
  // Overwrite the array in place with the unique values.
  resultlen = resultarr.length;
  this.length = resultlen;
  for (var i = 0; i < resultlen; ++i) {
    this[i] = resultarr[i];
  }
  return returnarr;
};
The hash algorithm is implemented as follows:
Array.prototype.uniq = function () {
  var shuffler = 37;
  var beta = 0.007;
  var origlen = this.length;
  var bucketsize = Math.ceil(origlen * beta);
  var hashset = new Array(bucketsize);
  // Maps a non-negative number to a bucket index in [0, bucketsize).
  var hashfun = function (value) {
    var key = (Math.floor(value) * shuffler) % bucketsize;
    return key;
  };
  // Init hashset: one empty list per bucket
  for (var i = 0; i < bucketsize; i++) hashset[i] = [];
  var ret = [], self = [];
  var key, value;
  var bucket, openlen;
  var everconflict;
  for (var i = 0; i < origlen; i++) {
    value = this[i];
    key = hashfun(value);
    bucket = hashset[key];
    openlen = bucket.length; // if (openlen > 1) return;
    everconflict = false;
    // Compare against the values already stored in this bucket.
    for (var j = 0; j < openlen; j++) {
      if (bucket[j] == value) {
        ret.push(value);
        everconflict = true;
        break;
      }
    }
    if (!everconflict) {
      bucket.push(value);
      self.push(value);
    }
  }
  // Overwrite the array in place with the unique values.
  var selflen = self.length;
  this.length = selflen;
  for (i = 0; i < selflen; ++i) {
    this[i] = self[i];
  }
  // Compute average bucket size (loop body truncated in the source; assumed here)
  var lens = [], sum = 0;
  for (var i = 0; i < hashset.length; ++i) {
    lens.push(hashset[i].length);
    sum += hashset[i].length;
  }
  var average = sum / hashset.length; // watch lens, average
  return ret;
};

Test: K * 10000 random integers in the range 0 ~ K * 100; computation time in ms.
K          1     2     3     4     5
Realazy  240   693  1399  2301  3807
Bucket    55   101   141   219   293
Hash     214   411   654   844  1083
The test framework is adapted from http://realazy.org/lab/uniq.html
Test environment: Firefox 2.0.0.6 / Ubuntu 7.10 / 2.66 GHz P4 / 1024 MB DDR
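Test input of the kind described above could be generated and timed along the following lines. This is only a sketch and does not reproduce the code of the test page; the names `makeTestData`, `timeIt` and `dedupe` are hypothetical, the upper bound K * 100 is assumed to be inclusive, and `dedupe` is a stand-in based on the built-in Set, not one of the three algorithms measured.

```javascript
// Build k * 10000 random integers in the range 0 .. k * 100 (assumed inclusive).
function makeTestData(k) {
  const data = new Array(k * 10000);
  for (let i = 0; i < data.length; i++) {
    data[i] = Math.floor(Math.random() * (k * 100 + 1));
  }
  return data;
}

// Time one call of fn on a fresh copy of the data, in milliseconds.
function timeIt(fn, data) {
  const copy = data.slice(); // copy first, since uniq may modify its input
  const start = Date.now();
  fn(copy);
  return Date.now() - start;
}

// Stand-in for a uniq implementation; replace with the version under test.
function dedupe(arr) {
  return Array.from(new Set(arr));
}

for (let k = 1; k <= 5; k++) {
  console.log("K = " + k + ": " + timeIt(dedupe, makeTestData(k)) + " ms");
}
```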
