Uniq methods for JavaScript arrays

Source: Internet
Author: User
This adds a prototype method to the built-in Array object that removes duplicate entries from the array (a value may be duplicated several times). The return value is a new array containing the duplicate entries that were removed.

Formal description:
Input
Array (size = n)
Output
Array1 = a subset of Array with no repeated values, in the original order.
No repetition means that for any two elements a, b of Array1, a != b.
Order preservation means that if a comes before b in Array, then a also comes before b in Array1.
Array2 = Array - Array1, also in the original order.
Realazy proposed a new solution, and the idea is very clear: traverse the elements in order; if an element's value has already been seen, append it to Array2, otherwise append it to Array1. Whether the current value has been seen is decided by scanning, in order, all the elements visited so far.
It is easy to see that the complexity of this algorithm is about O(n^2).
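The quadratic approach described above can be sketched as follows. This is only an illustration of the idea, not Realazy's actual code; the function name uniqQuadratic and the returned object shape are assumptions made for the example.

```javascript
// Hypothetical helper (not from the original): order-preserving uniq
// that checks each element against everything kept so far.
function uniqQuadratic(arr) {
  var kept = [];     // Array1: first occurrences, in original order
  var removed = [];  // Array2: later duplicates, in original order
  for (var i = 0; i < arr.length; ++i) {
    var seen = false;
    // Linear scan over the values kept so far; this inner loop is
    // what makes the whole algorithm O(n^2).
    for (var j = 0; j < kept.length; ++j) {
      if (kept[j] === arr[i]) {
        seen = true;
        break;
      }
    }
    if (seen) {
      removed.push(arr[i]);
    } else {
      kept.push(arr[i]);
    }
  }
  return { kept: kept, removed: removed };
}

var r = uniqQuadratic([3, 1, 3, 2, 1, 3]);
// r.kept    is [3, 1, 2]
// r.removed is [3, 1, 3]
```

Scanning only the kept values (rather than every visited element) gives the same answer, because every visited value already appears once among the kept ones.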

I made some improvements within his algorithmic framework. The key question is how to decide, during the traversal, whether the current element's value has already been seen. When the values of the original array are positive integers and their range (range = max value - min value) is not too large, a simple "bucket" algorithm can be used.
Prepare a Boolean array b with one slot per possible value (range + 1 slots), initialized to all false. For each value in the original array: if b[value] is true, the value has been seen before and goes into Array2; otherwise it goes into Array1 and b[value] is set to true.
This is clearly an O(n) algorithm (more precisely O(n + range), counting the initialization of b), at the cost of extra space proportional to the range, and it requires the values of the original array to be positive integers.
It is not hard to generalize to arbitrary integer values: use value - min(Array) as the bucket index, which is always a non-negative integer.

To avoid wasting space when the range is too large, the "bucket" algorithm can be refined into a hashing algorithm, more specifically open hashing (separate chaining) with a linear congruential hash function. The aim is to compress the value range into a manageable set of non-negative integers while keeping the probability that different values map to the same key as small as possible; in other words, the load should be balanced across the buckets.
For example, here is a hash function for real-valued input:
key = hashfun(value) = Math.floor(value) * 37 % 91
This is still an O(n) algorithm on average (O(n) is clearly a lower bound for any uniq algorithm, since every element must be read at least once). Its advantages are that the space cost can be controlled and that it adapts to non-integer values; only a suitable hash function needs to be designed.
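To make the mapping concrete, here is a small sketch that evaluates the example hash function on a few non-negative values. The constants 37 and 91 come from the text; the sample inputs are chosen only for illustration.

```javascript
// The example hash function from the text:
// floor the value, multiply by 37, reduce modulo 91.
function hashFun(value) {
  return (Math.floor(value) * 37) % 91;
}

// Sample inputs (assumed for this example, all non-negative).
var keys = [0, 1, 2.5, 3.9, 91, 92].map(hashFun);
// keys is [0, 37, 74, 20, 0, 37]
// 0 and 91 share bucket 0, and 1 and 92 share bucket 37:
// distinct values can collide, so each bucket must still be searched.
```

Since 37 and 91 share no common factor, integers that differ by less than 91 always land in different buckets. Note that in JavaScript the % operator returns a negative result for negative operands, so this function only yields valid bucket indices for non-negative values.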



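The following is the implementation of the bucket algorithm:
// The method name uniq is assumed; the original listing shows only the function body.
Array.prototype.uniq = function () {
    var resultArr = [],   // first occurrences (Array1)
        returnArr = [],   // removed duplicates (Array2)
        origLen = this.length,
        resultLen;
    if (origLen === 0) return returnArr;   // nothing to do for an empty array
    // Find the maximum and minimum values (integer values are assumed).
    var maxv = this[0], minv = this[0];
    for (var i = 1; i < origLen; ++i) {
        if (this[i] > maxv) maxv = this[i];
        else if (this[i] < minv) minv = this[i];
    }
    // One Boolean bucket per possible value; the bucket index is value - minv.
    var blen = maxv - minv + 1;
    var b = new Array(blen);
    for (var i = 0; i < blen; ++i) b[i] = false;
    for (var i = 0; i < origLen; ++i) {
        if (b[this[i] - minv]) {
            returnArr.push(this[i]);
        } else {
            resultArr.push(this[i]);
            b[this[i] - minv] = true;
        }
    }
    // Overwrite the array in place with the unique values.
    resultLen = resultArr.length;
    this.length = resultLen;
    for (var i = 0; i < resultLen; ++i) {
        this[i] = resultArr[i];
    }
    return returnArr;
};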
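The following is the implementation of the hash algorithm:
// The method name uniq is assumed; the original listing shows only the function body.
Array.prototype.uniq = function () {
    var shuffler = 37;
    var beta = 0.007;   // number of buckets per array element
    var origLen = this.length;
    var bucketSize = Math.ceil(origLen * beta);
    var hashSet = new Array(bucketSize);
    var hashFun = function (value) {
        var key = (Math.floor(value) * shuffler) % bucketSize;
        if (key < 0) key += bucketSize;   // % can be negative for negative values
        return key;
    };
    // init hashSet: one empty chain per bucket
    for (var i = 0; i < bucketSize; i++) hashSet[i] = new Array();
    //
    var ret = [], self = [];
    var key, value;
    var bucket, openLen;
    var everConflict;
    for (var i = 0; i < origLen; i++) {
        value = this[i];
        key = hashFun(value);
        bucket = hashSet[key];
        openLen = bucket.length; // if (openLen > 1) return;
        everConflict = false;
        // Search the chain of this bucket for the value.
        for (var j = 0; j < openLen; j++) {
            if (bucket[j] == value) {
                ret.push(value);
                everConflict = true;
                break;
            }
        }
        if (!everConflict) {
            bucket.push(value);
            self.push(value);
        }
    }
    // Overwrite the array in place with the unique values.
    var selfLen = self.length;
    this.length = selfLen;
    for (var i = 0; i < selfLen; ++i) {
        this[i] = self[i];
    }
    // compute average bucket size
    var lens = [], sum = 0, average;
    // The body of this loop was lost in the source; it is reconstructed from the variable names.
    for (var i = 0; i < hashSet.length; ++i) { lens.push(hashSet[i].length); sum += hashSet[i].length; }
    average = sum / hashSet.length; // watch lens, average
    return ret;
};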
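On engines that provide the built-in Set (ES2015 and later), the hand-written hash table can be swapped for it. The sketch below is not part of the original article; it substitutes Set for the bucket chains, and the name uniqWithSet and the returned object shape are assumptions. Unlike the hash function above, it needs no numeric values, but Set compares values with SameValueZero rather than ==.

```javascript
// Hypothetical helper (not from the original): order-preserving uniq
// using the built-in Set as the "already seen" lookup structure.
function uniqWithSet(arr) {
  var seen = new Set();
  var kept = [];     // Array1: first occurrences, in original order
  var removed = [];  // Array2: later duplicates, in original order
  for (var i = 0; i < arr.length; ++i) {
    if (seen.has(arr[i])) {
      removed.push(arr[i]);
    } else {
      seen.add(arr[i]);
      kept.push(arr[i]);
    }
  }
  return { kept: kept, removed: removed };
}

var s = uniqWithSet([5, 2.5, 5, -1, 2.5]);
// s.kept    is [5, 2.5, -1]
// s.removed is [5, 2.5]
```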


Running time (ms) on k*10000 random integers in the range 0 to k*100:
k         1     2     3     4     5
Realazy   240   693   1399  2301  3807
Bucket    55    101   141   219   293
Hash      214   411   654   844   1083
The test framework is borrowed from http://realazy.org/lab/uniq.html
Test environment: Firefox 2.0.0.6 / Ubuntu 7.10 / 2.66 GHz P4 / 1024 MB DDR
