"Algorithm" binary search with brute force search (Whitelist filter)

Source: Internet
Author: User

Binary search versus brute-force search.

Where possible, our test cases demonstrate why the algorithm at hand is needed by simulating a realistic situation. The process here is called whitelist filtering. Specifically, imagine a credit card company that needs to check whether a customer's transaction account is valid. To do this, it needs to:

    • Keep its customers' account numbers in a file, which we call the whitelist;
    • Read the account number of each transaction from standard input;
    • Use the test client to print to standard output every account number that does not belong to any customer; the company would presumably reject such transactions.
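The three steps above can be sketched as a small Java program. This is only an illustration under stated assumptions, not the book's test client: the class name WhitelistFilterSketch and the method rejected are hypothetical, the data is made up, and the JDK's Arrays.binarySearch on in-memory arrays stands in for reading the whitelist file and standard input.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class WhitelistFilterSketch {

    // Returns the transaction account numbers that are NOT in the whitelist,
    // in the order in which they appear in the transactions array.
    static List<Integer> rejected(int[] whitelist, int[] transactions) {
        int[] sorted = whitelist.clone();   // leave the caller's array untouched
        Arrays.sort(sorted);                // binary search needs sorted input
        List<Integer> result = new ArrayList<>();
        for (int account : transactions) {
            // Arrays.binarySearch returns a negative value when the key is absent.
            if (Arrays.binarySearch(sorted, account) < 0) {
                result.add(account);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        int[] whitelist = {30, 10, 20};          // made-up account numbers
        int[] transactions = {10, 15, 30, 40};   // made-up transactions
        for (int account : rejected(whitelist, transactions)) {
            System.out.println(account);         // prints 15, then 40
        }
    }
}
```

The whitelist is sorted once, after which every transaction costs only a logarithmic number of comparisons.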

In a large company with millions of customers, processing millions of transactions or more is routine. To simulate this, the files largeW.txt (1 million integers, about 6.7 MB) and largeT.txt (10 million integers, about 86 MB) are provided, where largeW.txt is the whitelist and largeT.txt is the target file.

Exercise: write a program BruteForceSearch that uses the brute-force (sequential) search method, and compare on your computer the time it and BinarySearch need to process largeW.txt (1 million integers, about 6.7 MB) and largeT.txt (10 million integers, about 86 MB).

Description

The brute-force search described in Algorithms, Fourth Edition is a sequential search; see the code:

    public static int rank(int key, int[] a) {
        // Check every element in turn; the array does not need to be sorted.
        for (int i = 0; i < a.length; i++) {
            if (a[i] == key) return i;
        }
        return -1;
    }


Performance:

A program that merely works is often not enough. For example, the implementation of rank() above is very simple: it checks every element of the array and does not even need the array to be sorted.

With such an easy-to-understand solution, why do we need mergesort and binary search? Because a computer using the brute-force rank() is far too slow on large inputs (such as a whitelist with 1 million entries and 10 million transactions). Without efficient algorithms such as binary search and mergesort, the whitelist problem cannot be solved at large scale. Good performance is often of paramount importance.
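To make the gap concrete, a rough worst-case count can be worked out for the input sizes mentioned above. The sketch below is only an estimate, assuming every lookup is a miss (the worst case for both methods); the class and method names are hypothetical.

```java
class ComparisonEstimate {

    // Worst case for sequential search: every lookup scans all n entries.
    static long bruteForceComparisons(long n, long lookups) {
        return n * lookups;
    }

    // Worst case for binary search on n >= 1 entries:
    // each lookup runs the loop floor(log2 n) + 1 times.
    static long binarySearchIterations(long n, long lookups) {
        int stepsPerLookup = 64 - Long.numberOfLeadingZeros(n); // floor(log2 n) + 1
        return stepsPerLookup * lookups;
    }

    public static void main(String[] args) {
        long n = 1_000_000L;         // whitelist entries
        long lookups = 10_000_000L;  // transactions
        System.out.println("brute force:   " + bruteForceComparisons(n, lookups));   // 10000000000000
        System.out.println("binary search: " + binarySearchIterations(n, lookups));  // 200000000
    }
}
```

That is about 10^13 element comparisons against about 2 × 10^8 loop iterations, a factor of roughly 50,000, before the one-time cost of sorting the whitelist is added to the binary search side.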

Binary search code:

    public static int rank(int key, int[] a) {
        int lo = 0;
        int hi = a.length - 1;
        while (lo <= hi) {
            // Key is in a[lo..hi] or not present.
            int mid = lo + (hi - lo) / 2;
            if      (key < a[mid]) hi = mid - 1;
            else if (key > a[mid]) lo = mid + 1;
            else return mid;
        }
        return -1;
    }
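One quick way to gain confidence in the two rank() methods is to check that they agree. The sketch below copies both into one hypothetical class, RankCheck, and compares them on a small sorted array of distinct, made-up values (with duplicates the two methods could return different but equally valid indices).

```java
class RankCheck {

    // Sequential search: index of key in a, or -1 if absent. Works on unsorted input.
    static int bruteForceRank(int key, int[] a) {
        for (int i = 0; i < a.length; i++) {
            if (a[i] == key) return i;
        }
        return -1;
    }

    // Binary search: index of key in the SORTED array a, or -1 if absent.
    static int binaryRank(int key, int[] a) {
        int lo = 0;
        int hi = a.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;
            if      (key < a[mid]) hi = mid - 1;
            else if (key > a[mid]) lo = mid + 1;
            else return mid;
        }
        return -1;
    }

    // True if both methods return the same result for every key in [from, to].
    static boolean agree(int[] sortedDistinct, int from, int to) {
        for (int key = from; key <= to; key++) {
            if (bruteForceRank(key, sortedDistinct) != binaryRank(key, sortedDistinct)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[] a = {10, 11, 12, 16, 18, 23, 29, 33};
        System.out.println(agree(a, 0, 40));    // prints "true"
        System.out.println(binaryRank(23, a));  // prints "5"
        System.out.println(binaryRank(24, a));  // prints "-1"
    }
}
```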


Experiment code:

    package com.beyond.algs4.experiment;

    import java.io.File;
    import java.util.Arrays;

    import com.beyond.algs4.lib.BinarySearch;
    import com.beyond.algs4.lib.StdIn;
    import com.beyond.algs4.lib.StdOut;
    import com.beyond.algs4.std.In;

    public class PerfBruteForceSearch {

        /**
         * @param args
         */
        public static void main(String[] args) {
            String whitelist = StdIn.readString();
            int[] whitelistArray = readlist(whitelist);
            String targetList = StdIn.readString();
            int[] targetListArray = readlist(targetList);

            long t1 = System.currentTimeMillis();
    //        for (int i = 0; i < targetListArray.length; i++) {
    //            BruteForceSearch.rank(targetListArray[i], whitelistArray);
    //        }
    //        StdOut.println(String.format("BruteForceSearch in %d milliseconds", (long) (System.currentTimeMillis() - t1)));
    //
    //        t1 = System.currentTimeMillis();
            Arrays.sort(whitelistArray);
            for (int i = 0; i < targetListArray.length; i++) {
                BinarySearch.rank(targetListArray[i], whitelistArray);
            }
            StdOut.println(String.format("BinarySearch in %d milliseconds", (long) (System.currentTimeMillis() - t1)));
        }

        private static int[] readlist(String whitelist) {
            File fWhitelist = new File(whitelist);
            In in = new In(fWhitelist);
            int[] whitelistArray = in.readAllInts();
            return whitelistArray;
        }

    }

Experimental results:

1) tinyW.txt and tinyT.txt

./tinyw.txt
./tinyt.txt
BruteForceSearch in 0 milliseconds
BinarySearch in 1 milliseconds

2) largeW.txt and largeT.txt (BruteForceSearch takes hours)

./largew.txt
./larget.txt
BinarySearch in 2399 milliseconds

Additional notes:

The measured times exclude the cost of reading the test data files.

The experiment ignores any effect that the BruteForceSearch and BinarySearch runs may have on each other.
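The experiment can also be tried at a smaller scale without the data files. The following is a sketch under stated assumptions, not the original test program: the class SmallScaleTiming and its methods are hypothetical, the input is random data from a fixed seed, and the sizes are chosen only so that the brute-force run finishes quickly. Each phase is timed separately with System.nanoTime().

```java
import java.util.Arrays;
import java.util.Random;

class SmallScaleTiming {

    static int bruteForceRank(int key, int[] a) {
        for (int i = 0; i < a.length; i++) {
            if (a[i] == key) return i;
        }
        return -1;
    }

    static int binaryRank(int key, int[] a) {
        int lo = 0;
        int hi = a.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;
            if      (key < a[mid]) hi = mid - 1;
            else if (key > a[mid]) lo = mid + 1;
            else return mid;
        }
        return -1;
    }

    // Counting the hits gives each loop an observable result,
    // so the work cannot simply be optimized away.
    static int countHitsBruteForce(int[] targets, int[] whitelist) {
        int hits = 0;
        for (int t : targets) {
            if (bruteForceRank(t, whitelist) >= 0) hits++;
        }
        return hits;
    }

    static int countHitsBinary(int[] targets, int[] sortedWhitelist) {
        int hits = 0;
        for (int t : targets) {
            if (binaryRank(t, sortedWhitelist) >= 0) hits++;
        }
        return hits;
    }

    // n pseudo-random values in [0, bound), reproducible through the seed.
    static int[] randomInts(int n, int bound, long seed) {
        Random rnd = new Random(seed);
        int[] a = new int[n];
        for (int i = 0; i < n; i++) {
            a[i] = rnd.nextInt(bound);
        }
        return a;
    }

    public static void main(String[] args) {
        int[] whitelist = randomInts(20_000, 1_000_000, 1L);
        int[] targets = randomInts(50_000, 1_000_000, 2L);

        long t0 = System.nanoTime();
        int hitsBrute = countHitsBruteForce(targets, whitelist);
        long t1 = System.nanoTime();

        int[] sorted = whitelist.clone();
        Arrays.sort(sorted);                       // sorting is part of the binary search cost
        int hitsBinary = countHitsBinary(targets, sorted);
        long t2 = System.nanoTime();

        System.out.printf("brute force:   %d hits in %d ms%n", hitsBrute, (t1 - t0) / 1_000_000);
        System.out.printf("binary search: %d hits in %d ms (including sort)%n", hitsBinary, (t2 - t1) / 1_000_000);
    }
}
```

Both phases should report the same number of hits; only the elapsed times differ. A single run like this is a rough indication rather than a careful benchmark, since JIT warm-up and other activity on the machine affect the numbers.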

Note

An error may occur when memory is insufficient: Java heap space

    ./LargeW.txt
    ./LargeT.txt
    Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at java.nio.HeapCharBuffer.<init>(Unknown Source)
        at java.nio.CharBuffer.allocate(Unknown Source)
        at java.util.Scanner.makeSpace(Unknown Source)
        at java.util.Scanner.readInput(Unknown Source)
        at java.util.Scanner.next(Unknown Source)
        at com.beyond.algs4.std.In.readAll(In.java:247)
        at com.beyond.algs4.std.In.readAllStrings(In.java:322)
        at com.beyond.algs4.std.In.readAllInts(In.java:348)
        at com.beyond.algs4.experiment.PerfBruteForceSearch.readlist(PerfBruteForceSearch.java:39)
        at com.beyond.algs4.experiment.PerfBruteForceSearch.main(PerfBruteForceSearch.java:20)
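The stack trace shows the error arising inside In.readAll, which goes through java.util.Scanner while the 86 MB target file is being read. One possible workaround, which is not part of the original experiment, is to parse the integers token by token so that the file's text is never held in memory all at once. The sketch below uses the JDK's java.io.StreamTokenizer; the class name StreamingIntReader is hypothetical, and it assumes the file contains only whitespace-separated integers.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.util.Arrays;

class StreamingIntReader {

    // Reads whitespace-separated integers from the reader one token at a time.
    static int[] readAllInts(Reader reader) throws IOException {
        StreamTokenizer tokens = new StreamTokenizer(new BufferedReader(reader));
        int[] values = new int[1024];
        int count = 0;
        while (tokens.nextToken() != StreamTokenizer.TT_EOF) {
            if (tokens.ttype != StreamTokenizer.TT_NUMBER) {
                throw new IOException("unexpected token on line " + tokens.lineno());
            }
            if (count == values.length) {
                values = Arrays.copyOf(values, 2 * values.length);  // grow by doubling
            }
            // StreamTokenizer reports numbers as double; int values fit exactly.
            values[count++] = (int) tokens.nval;
        }
        return Arrays.copyOf(values, count);  // trim to the number of values read
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for the file here; for a real file,
        // pass new java.io.FileReader(path) instead.
        int[] a = readAllInts(new StringReader("84 48\n68\n10\n"));
        System.out.println(Arrays.toString(a));
    }
}
```

The resulting int array for 10 million values still needs about 40 MB, plus temporary copies while it grows, so this reduces the memory needed for reading but does not remove the need for an adequate heap.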

With the 32-bit (win32_x86) build of Eclipse, raising the heap limit in eclipse.ini above 1024m (for example to -Xmx2048m) makes Eclipse fail at startup with the error "Failed to create the Java Virtual Machine". Switching to the 64-bit build is recommended.
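To confirm which heap limit the JVM running the experiment actually received, the maximum heap size can be printed from inside the program. Runtime.maxMemory() is a standard JDK call; the class name HeapCheck is hypothetical. Keep in mind that a program launched from Eclipse typically runs in its own JVM, whose heap is set through the VM arguments of the run configuration (for example -Xmx2048m), so it is worth checking the value there as well.

```java
class HeapCheck {

    // Maximum amount of memory the JVM will attempt to use, in megabytes.
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("max heap: " + maxHeapMegabytes() + " MB");
    }
}
```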

Basic Computer Configuration

Processor: Intel(R) Pentium(R) CPU G3220 @ 3.00GHz

Memory: 8.00GB

System type: 64-bit operating system

Software Environment

IDE: Eclipse, Version: Mars Release (4.5.0)

JVM settings (eclipse.ini):

    -startup
    plugins/org.eclipse.equinox.launcher_1.3.100.v20150511-1540.jar
    --launcher.library
    plugins/org.eclipse.equinox.launcher.win32.win32.x86_64_1.1.300.v20150602-1417
    -product
    org.eclipse.epp.package.jee.product
    --launcher.defaultAction
    openFile
    --launcher.XXMaxPermSize
    256M
    -showsplash
    org.eclipse.platform
    --launcher.XXMaxPermSize
    256m
    --launcher.defaultAction
    openFile
    --launcher.appendVmargs
    -vmargs
    -Dosgi.requiredJavaVersion=1.7
    -Xms512m
    -Xmx2048m

Resources:

Algorithms, Fourth Edition, by Robert Sedgewick and Kevin Wayne (US); Chinese edition translated by She Luyun

http://algs4.cs.princeton.edu/home/

Source code download link:

http://pan.baidu.com/s/1eQlhUt8