Binary search compared with brute-force search.
Where possible, our test cases demonstrate the need for the algorithm at hand by simulating a real situation. The scenario used here is called whitelist filtering. Specifically, imagine a credit card company that needs to check whether the account used in a customer's transaction is valid. To do this, it needs to:
- Keep the customers' account numbers in a file, which we call the whitelist;
- Read the account number of each transaction from standard input;
- Use the test case to print to standard output every account number that does not belong to any customer; the company will probably reject such transactions.
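The three steps above can be sketched in a few lines of Java. This is only an illustration, not the author's program: the class name WhitelistFilterSketch, the method rejected, and the sample account numbers are all made up, and the standard library's Arrays.binarySearch stands in for the rank() methods discussed in this post.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: class name, method name and sample data are hypothetical.
class WhitelistFilterSketch {

    // Returns the transaction accounts that do not appear in the whitelist,
    // in the order in which they were seen.
    static List<Integer> rejected(int[] whitelist, int[] transactions) {
        // Sort a copy so that the caller's array is left untouched.
        int[] sorted = whitelist.clone();
        Arrays.sort(sorted);

        List<Integer> result = new ArrayList<>();
        for (int account : transactions) {
            // Arrays.binarySearch returns a negative value when the key is absent.
            if (Arrays.binarySearch(sorted, account) < 0) {
                result.add(account);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        int[] whitelist = {23, 10, 18, 12, 16, 11};      // made-up accounts
        int[] transactions = {23, 50, 10, 99, 18, 13};   // made-up transactions
        for (int account : rejected(whitelist, transactions)) {
            System.out.println(account);
        }
    }
}
```

With the sample data, the accounts 50, 99 and 13 are not on the whitelist, so those are the ones printed.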
In a large company with millions of customers, processing millions of transactions or more is normal. To simulate this, we provide largeW.txt (1 million entries, about 6.7 MB) and largeT.txt (10 million entries, about 86 MB), where largeW.txt is the whitelist and largeT.txt is the target file.
Write a program BruteForceSearch that uses the brute-force (sequential) search method, and compare on your computer the time it needs to process largeW.txt (1 million entries, about 6.7 MB) and largeT.txt (10 million entries, about 86 MB) with the time BinarySearch needs.
Description
The brute-force search described in Algorithms, Fourth Edition is a sequential search; see the code:
    public static int rank(int key, int[] a) {
        // Check every element in turn; the array does not need to be sorted.
        for (int i = 0; i < a.length; i++) {
            if (a[i] == key) return i;
        }
        return -1;
    }
Performance:
A program that merely works is often not enough. For example, the implementation of rank() above is very simple: it checks every element of the array and does not even require the array to be sorted.
With such an easy-to-understand solution, why do we need mergesort and binary search? Because a computer using the brute-force rank() handles large inputs (such as a whitelist with 1 million entries and 10 million transactions) very slowly. Without efficient algorithms such as binary search and mergesort, large whitelist problems cannot be solved in practice. Good performance is often of paramount importance.
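To make the gap concrete, the worst-case number of array probes can be estimated for the sizes quoted above (a whitelist of 1 million entries and 10 million transactions). The sketch below is a back-of-the-envelope calculation, not a measurement; the class and method names are hypothetical, and it counts only search probes, leaving out the one-off cost of sorting the whitelist.

```java
// Back-of-the-envelope estimate; class and method names are hypothetical.
class SearchCostEstimate {

    // Worst case for one sequential search: all n entries are examined.
    static long sequentialProbes(long n) {
        return n;
    }

    // Worst case for one binary search over n >= 1 entries:
    // floor(log2(n)) + 1 loop iterations, which equals the bit length of n.
    static long binaryProbes(long n) {
        return 64 - Long.numberOfLeadingZeros(n);
    }

    public static void main(String[] args) {
        long whitelistSize = 1_000_000L;  // entries in the whitelist
        long transactions = 10_000_000L;  // lookups to perform

        long brute = transactions * sequentialProbes(whitelistSize);
        long binary = transactions * binaryProbes(whitelistSize);

        System.out.println("Brute force, worst case:   " + brute + " probes");
        System.out.println("Binary search, worst case: " + binary + " probes");
        System.out.println("Ratio: about " + (brute / binary) + "x");
    }
}
```

Under these assumptions the brute-force method may need on the order of 10^13 probes, while binary search needs at most 20 probes per lookup, about 2 x 10^8 in total.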
Binary search code:
    public static int rank(int key, int[] a) {
        int lo = 0;
        int hi = a.length - 1;
        while (lo <= hi) {
            // key is in a[lo..hi] or not present.
            int mid = lo + (hi - lo) / 2;
            if (key < a[mid]) hi = mid - 1;
            else if (key > a[mid]) lo = mid + 1;
            else return mid;
        }
        return -1;
    }
Experiment Code:
    package com.beyond.algs4.experiment;

    import java.io.File;
    import java.util.Arrays;

    import com.beyond.algs4.lib.BinarySearch;
    import com.beyond.algs4.lib.StdIn;
    import com.beyond.algs4.lib.StdOut;
    import com.beyond.algs4.std.In;

    public class PerfBruteForceSearch {

        /**
         * @param args
         */
        public static void main(String[] args) {
            // The two file names are read from standard input.
            String whitelist = StdIn.readString();
            int[] whitelistArray = readlist(whitelist);
            String targetList = StdIn.readString();
            int[] targetListArray = readlist(targetList);

            long t1 = System.currentTimeMillis();
            // for (int i = 0; i < targetListArray.length; i++) {
            //     BruteForceSearch.rank(targetListArray[i], whitelistArray);
            // }
            // StdOut.println(String.format("BruteForceSearch in %d milliseconds", (long) (System.currentTimeMillis() - t1)));
            //
            // t1 = System.currentTimeMillis();
            Arrays.sort(whitelistArray);
            for (int i = 0; i < targetListArray.length; i++) {
                BinarySearch.rank(targetListArray[i], whitelistArray);
            }
            StdOut.println(String.format("BinarySearch in %d milliseconds", (long) (System.currentTimeMillis() - t1)));
        }

        private static int[] readlist(String whitelist) {
            File fWhitelist = new File(whitelist);
            In in = new In(fWhitelist);
            int[] whitelistArray = in.readAllInts();
            return whitelistArray;
        }
    }
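The experiment code depends on the com.beyond.algs4 classes and on the book's data files. For readers who want to try the comparison without them, here is a self-contained sketch under stated assumptions: the class name SearchTimingSketch, the random seed and the array sizes (20,000 each, chosen so that the brute-force loop finishes quickly) are all hypothetical, and the data is synthetic rather than largeW.txt and largeT.txt. As in the experiment, the sort is timed together with the binary searches.

```java
import java.util.Arrays;
import java.util.Random;

// Self-contained timing sketch with synthetic data; names and sizes are hypothetical.
class SearchTimingSketch {

    // Sequential search: index of key in a, or -1 if absent.
    static int bruteForceRank(int key, int[] a) {
        for (int i = 0; i < a.length; i++) {
            if (a[i] == key) return i;
        }
        return -1;
    }

    // Binary search: a must be sorted in ascending order.
    static int binaryRank(int key, int[] a) {
        int lo = 0;
        int hi = a.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;
            if (key < a[mid]) hi = mid - 1;
            else if (key > a[mid]) lo = mid + 1;
            else return mid;
        }
        return -1;
    }

    public static void main(String[] args) {
        int n = 20_000;  // whitelist size (assumption)
        int m = 20_000;  // number of lookups (assumption)
        Random random = new Random(42);
        int[] whitelist = random.ints(n, 0, 1_000_000).toArray();
        int[] targets = random.ints(m, 0, 1_000_000).toArray();

        // Counting the misses keeps the loops from being optimized away
        // and lets the two methods be checked against each other.
        long start = System.currentTimeMillis();
        int bruteMisses = 0;
        for (int target : targets) {
            if (bruteForceRank(target, whitelist) < 0) bruteMisses++;
        }
        long bruteMillis = System.currentTimeMillis() - start;

        start = System.currentTimeMillis();
        int[] sorted = whitelist.clone();
        Arrays.sort(sorted);
        int binaryMisses = 0;
        for (int target : targets) {
            if (binaryRank(target, sorted) < 0) binaryMisses++;
        }
        long binaryMillis = System.currentTimeMillis() - start;

        System.out.println("BruteForceSearch in " + bruteMillis + " milliseconds, misses: " + bruteMisses);
        System.out.println("BinarySearch in " + binaryMillis + " milliseconds, misses: " + binaryMisses);
    }
}
```

The two miss counts should agree; the timings depend on the machine and the JVM, so no particular figures are claimed here.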
Experimental results:
1) tinyW.txt and tinyT.txt
./tinyw.txt
./tinyt.txt
BruteForceSearch in 0 milliseconds
BinarySearch in 1 milliseconds
2) largeW.txt and largeT.txt (BruteForceSearch takes hours)
./largew.txt
./larget.txt
BinarySearch in 2399 milliseconds
Additional notes:
The experiment does not count the time spent reading the test data files.
The experiment also ignores any effect that running BruteForceSearch and BinarySearch one after the other may have on each other's timings.
Attention
An error may occur when memory is insufficient: Java heap space
    ./LargeW.txt
    ./LargeT.txt
    Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at java.nio.HeapCharBuffer.<init>(Unknown Source)
        at java.nio.CharBuffer.allocate(Unknown Source)
        at java.util.Scanner.makeSpace(Unknown Source)
        at java.util.Scanner.readInput(Unknown Source)
        at java.util.Scanner.next(Unknown Source)
        at com.beyond.algs4.std.In.readAll(In.java:247)
        at com.beyond.algs4.std.In.readAllStrings(In.java:322)
        at com.beyond.algs4.std.In.readAllInts(In.java:348)
        at com.beyond.algs4.experiment.PerfBruteForceSearch.readlist(PerfBruteForceSearch.java:39)
        at com.beyond.algs4.experiment.PerfBruteForceSearch.main(PerfBruteForceSearch.java:20)
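The trace shows the heap running out while In.readAll buffers the file through a Scanner. Besides giving the JVM more memory, one possible alternative is to parse the file line by line so that the whole text is never held in memory at once. The sketch below is an assumption-laden illustration and not part of the algs4 library: the class StreamingIntReader and its readAllInts method are hypothetical, and whether this avoids the error on a given machine has not been measured here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper, not part of algs4: reads integers one line at a time.
class StreamingIntReader {

    // Reads whitespace-separated integers into an int[] that grows by doubling.
    static int[] readAllInts(Reader source) {
        int[] values = new int[1024];
        int count = 0;
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                for (String token : line.trim().split("\\s+")) {
                    if (token.isEmpty()) continue;  // blank line
                    if (count == values.length) {
                        values = Arrays.copyOf(values, 2 * count);
                    }
                    values[count++] = Integer.parseInt(token);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Trim the array to the number of integers actually read.
        return Arrays.copyOf(values, count);
    }

    public static void main(String[] args) throws IOException {
        // With a file name argument, read that file; otherwise use a tiny sample.
        Reader source = args.length > 0
                ? Files.newBufferedReader(Paths.get(args[0]))
                : new StringReader(" 84\n 48\n 68\n");
        System.out.println(readAllInts(source).length + " integers read");
    }
}
```

Only the current line and the int array are kept in memory, so the peak usage is governed mainly by the size of the resulting array rather than by the size of the text file.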
With the win32_x86 (32-bit) version of Eclipse, raising the setting in eclipse.ini above 1024m (for example to -Xmx2048m) makes Eclipse fail with the error "Failed to create the Java Virtual Machine". Switching to the 64-bit version is recommended.
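When adjusting heap settings, it can help to check how much heap the JVM running the program actually has. The following is a minimal sketch using the standard Runtime.maxMemory() call; the class name HeapCheck is hypothetical, and the reported figure may differ slightly from the configured -Xmx value.

```java
// Minimal sketch: prints the maximum heap available to the running JVM.
class HeapCheck {

    // Maximum heap size the JVM will attempt to use, in megabytes.
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMegabytes() + " MB");
    }
}
```

Running it in the same way as the experiment (same IDE launch or same command line) shows the limit that applies to that run.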
Basic Computer Configuration
Processor: Intel(R) Pentium(R) CPU G3220 @ 3.00GHz
Memory: 8.00GB
System type: 64-bit operating system
Software Environment
IDE: Eclipse, Version: Mars Release (4.5.0)
JVM (eclipse.ini):
    -startup
    plugins/org.eclipse.equinox.launcher_1.3.100.v20150511-1540.jar
    --launcher.library
    plugins/org.eclipse.equinox.launcher.win32.win32.x86_64_1.1.300.v20150602-1417
    -product
    org.eclipse.epp.package.jee.product
    --launcher.defaultAction
    openFile
    --launcher.XXMaxPermSize
    256M
    -showsplash
    org.eclipse.platform
    --launcher.XXMaxPermSize
    256m
    --launcher.defaultAction
    openFile
    --launcher.appendVmargs
    -vmargs
    -Dosgi.requiredJavaVersion=1.7
    -Xms512m
    -Xmx2048m
Resources:
Algorithms, Fourth Edition, [US] Robert Sedgewick and Kevin Wayne; Chinese edition translated by She Luyun
http://algs4.cs.princeton.edu/home/
Source download link:
http://pan.baidu.com/s/1eQlhUt8