Crazy Java Algorithms: Insertion Sort, Merge Sort, and Parallel Merge Sort

Source: Internet
Author: User

Since ancient times the IT world has had one perennial hard problem, commonly known as sorting. Whether you sit a high-end interview at BAT (Baidu, Alibaba, Tencent) or a grassroots test at a small shop hidden in a back street, you will often meet this century-old puzzle. Today I have the honor of introducing two grassroots stars of the algorithm world and leaders of the sorting scene: insertion sort and merge sort. Along the way, in a moment of inspiration, I also made a new friend called parallel merge sort. Let's get to know each of them, and then finish with the "Computing Forest Conference", where the three compete.

Introduction to insertion sort

Insertion sort is the sorting algorithm best loved in the computing forest. It sorts an integer array by the simplest possible means, insertion: it loops over every element of the array starting from the second, and inserts each element into its proper position among the elements before it.

Insertion sort code

Insertion sort can be described with the following code.

    package algorithm;

    /**
     * @author zuoxiaolong
     */
    public abstract class InsertSort {

        public static void sort(int[] numbers) {
            for (int i = 1; i < numbers.length; i++) {
                int currentNumber = numbers[i];
                int j = i - 1;
                // Shift every larger element of the sorted prefix one slot to the right.
                while (j >= 0 && numbers[j] > currentNumber) {
                    numbers[j + 1] = numbers[j];
                    j--;
                }
                // Drop the selected element into the gap that was opened.
                numbers[j + 1] = currentNumber;
            }
        }
    }

The algorithm starts at the second element of the array and compares the selected element with the ones before it. While a preceding element is larger than the selected one, that element is moved one slot back; finally the selected element is placed in the position that was freed. Throughout the execution of the algorithm, the elements before index i are always sorted in ascending order.

Insertion sort is easy to understand, so I will not explain its principle at length; fellow programmers can study it on their own. Next, let's analyze the performance of insertion sort.
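Before the analysis, two properties of insertion sort can be checked by running it: that the prefix before index i is always sorted, and how many element shifts happen on a reverse-sorted input. The following is a minimal sketch, not the author's code; the class name InsertionSortCheck and the helper isSortedPrefix are made up for illustration.

```java
import java.util.Arrays;

public class InsertionSortCheck {

    // Same algorithm as InsertSort.sort, instrumented for illustration:
    // verifies the sorted-prefix invariant and returns the number of shifts.
    public static long sort(int[] numbers) {
        long shifts = 0;
        for (int i = 1; i < numbers.length; i++) {
            if (!isSortedPrefix(numbers, i)) {
                throw new IllegalStateException("prefix of length " + i + " is not sorted");
            }
            int currentNumber = numbers[i];
            int j = i - 1;
            while (j >= 0 && numbers[j] > currentNumber) {
                numbers[j + 1] = numbers[j];
                j--;
                shifts++;
            }
            numbers[j + 1] = currentNumber;
        }
        return shifts;
    }

    // True if numbers[0..length) is in ascending order.
    static boolean isSortedPrefix(int[] numbers, int length) {
        for (int k = 1; k < length; k++) {
            if (numbers[k - 1] > numbers[k]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[] reversed = {6, 5, 4, 3, 2, 1};
        long shifts = sort(reversed);
        System.out.println(Arrays.toString(reversed)); // prints [1, 2, 3, 4, 5, 6]
        System.out.println(shifts);                    // prints 15, i.e. 6 * 5 / 2
    }
}
```

On a reverse-sorted array of length n every element is shifted past all of its predecessors, so the shift count is 1 + 2 + ... + (n - 1).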
First, insertion sort contains two nested loops. Assume the array size is n. The outer loop runs n - 1 times, and in the worst case the inner while loop runs from 1 up to n - 1 times. The time complexity of insertion sort therefore has roughly the following form.

    1 + 2 + 3 + ... + (n - 1) = n(n - 1)/2 = O(n^2)

The time complexity is a quadratic function of the input size, so insertion sort is comparatively expensive. This is only a simple analysis of the principle. At the "Computing Forest Conference" at the end, you will see clearly that the running time of insertion sort grows quadratically as the input size increases.

Introduction to merge sort

Merge sort is the trend-setter of the divide-and-conquer approach. It splits the sorting problem into smaller ones, for example by dividing the array into two smaller arrays, sorting each of them separately, and then merging the two sorted arrays. This is a general algorithm design idea, and many problems can be solved this way. Mapped to programming, it is simply recursion, so the merge sort algorithm contains recursive calls.

Merge sort code

Merge sort consists mainly of two methods: one merges two sorted arrays, and the other is the recursive method that keeps splitting the problem. The Java code is shown below.

    package algorithm;

    /**
     * @author zuoxiaolong
     */
    public abstract class MergeSort {

        public static void sort(int[] numbers) {
            sort(numbers, 0, numbers.length);
        }

        // Sorts the half-open range numbers[pos, end).
        public static void sort(int[] numbers, int pos, int end) {
            if ((end - pos) > 1) {
                int offset = (end + pos) / 2;
                sort(numbers, pos, offset);
                sort(numbers, offset, end);
                merge(numbers, pos, offset, end);
            }
        }

        // Merges the sorted ranges numbers[pos, offset) and numbers[offset, end).
        public static void merge(int[] numbers, int pos, int offset, int end) {
            int[] array1 = new int[offset - pos];
            int[] array2 = new int[end - offset];
            System.arraycopy(numbers, pos, array1, 0, array1.length);
            System.arraycopy(numbers, offset, array2, 0, array2.length);
            for (int i = pos, j = 0, k = 0; i < end; i++) {
                // array1 is exhausted: copy the rest of array2 and stop.
                if (j == array1.length) {
                    System.arraycopy(array2, k, numbers, i, array2.length - k);
                    break;
                }
                // array2 is exhausted: copy the rest of array1 and stop.
                if (k == array2.length) {
                    System.arraycopy(array1, j, numbers, i, array1.length - j);
                    break;
                }
                if (array1[j] <= array2[k]) {
                    numbers[i] = array1[j++];
                } else {
                    numbers[i] = array2[k++];
                }
            }
        }
    }

Merge sort divides an array of length n into two arrays of length n/2 and processes each, so the sort method calls itself twice. An array of size 1 is considered already sorted. Therefore, in the sort method, further splitting happens only while the difference between end and pos is greater than 1; once the difference is 1 or less, the recursion terminates.

In addition, the code uses the arraycopy method provided by Java to copy arrays. Copying a memory region directly in this way is faster than assigning elements in a loop. Some implementations place a sentinel at the end of each of the two temporary arrays in the merge method, which removes the need for the first two if checks in the for loop of merge. For ease of understanding, I do not use sentinels here. When one array is exhausted, arraycopy is used to copy the rest of the other array into numbers.

Performance analysis of merge sort

As with insertion sort, let's analyze the time complexity of merge sort. Assume the array size is n and the time complexity of the sort method is f(end - pos). A quick look at the merge method shows that its cost is (end - pos) * 2, on the premise that the cost of arraycopy equals its length parameter.

Based on these assumptions, and because the initial value of end - pos is n, the complexity of merge sort has roughly the following form.

    f(n) = 2 * f(n/2) + 2n
         = 2 * (2 * f(n/4) + 2 * (n/2)) + 2n
         = 4 * f(n/4) + 2n + 2n
         = ...
         = n * f(1) + 2n + ... + 2n

Here f(1) is a constant, say f(1) = c, and the term 2n appears log2(n) times. The final time complexity of merge sort is therefore as follows.

    c * n + 2n * log2(n) = O(n * log2(n))

The time complexity of merge sort is much lower than that of insertion sort. The difference is obvious when the array is large, because the log function grows far more slowly than n.

Introduction to parallel merge sort

Parallel merge sort is an idea that occurred to me while studying merge sort. I have recently been studying concurrent programming in Java, and the subproblems of merge sort happen to be independent enough to run in parallel, so my concurrent merge sort was born. Afterwards I looked up parallel merge sort and found that it was already well known. Still, having thought of it without knowing that, perhaps I deserve a small pat on the back.

Parallel merge sort is no different from ordinary merge sort in principle. It simply uses the advantage of multi-core machines to handle two or more subproblems at the same time when possible. In terms of efficiency, parallel merge sort should therefore do better than merge sort.

Parallel merge sort code

Parallel merge sort mainly changes the sort method; the basic merge method is the same as in ordinary merge sort, so the parallel version reuses some methods of MergeSort. The code is as follows.

    package algorithm;

    import java.util.concurrent.CountDownLatch;

    /**
     * @author zuoxiaolong
     */
    public abstract class MergeParallelSort {

        // Default parallel depth: log2 of the number of available processors.
        private static final int maxAsynDepth =
                (int) (Math.log(Runtime.getRuntime().availableProcessors()) / Math.log(2));

        public static void sort(int[] numbers) {
            sort(numbers, maxAsynDepth);
        }

        public static void sort(int[] numbers, Integer asynDepth) {
            sortParallel(numbers, 0, numbers.length,
                    asynDepth > maxAsynDepth ? maxAsynDepth : asynDepth, 1);
        }

        public static void sortParallel(final int[] numbers, final int pos, final int end,
                final int asynDepth, final int depth) {
            if ((end - pos) > 1) {
                final CountDownLatch mergeSignal = new CountDownLatch(2);
                final int offset = (end + pos) / 2;
                Thread thread1 = new SortThread(depth, asynDepth, numbers, mergeSignal, pos, offset);
                Thread thread2 = new SortThread(depth, asynDepth, numbers, mergeSignal, offset, end);
                thread1.start();
                thread2.start();
                try {
                    // Wait until both halves have been sorted.
                    mergeSignal.await();
                } catch (InterruptedException e) {
                    // Merging unsorted halves would give a wrong result, so fail loudly.
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting for sub-sorts", e);
                }
                MergeSort.merge(numbers, pos, offset, end);
            }
        }

        static class SortThread extends Thread {

            private int depth;
            private int asynDepth;
            private int[] numbers;
            private CountDownLatch mergeSignal;
            private int pos;
            private int end;

            public SortThread(int depth, int asynDepth, int[] numbers,
                    CountDownLatch mergeSignal, int pos, int end) {
                super();
                this.depth = depth;
                this.asynDepth = asynDepth;
                this.numbers = numbers;
                this.mergeSignal = mergeSignal;
                this.pos = pos;
                this.end = end;
            }

            @Override
            public void run() {
                if (depth < asynDepth) {
                    // Still above the parallel depth limit: split again in parallel.
                    sortParallel(numbers, pos, end, asynDepth, (depth + 1));
                } else {
                    // Depth limit reached: fall back to ordinary merge sort.
                    MergeSort.sort(numbers, pos, end);
                }
                mergeSignal.countDown();
            }
        }
    }

A few points in this code deserve a short explanation.

1. The decomposed subproblems are handled in parallel, and the asynDepth parameter controls the parallel depth. By default the depth is log2(number of CPU cores).
2. When subproblems are no longer processed in parallel, parallel merge sort calls the ordinary merge sort methods, namely MergeSort.sort and MergeSort.merge.
3. Because the merge step depends on the completion of both subproblems, a merge signal (mergeSignal) is used. The merge is performed only after the signal has been given.
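The same divide-in-parallel idea can also be expressed with the JDK's fork/join framework instead of hand-started threads and a CountDownLatch. The following is only a sketch of that alternative, not the code from this article: the class name ForkJoinMergeSortSketch and the THRESHOLD value are assumptions chosen for illustration, the sequential fallback uses Arrays.sort rather than MergeSort.sort, and the merge copies only the left half.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

public class ForkJoinMergeSortSketch {

    // Hypothetical cutoff: ranges at or below this size are sorted sequentially.
    static final int THRESHOLD = 8192;

    static class SortTask extends RecursiveAction {
        private final int[] numbers;
        private final int pos;
        private final int end;

        SortTask(int[] numbers, int pos, int end) {
            this.numbers = numbers;
            this.pos = pos;
            this.end = end;
        }

        @Override
        protected void compute() {
            if (end - pos <= THRESHOLD) {
                // Small range: a sequential sort avoids task overhead.
                Arrays.sort(numbers, pos, end);
                return;
            }
            int offset = pos + (end - pos) / 2;
            // Sort both halves as subtasks and wait for both to finish.
            invokeAll(new SortTask(numbers, pos, offset), new SortTask(numbers, offset, end));
            merge(numbers, pos, offset, end);
        }
    }

    // Merges sorted numbers[pos, offset) and numbers[offset, end) in place,
    // using a temporary copy of the left half only.
    static void merge(int[] numbers, int pos, int offset, int end) {
        int[] left = Arrays.copyOfRange(numbers, pos, offset);
        int i = 0, j = offset, k = pos;
        while (i < left.length && j < end) {
            numbers[k++] = (left[i] <= numbers[j]) ? left[i++] : numbers[j++];
        }
        while (i < left.length) {
            numbers[k++] = left[i++];
        }
        // Any remaining right-half elements are already in their final place.
    }

    public static void sort(int[] numbers) {
        ForkJoinPool.commonPool().invoke(new SortTask(numbers, 0, numbers.length));
    }

    public static void main(String[] args) {
        Random random = new Random();
        int[] numbers = new int[1_000_000];
        for (int i = 0; i < numbers.length; i++) {
            numbers[i] = random.nextInt(numbers.length);
        }
        int[] expected = numbers.clone();
        Arrays.sort(expected);
        sort(numbers);
        System.out.println(Arrays.equals(numbers, expected)); // prints true
    }
}
```

With a size threshold instead of a depth limit, the pool decides how tasks are spread over the cores. The JDK also ships Arrays.parallelSort, which is the ready-made choice when a hand-written version is not the point.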
In principle, parallel merge sort is the same as ordinary merge sort; it only processes the subproblems in parallel to a certain extent. If you understand merge sort, parallel merge sort is not hard to understand either.

Performance analysis of parallel merge sort

Parallel merge sort only runs some of the operations of ordinary merge sort in parallel, so the overall time complexity does not change qualitatively: both are O(n * log2(n)).

Because parallel merge sort performs part of the sorting in parallel, it is usually faster than ordinary merge sort. This is not guaranteed, however. When the array is too small, the gain from parallelism may be smaller than the overhead of creating and destroying threads, and in that case parallel merge sort may be slower than ordinary merge sort.

The Computing Forest Conference

Next comes the weekly Computing Forest Conference, contested by the three algorithms above. The winner becomes the most popular algorithm of the week. The following is the code of the conference.

    package algorithm;

    import java.io.File;
    import java.lang.reflect.Method;
    import java.util.Random;

    /**
     * @author zuoxiaolong
     */
    public class SortTests {

        public static void main(String[] args) {
            testAllSortIsCorrect();
            testComputeTime("MergeParallelSort", 40000, 5);
            testComputeTime("MergeSort", 40000, 5);
            testComputeTime("InsertSort", 400, 5);
        }

        // Finds every compiled *Sort.class next to this class and checks its result.
        public static void testAllSortIsCorrect() {
            File classpath = new File(SortTests.class.getResource("").getFile());
            File[] classesFiles = classpath.listFiles();
            for (int i = 0; i < classesFiles.length; i++) {
                if (classesFiles[i].getName().endsWith("Sort.class")) {
                    System.out.println("--- testing whether " + classesFiles[i].getName() + " is valid ---");
                    testSortIsCorrect(classesFiles[i].getName().split("\\.")[0]);
                }
            }
        }

        public static void testSortIsCorrect(String className) {
            for (int i = 1; i < 50; i++) {
                int[] numbers = getRandomIntegerArray(1000 * i);
                invoke(numbers, className);
                for (int j = 1; j < numbers.length; j++) {
                    if (numbers[j] < numbers[j - 1]) {
                        throw new RuntimeException(className + " sort is wrong because "
                                + numbers[j] + " < " + numbers[j - 1]);
                    }
                }
            }
            System.out.println("--- " + className + " tested and valid ---");
        }

        // Times the sort for initNumber, initNumber * 10, ... and prints the growth ratios.
        public static void testComputeTime(String className, int initNumber, int times,
                Object... arguments) {
            long[] timeArray = new long[times];
            for (int i = initNumber, j = 0; j < times; i = i * 10, j++) {
                timeArray[j] = computeTime(i, className, arguments);
            }
            System.out.print(className + " time increase ratio: ");
            for (int i = 1; i < timeArray.length; i++) {
                System.out.print((float) timeArray[i] / timeArray[i - 1]);
                if (i < timeArray.length - 1) {
                    System.out.print(",");
                }
            }
            System.out.println();
        }

        public static long computeTime(int length, String className, Object... arguments) {
            int[] numbers = getRandomIntegerArray(length);
            long start = System.currentTimeMillis();
            System.out.print("start: length " + numbers.length + ", method " + className
                    + ", parameters [");
            for (int i = 0; i < arguments.length; i++) {
                System.out.print(arguments[i]);
                if (i < arguments.length - 1) {
                    System.out.print(",");
                }
            }
            System.out.print("], time ");
            invoke(numbers, className, arguments);
            long time = System.currentTimeMillis() - start;
            System.out.println(time + "ms");
            return time;
        }

        public static int[] getRandomIntegerArray(int length) {
            Random random = new Random();
            int[] numbers = new int[length];
            for (int i = 0; i < numbers.length; i++) {
                numbers[i] = random.nextInt(length);
            }
            return numbers;
        }

        // Calls algorithm.<className>.sort(int[], arguments...) through reflection.
        public static void invoke(int[] numbers, String className, Object... arguments) {
            try {
                Class<?> clazz = Class.forName("algorithm." + className);
                Class<?>[] parameterTypes = new Class<?>[arguments.length + 1];
                parameterTypes[0] = int[].class;
                for (int i = 0; i < arguments.length; i++) {
                    parameterTypes[i + 1] = arguments[i].getClass();
                }
                Method method = clazz.getDeclaredMethod("sort", parameterTypes);
                Object[] parameters = new Object[parameterTypes.length];
                parameters[0] = numbers;
                for (int i = 0; i < arguments.length; i++) {
                    parameters[i + 1] = arguments[i];
                }
                method.invoke(null, parameters);
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        }
    }

In the code above, the testAllSortIsCorrect method first verifies the correctness of the three algorithms, that is, whether the array is in ascending order after the sort method has run. Note that because insertion sort is so slow, its largest test size is 4 million, while the largest test size for the two merge sorts is 400 million.

Next, let's look at the results. The following is the output on my Mac Pro, with 16 GB of memory and a 4-core i7. In this configuration the asynchronous depth (asynDepth) defaults to log2(4) = 2.

    --- testing whether InsertSort.class is valid ---
    --- InsertSort tested and valid ---
    --- testing whether MergeParallelSort.class is valid ---
    --- MergeParallelSort tested and valid ---
    --- testing whether MergeSort.class is valid ---
    --- MergeSort tested and valid ---
    start: length 40000, method MergeParallelSort, parameters [], time 6ms
    start: length 400000, method MergeParallelSort, parameters [], time 44ms
    start: length 4000000, method MergeParallelSort, parameters [], time 390ms
    start: length 40000000, method MergeParallelSort, parameters [], time 3872ms
    start: length 400000000, method MergeParallelSort, parameters [], time 47168ms
    MergeParallelSort time increase ratio: 7.3333335,8.863636,9.9282055,12.181818
    start: length 40000, method MergeSort, parameters [], time 7ms
    start: length 400000, method MergeSort, parameters [], time 81ms
    start: length 4000000, method MergeSort, parameters [], time 839ms
    start: length 40000000, method MergeSort, parameters [], time 9517ms
    start: length 400000000, method MergeSort, parameters [], time 104760ms
    MergeSort time increase ratio: 11.571428,10.358025,11.343266,11.00767
    start: length 400, method InsertSort, parameters [], time 0ms
    start: length 4000, method InsertSort, parameters [], time 3ms
    start: length 40000, method InsertSort, parameters [], time 245ms
    start: length 400000, method InsertSort, parameters [], time 23509ms
    start: length 4000000, method InsertSort, parameters [], time 3309180ms
    InsertSort time increase ratio: Infinity,81.666664,95.9551,140.76227

First, you can see that all three algorithms produce correct results. Next, we can compare their performance. In the output, the difference is most obvious and intuitive at a size of 4 million. Parallel merge sort needs only 390 ms to sort 4 million elements, while ordinary merge sort needs 839 ms. As for insertion sort, it is simply unreasonable: it needs more than 3 million ms, about 55 minutes.

Now look at how the running time of the three grows. The two merge sorts follow a similar trend: when the size grows 10 times, the time also grows roughly 10 times. Insertion sort grows at almost 100 times per step, which is exactly the square of the growth factor of the array size. The Infinity appears because the millisecond-level timing is 0 ms when the array size is 400, so the divisor is 0 and the result is Infinity.

Of course, the results vary from run to run. You can repeat the experiment several times on your own computer, but be warned that the running time of insertion sort is truly painful.
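The Infinity in the ratio output comes from dividing a float by a timing of 0 ms. A small sketch, under the assumption that finer-grained timing is acceptable, shows the effect and one way around it with System.nanoTime; the class name TimingRatioSketch and its helper methods are hypothetical and not part of the test code above.

```java
public class TimingRatioSketch {

    // Same expression as in testComputeTime: float division, so a zero divisor
    // yields Infinity instead of throwing an ArithmeticException.
    static float ratio(long current, long previous) {
        return (float) current / previous;
    }

    // Measures a task in nanoseconds, so very short runs no longer round to 0.
    static long elapsedNanos(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        System.out.println(ratio(3, 0)); // prints Infinity

        int[] numbers = new int[400];
        for (int i = 0; i < numbers.length; i++) {
            numbers[i] = numbers.length - i;
        }
        long nanos = elapsedNanos(() -> java.util.Arrays.sort(numbers));
        System.out.println("sorting 400 elements took " + nanos + " ns");
    }
}
```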
