Strengths and weaknesses
The advantage of using Parallel.Invoke is that it makes it easy to run many methods in parallel without worrying about tasks or threading problems. However, it is not suitable for every scenario. Parallel.Invoke has several trade-offs:
If you use it to launch methods with very different execution times, it does not return until the longest one has finished. This can leave many logical cores idle for a long time. Therefore, it is important to measure execution speed and logical core usage when using this method.
It limits parallel scalability because it invokes a fixed number of delegates. In the previous example, if you ran the code on a computer with 16 logical cores, it would launch only four methods in parallel, so 12 logical cores would remain idle.
Each call to this method adds some overhead before the potentially parallel methods start running.
As with any parallel code, dependencies and hard-to-control interactions between the different methods can lead to concurrency bugs that are difficult to detect and to unexpected side effects. However, this trade-off applies to all concurrent code; it is not specific to Parallel.Invoke.
There is no guarantee about the order in which the methods run, so Parallel.Invoke is not suitable for complex algorithms that require a specific execution plan.
Any of the delegates, each launched with a different parallel execution plan, can throw an exception, so catching and handling these exceptions is more complex than in traditional sequential code.
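The point about exceptions can be illustrated with a minimal sketch. It assumes two hypothetical delegates that always fail. The class and method names are invented for illustration and are not part of the book's listings. Parallel.Invoke reports failures from its delegates wrapped in an AggregateException, so the caller inspects InnerExceptions instead of catching a single exception type.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class InvokeExceptionSketch
{
    // Hypothetical helper: runs two failing delegates in parallel and
    // returns the messages of the exceptions that were collected.
    public static string[] RunAndCollectErrors()
    {
        try
        {
            Parallel.Invoke(
                () => throw new InvalidOperationException("first failed"),
                () => throw new ArgumentException("second failed"));
            return Array.Empty<string>();
        }
        catch (AggregateException ae)
        {
            // Flatten unwraps nested AggregateException instances, if any
            return ae.Flatten().InnerExceptions
                     .Select(e => e.Message)
                     .OrderBy(m => m)
                     .ToArray();
        }
    }
}
```

A caller could log every returned message or rethrow the first one, depending on how the sequential version of the code handled errors.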
Interleaved concurrency and concurrency
As you can see in the preceding example and in Figure 2-5, interleaved concurrency and concurrency are different things.
Interleaved concurrency means that different parts of the code can start, run, and finish in overlapping time periods. Interleaved concurrency can happen even on a computer with a single logical core. When many pieces of code run on a computer with a single logical core, time-slicing and fast context switching provide the illusion of parallel execution. However, on this hardware, running the interleaved code takes longer than running each piece of code on its own, because the concurrent code competes for hardware resources. You can think of interleaved concurrency as multiple trucks sharing the same lane. This is why interleaved concurrency is also described as a kind of virtual parallelism.
Concurrency means that different pieces of code run at the same time, taking full advantage of the parallel processing capabilities of the underlying hardware. True concurrency cannot happen on a computer that has only one logical core; you need at least two logical cores to run code in parallel. When real concurrency occurs, execution speed can improve, because running code in parallel reduces the time required to complete a particular algorithm. Figure 2-5 shows the following possible concurrency scenarios:
Concurrency, ideal parallelism on four logical cores: in this ideal scenario, the instructions of each of the four methods run on a different logical core.
Interleaved concurrency combined with concurrency, imperfect parallelism in which the four methods can take advantage of only two logical cores: sometimes the instructions of the four methods run in parallel on different logical cores, and sometimes they must wait for their time slice. In this case, interleaved concurrency is combined with true concurrency. This is the most common situation, because perfect parallelism is really difficult to achieve, even on real-time operating systems.
Figure 2-5
Transforming sequential code into parallel code
In the past decade, most C# code was written to run sequentially and synchronously. Therefore, many algorithms were designed with neither concurrency nor parallelism in mind. Most of the time, it is hard to find an algorithm that can be fully converted into completely parallel and perfectly scalable code. It can happen, but it is not the most common scenario.
When you have sequential code and want to take advantage of potential parallelism to speed up execution, you have to find hotspots that can be parallelized. You can then convert them to parallel code, measure the speedup, identify the potential scalability, and make sure that you have not introduced new bugs while converting the existing sequential code to parallel code.
Detecting parallel hotspots
Listing 2-3 shows an example of a simple console application that runs the following two methods sequentially.
GenerateAESKeys: This method runs a for loop that generates the number of AES keys specified by the NUM_AES_KEYS constant. It uses the GenerateKey method provided by the System.Security.Cryptography.AesManaged class. Once a key is generated, its byte array is converted to a hexadecimal string, and the result is stored in the local variable hexString.
GenerateMD5Hashes: This method runs a for loop that uses the MD5 algorithm to generate the number of hashes specified by the NUM_MD5_HASHES constant. It builds its input from the user name and calls the ComputeHash method provided by the System.Security.Cryptography.MD5 class. Once a hash is generated, its byte array is converted to a hexadecimal string, and the result is stored in the local variable hexString.
LISTING 2-3: Simple serial AES keys and MD5 hash generators
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
// Added for the Stopwatch
using System.Diagnostics;
// Added for the cryptography classes
using System.Security.Cryptography;
// This namespace will be used later to run code in parallel
using System.Threading.Tasks;

namespace Listing2_3
{
    class Program
    {
        private const int NUM_AES_KEYS = 800000;
        private const int NUM_MD5_HASHES = 100000;

        private static string ConvertToHexString(byte[] byteArray)
        {
            // Convert the byte array to a hexadecimal string
            var sb = new StringBuilder(byteArray.Length);
            for (int i = 0; i < byteArray.Length; i++)
            {
                sb.Append(byteArray[i].ToString("X2"));
            }
            return sb.ToString();
        }

        private static void GenerateAESKeys()
        {
            var sw = Stopwatch.StartNew();
            var aesM = new AesManaged();
            for (int i = 1; i <= NUM_AES_KEYS; i++)
            {
                aesM.GenerateKey();
                byte[] result = aesM.Key;
                string hexString = ConvertToHexString(result);
                // Console output is commented out so it does not distort the timings
                // Console.WriteLine("AES KEY: {0}", hexString);
            }
            Debug.WriteLine("AES: " + sw.Elapsed.ToString());
        }

        private static void GenerateMD5Hashes()
        {
            var sw = Stopwatch.StartNew();
            var md5M = MD5.Create();
            for (int i = 1; i <= NUM_MD5_HASHES; i++)
            {
                byte[] data =
                    Encoding.Unicode.GetBytes(
                        Environment.UserName + i.ToString());
                byte[] result = md5M.ComputeHash(data);
                string hexString = ConvertToHexString(result);
                // Console output is commented out so it does not distort the timings
                // Console.WriteLine("MD5 HASH: {0}", hexString);
            }
            Debug.WriteLine("MD5: " + sw.Elapsed.ToString());
        }

        static void Main(string[] args)
        {
            var sw = Stopwatch.StartNew();
            GenerateAESKeys();
            GenerateMD5Hashes();
            Debug.WriteLine(sw.Elapsed.ToString());
            // Display the results and wait for the user to press a key
            Console.ReadLine();
        }
    }
}
The for loop in GenerateAESKeys does not use its control variable i in its body, because the variable only controls the number of times a random AES key is generated. However, in GenerateMD5Hashes, the control variable i is appended to the computer user's name. This string is then used as the input data for the method that computes the hash, as shown in the following code:
for (int i = 1; i <= NUM_MD5_HASHES; i++)
{
    byte[] data = Encoding.Unicode.GetBytes(Environment.UserName + i.ToString());
    byte[] result = md5M.ComputeHash(data);
    string hexString = ConvertToHexString(result);
    // Console.WriteLine("MD5 HASH: {0}", hexString);
}
The lines of code in Listing 2-3 that use Stopwatch measure the total time each method takes to run. Each method starts a new Stopwatch by calling StartNew at its beginning and finally writes the elapsed time to the debug output.
Listing 2-3 also shows the statements that write the generated keys and hashes to the console commented out. Sending strings to the console creates a performance bottleneck that would distort the accuracy of the timings.
Figure 2-6 shows the sequential execution flow of this application and the time each of the two methods takes on a computer with a dual-core microprocessor.
The two methods need nearly 14 seconds to run. The first method takes 8 seconds and the second takes 6 seconds. Of course, these times vary considerably with the underlying hardware configuration. There is no interaction between the two methods, so they are completely independent of each other. Running them one after the other does not take advantage of the parallel processing power offered by the additional core. Therefore, these two methods are a parallelizable hotspot, where parallelism can achieve a significant speedup over sequential execution. For example, you can use Parallel.Invoke to run the two methods in parallel.
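The timing pattern described above can be factored into a small helper. This is only a sketch: the TimingSketch class and its Measure method are hypothetical names and do not appear in Listing 2-3. It uses Stopwatch.StartNew and Elapsed in the same way as the listing.

```csharp
using System;
using System.Diagnostics;

public static class TimingSketch
{
    // Hypothetical helper: runs the given action once and returns
    // the elapsed wall-clock time measured by a Stopwatch.
    public static TimeSpan Measure(Action action)
    {
        var sw = Stopwatch.StartNew();
        action();
        sw.Stop();
        return sw.Elapsed;
    }
}
```

For example, Debug.WriteLine("AES: " + TimingSketch.Measure(GenerateAESKeys)) would report the same kind of value that the listing writes at the end of each method, assuming GenerateAESKeys no longer contains its own Stopwatch.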
Figure 2-6
Measuring the speedup achieved by parallel execution
Replace the Main method in Listing 2-3 with the following new version, which uses Parallel.Invoke to launch the two methods in parallel.
static void Main(string[] args)
{
    var sw = Stopwatch.StartNew();
    // Launch both methods in parallel and wait until both have finished
    Parallel.Invoke(
        () => GenerateAESKeys(),
        () => GenerateMD5Hashes());
    Debug.WriteLine(sw.Elapsed.ToString());
    // Display the results and wait for the user to press a key
    Console.WriteLine("Finished!");
    Console.ReadLine();
}
Figure 2-7 shows the parallel execution flow of the new version of the application and the time each of the two methods takes on a computer with a dual-core microprocessor.
Now the two methods need nearly 9 seconds to run, because they take advantage of the two cores offered by the microprocessor. You can use the following formula to calculate the speedup achieved:
Speedup = (Serial execution time)/(Parallel execution time)
14/9 = 1.56x
Figure 2-7
As you can see, GenerateAESKeys takes longer than GenerateMD5Hashes: 9 seconds versus 6 seconds. However, Parallel.Invoke does not continue with the next line of code until all the delegates have finished. Therefore, during the final 3 seconds, the application does not take advantage of both cores. This is a load-imbalance problem, shown in Figure 2-8.
Figure 2-8
If the application runs on a computer with a quad-core microprocessor, its speedup is almost the same, because it cannot schedule work on the two additional cores offered by the underlying hardware.
In this example, you detected a parallelizable hotspot and added some code to measure the time spent running particular methods. Then you changed just a few lines of code and achieved an interesting speedup. To achieve better results and improved scalability as the number of available cores increases, you now need to learn the TPL structures for imperative data parallelism.
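The arithmetic behind the speedup and the load imbalance can be sketched as follows. It assumes a simplified model in which every method runs on its own core for its whole duration, so the core count is at least the number of methods. The class and method names are hypothetical.

```csharp
using System;
using System.Linq;

public static class SpeedupSketch
{
    // Speedup = (serial execution time) / (parallel execution time)
    public static double Speedup(double serialSeconds, double parallelSeconds)
    {
        return serialSeconds / parallelSeconds;
    }

    // Simplified model (an assumption): each method occupies one core,
    // and the call returns only when the longest method has finished.
    // Returns the total core-seconds spent idle during that time.
    public static double IdleCoreSeconds(double[] methodSeconds, int cores)
    {
        if (cores < methodSeconds.Length)
            throw new ArgumentException("Model assumes one core per method.");
        double wallTime = methodSeconds.Max();
        double busy = methodSeconds.Sum();
        return cores * wallTime - busy;
    }
}
```

With the timings from the text, Speedup(14, 9) is about 1.56. IdleCoreSeconds for methods of 9 and 6 seconds gives 3 on two cores and 21 on four cores, which shows why adding cores does not help a fixed pair of delegates.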
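As a preview of what a data-parallel structure might look like, the following sketch uses Parallel.For to split the MD5 hash generation across however many cores are available. It is not taken from the listings: the class name, the method name, and the choice to create one MD5 instance per iteration are assumptions made here for thread safety and simplicity, not for best performance.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;
using System.Threading.Tasks;

public static class ParallelForSketch
{
    // Hypothetical helper: computes `count` MD5 hashes of userName + index,
    // letting Parallel.For distribute the iterations across cores.
    public static string[] GenerateMD5HashesInParallel(string userName, int count)
    {
        var hashes = new string[count];
        Parallel.For(0, count, i =>
        {
            // One MD5 instance per iteration, because a shared instance
            // is not safe to use from several threads at once.
            using (var md5 = MD5.Create())
            {
                byte[] data = Encoding.Unicode.GetBytes(userName + (i + 1).ToString());
                byte[] result = md5.ComputeHash(data);
                // Each iteration writes to its own slot, so no locking is needed
                hashes[i] = BitConverter.ToString(result).Replace("-", "");
            }
        });
        return hashes;
    }
}
```

Unlike a fixed list of delegates, the number of iterations that can run at the same time here depends on the hardware, which is the scalability property the text refers to.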
Understanding parallel execution
Next, uncomment the lines in GenerateMD5Hashes and GenerateAESKeys that write output to the console:
Console.WriteLine("AES KEY: {0}", hexString);
Console.WriteLine("MD5 HASH: {0}", hexString);
Writing output to the console creates a performance bottleneck for parallel execution. However, this time there is no need to measure accurate timings. Instead, you can watch the output of the two methods as they run in parallel. Listing 2-4 shows an example of the console output generated by this application. The highlighted short hexadecimal strings in the listing are MD5 hashes, and the other hexadecimal strings are AES keys. Each AES key takes less time to generate than each MD5 hash. Remember that the code generates 800,000 AES keys and 100,000 MD5 hashes.
Listing 2-4
Now, comment out again the lines in those two methods that write the results to the console.
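The mixing of output from two methods that run in parallel can be reproduced with a small sketch. It swaps the console for a thread-safe queue so the interleaving can be inspected afterward. The class, the method, and the "AES"/"MD5" labels are hypothetical stand-ins for the real generators.

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class InterleavingSketch
{
    // Hypothetical helper: two delegates append labeled entries to a shared
    // thread-safe queue, and the queue preserves the order of arrival.
    public static string[] RunBoth(int itemsPerMethod)
    {
        var log = new ConcurrentQueue<string>();
        Parallel.Invoke(
            () => { for (int i = 0; i < itemsPerMethod; i++) log.Enqueue("AES " + i); },
            () => { for (int i = 0; i < itemsPerMethod; i++) log.Enqueue("MD5 " + i); });
        return log.ToArray();
    }
}
```

The order of the returned entries can differ from run to run, and on some runs one method may finish before the other starts. Only the total number of entries from each method is fixed.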
Use Parallel.Invoke to parallelize your code