Front-end high-performance computing: a shared write-up

Source: Internet
Author: User
Tags: emcc, emscripten, Intel Core i7
Front-end high-performance computing, part one: Web Workers

What is Web Workers?

Simply put, Web Workers is a new HTML5 API. Web developers can use it to run a script in the background without blocking the UI, which makes it suitable for work that requires heavy computation, and it takes full advantage of multi-core CPUs.

Most browsers now support Web Workers.
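
For reference, here is a minimal sketch of the raw Web Workers API, just to show its shape before we move on to Parallel.js; the file names main.js and worker.js are placeholders of mine, not from the original project.

  // main.js: create a worker, hand it a message, and receive the result asynchronously
  const worker = new Worker('worker.js');
  worker.postMessage({ start: 1, end: 100000000 });
  worker.onmessage = (e) => {
    console.log('result from worker:', e.data); // the UI thread is never blocked
  };

  // worker.js: do the heavy computation off the main thread
  self.onmessage = (e) => {
    let total = 0;
    for (let i = e.data.start; i <= e.data.end; i += 1) {
      total += i;
    }
    self.postMessage(total);
  };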

Parallel.js

Using the Web Workers interface directly is still fairly cumbersome; fortunately someone has already wrapped it: Parallel.js.

Parallel.js can be installed via npm:

$ npm install paralleljs

However, that package is meant for Node.js and uses Node's cluster module. To use Parallel.js in the browser, include the script directly:

<script src= "Parallel.js" ></script>

You then get a global variable, Parallel. It provides two functional-style interfaces, map and reduce, which make concurrent operations very convenient.
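
As a warm-up, here is a minimal usage sketch of that map/reduce style (the numbers are arbitrary, not the real workload); the real problem follows below.

  // Square each element in a worker, then fold the partial results pairwise
  const p = new Parallel([1, 2, 3, 4]);
  p.map(n => n * n)
    .reduce(pair => pair[0] + pair[1]) // reduce receives pairs: [accumulated, next]
    .then(result => console.log('sum of squares:', result)); // 30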

Let us first define the problem. The real business case is fairly complex, so I simplify it here: add up 1 to 100,000,000 and then subtract 1 to 100,000,000 again, so the answer is obviously 0. (If we only summed, the number would be so large that floating-point precision issues would make the serial and parallel results differ slightly, which could make the parallel approach look unreliable; adding and then subtracting avoids that.) Running the plain JS version directly in Chrome 61 on my Mac Pro takes about 1.5s (our actual business problem takes about 15s; I simplified it so the browser doesn't kill the page while you test).

const N = 100000000; // 100 million

// Update 2017-10-24 16:47:00
// The code is meaningless; it purely simulates a time-consuming calculation. Just using
//   for (let i = start; i <= end; i += 1) total += i;
// has a couple of problems: it is so simple that the later C optimization looks absurdly
// exaggerated and can't be compared fairly, and the number overflows. Rather than handle
// overflow, the code below adds everything first and then subtracts it again, so the
// answer is obviously 0 and easy to verify.

function sum(start, end) {
  let total = 0;
  for (let i = start; i <= end; i += 1) {
    if (i % 2 === 0 || i % 3 === 1) {
      total += i;
    } else if (i % 5 === 0 || i % 7 === 1) {
      total += i / 2;
    }
  }
  for (let i = start; i <= end; i += 1) {
    if (i % 2 === 0 || i % 3 === 1) {
      total -= i;
    } else if (i % 5 === 0 || i % 7 === 1) {
      total -= i / 2;
    }
  }
  return total;
}

function paraSum(N) {
  // Split the work into 10 chunks. This does not mean 10 workers: parallel.js creates an
  // appropriate number of workers based on the machine's CPU core count.
  const N1 = N / 10;
  let p = new Parallel([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]).require(sum);
  return p
    .map(n => sum((n - 1) * 10000000 + 1, n * 10000000)) // parallel.js cannot use the outer variable N1 here
    .reduce(data => {
      const acc = data[0];
      const e = data[1];
      return acc + e;
    });
}

export { N, sum, paraSum };

The code is simple; let me mention a few pitfalls I ran into the first time I used it.

Require every function that is used

For example, the code above uses sum, so you need to require(sum) beforehand. If sum uses another function f, you also need require(f); if f in turn uses g, you need require(g) as well, and so on until every function involved has been required...
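
A small sketch of this pitfall (work, f and g are made-up names of mine):

  // Every helper used inside the mapped function must be required, transitively
  function g(x) { return x * 2; }
  function f(x) { return g(x) + 1; }
  function work(n) { return f(n); }

  const p = new Parallel([1, 2, 3]).require(work, f, g); // all three, not just work
  p.map(n => work(n)).then(results => console.log(results));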

Can't require variables

In the code above I defined N1, but it cannot be used inside the mapped function.
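
If I read the Parallel.js documentation correctly, its env option is one possible way around this: plain data passed there shows up on global.env inside the workers. Treat the following as an unverified sketch and check the docs before relying on it.

  // Hypothetical workaround: pass the chunk size via env instead of closing over N1
  const p = new Parallel([1, 2, 3], { env: { chunk: 10000000 } }).require(sum);
  p.map(function (n) {
    return sum((n - 1) * global.env.chunk + 1, n * global.env.chunk); // global.env is provided by parallel.js
  }).reduce(pair => pair[0] + pair[1]);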

Problems after compiling ES6 to ES5, with no error in Chrome

At the beginning of the real project we used an ES6 feature: array destructuring. It is a simple feature that most browsers already support natively, but my Babel configuration at the time compiled it down to ES5, which generates a helper called _slicedToArray (you can try it in the online Babel REPL). The result simply never worked in Chrome, with no error message at all. After searching for a long time, I opened it in Firefox and finally got an error message:

ReferenceError: _slicedToArray is not defined

It seems that Chrome is not omnipotent ...

You can try it on the demo page; the speedup is about 4x, though of course it depends on how many CPU cores your machine has. Also, when I later tested the same code on the same computer in Firefox 55.0.3 (64-bit), it took only about 190ms!!! Safari 9.1.1 was also around 190ms...

References

    • https://developer.mozilla.org/zh-CN/docs/Web/API/Web_Workers_API/Using_web_workers

    • https://www.html5rocks.com/en/tutorials/workers/basics/

    • https://parallel.js.org/

    • https://johnresig.com/blog/web-workers/

    • http://javascript.ruanyifeng.com/htmlapi/webworker.html

    • http://blog.teamtreehouse.com/using-web-workers-to-speed-up-your-javascript-applications

Front-end high-performance computing, part two: asm.js & WebAssembly

We mentioned two approaches to high-performance computing on the front end: one is concurrency with Web Workers, the other is using a lower-level, statically typed language.

Back in 2012, Mozilla engineer Alon Zakai had an idea while studying the LLVM compiler: could C/C++ be compiled to JavaScript and get as close as possible to native speed? So he developed the Emscripten compiler, which compiles C/C++ code to asm.js, a subset of JavaScript, with performance around 50% of native code. You can take a look at this PPT.
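
To give a flavour of what asm.js looks like, here is a tiny hand-written module sketch of my own (Emscripten generates far larger versions of this automatically):

  // A minimal 'use asm' module: types are declared through coercions such as `| 0` (int).
  // A browser without special asm.js support simply runs this as ordinary JavaScript.
  function AsmAddModule(stdlib, foreign, heap) {
    'use asm';
    function add(a, b) {
      a = a | 0;
      b = b | 0;
      return (a + b) | 0;
    }
    return { add: add };
  }

  const asmAdd = AsmAddModule(window, null, new ArrayBuffer(0x10000)).add;
  console.log(asmAdd(2, 3)); // 5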

Google then developed Portable Native Client (PNaCl), a technology that lets the browser run C/C++ code, but it apparently never caught on. Later, Google, Microsoft, Mozilla, Apple and several other large companies got together to develop a universal binary and text format for the web: WebAssembly. Its official site describes it as follows:

"WebAssembly or wasm is a new portable, size- and load-time-efficient format suitable for compilation to the web."

So WebAssembly looks like a project with a promising future. As for current browser support, the major engines (Chrome, Firefox, Edge and Safari) have all started shipping it.
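
Besides support tables, you can also detect WebAssembly at runtime with a standard feature check like this:

  // Detect WebAssembly support and fall back to asm.js / plain JS otherwise
  if (typeof WebAssembly === 'object' && typeof WebAssembly.instantiate === 'function') {
    console.log('WebAssembly is supported in this browser');
  } else {
    console.log('No WebAssembly; fall back to asm.js or plain JS');
  }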

Installing Emscripten

Visit https://kripken.github.io/emscripten-site/docs/getting_started/downloads.html

1. Download the SDK for the corresponding platform version

2. Get the latest version of the tools via emsdk

# Fetch the latest registry of available tools.
./emsdk update

# Download and install the latest SDK tools.
./emsdk install latest

# Make the "latest" SDK "active" for the current user. (writes ~/.emscripten file)
./emsdk activate latest

# Activate PATH and other environment variables in the current terminal
source ./emsdk_env.sh

3. Add the following directories to your PATH environment variable

~/emsdk-portable
~/emsdk-portable/clang/fastcomp/build_incoming_64/bin
~/emsdk-portable/emscripten/incoming

4. Other

During execution I hit an error saying the LLVM version was not correct; configuring the LLVM_ROOT variable as described in the documentation fixed it. If you don't run into this problem you can ignore it.

LLVM_ROOT = os.path.expanduser(os.getenv('LLVM', '/home/ubuntu/a-path/emscripten-fastcomp/build/bin'))

5. Verify that the installation works

Run emcc -v; if the installation succeeded, output like the following appears:

emcc (Emscripten gcc/clang-like replacement + linker emulating GNU ld) 1.37.21
clang version 4.0.0 (https://github.com/kripken/emscripten-fastcomp-clang.git 974b55fd84ca447c4297fc3b00cefb6394571d18) (https://github.com/kripken/emscripten-fastcomp.git 9e4ee9a67c3b67239bd1438e31263e2e86653db5) (emscripten 1.37.21 : 1.37.21)
Target: x86_64-apple-darwin15.5.0
Thread model: posix
InstalledDir: /Users/magicly/emsdk-portable/clang/fastcomp/build_incoming_64/bin
INFO:root:(Emscripten: Running sanity checks)

Hello, WebAssembly!

Create a file hello.c:

#include <stdio.h>

int main() {
  printf("Hello, webassembly!\n");
  return 0;
}

Compile the C/C++ code:

emcc hello.c

The above command generates an a.out.js file, which we can run directly with Node.js:

node a.out.js

Output:

Hello, webassembly!

To make the code run in a web page, execute the following command, which generates two files, hello.html and hello.js; hello.js is identical to a.out.js.

emcc hello.c -o hello.html

➜ webasm-study md5 a.out.js
MD5 (a.out.js) = d7397f44f817526a4d0f94bc85e46429
➜ webasm-study md5 hello.js
MD5 (hello.js) = d7397f44f817526a4d0f94bc85e46429

Then open hello.html in the browser and you can see the page.

The code generated so far is asm.js; after all, Emscripten was originally written by Alon Zakai to generate asm.js, so asm.js being the default output is no surprise. Of course, you can generate wasm with an option, which produces three files: hello-wasm.html, hello-wasm.js and hello-wasm.wasm.

emcc hello.c -s WASM=1 -o hello-wasm.html

Opening hello-wasm.html in the browser then gives an error: TypeError: Failed to fetch. The reason is that the .wasm file is loaded asynchronously via XHR, which fails under the file:// protocol, so we need to start a local server.

npm install -g serve
serve .

Then visit http://localhost:5000/hello-wasm.html and you'll see the normal result.
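
For reference, the glue code Emscripten generates does roughly the following to load the .wasm file. This is only a simplified sketch: importObject stands in for whatever imports the module actually declares (the real glue JS builds that object for you).

  // Simplified sketch of fetching and instantiating a .wasm module by hand
  const importObject = { /* imports required by the module; empty only works for import-free modules */ };
  fetch('hello-wasm.wasm')
    .then(response => response.arrayBuffer())
    .then(bytes => WebAssembly.instantiate(bytes, importObject))
    .then(({ instance }) => console.log(instance.exports));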

Calling C/C++ functions

In the Hello, WebAssembly example, the main function simply runs on its own. When we use WebAssembly for high-performance computing, the usual practice is to implement the time-consuming calculation as a C/C++ function, compile it to wasm, and expose it for JS to call.

Write the following code in a file add.c:

#include <stdio.h>

int add(int a, int b) {
  return a + b;
}

int main() {
  printf("a + b: %d", add(1, 2));
  return 0;
}

There are two ways to expose the add function to JS.

Exposing APIs via command-line parameters

emcc -s EXPORTED_FUNCTIONS="['_add']" add.c -o add.js

Note that an underscore (_) must be added in front of the function name add. Then we can use it in Node.js like this:

// file node-add.js
const add_module = require('./add.js');
console.log(add_module.ccall('add', 'number', ['number', 'number'], [2, 3]));

Running node node-add.js outputs 5. If you want to use it in a web page, execute:

emcc -s EXPORTED_FUNCTIONS="['_add']" add.c -o add.html

Then add the following code to the generated add.html:

<button onclick= "Nativeadd ()" >click</button>  <script type= ' text/javascript ' >    function Nativeadd () {      const result = Module.ccall (' Add ', ' number ', [' number ', ' number '], [2, 3]);      alert (result);    }  </script>

Click the button and you'll see the result of the call.

Module.ccall calls the C/C++ function directly. A more common scenario is to obtain a wrapped function that can be called repeatedly from JS, which is what Module.cwrap is for; see the documentation for details.

const cAdd = add_module.cwrap('add', 'number', ['number', 'number']);
console.log(cAdd(2, 3));
console.log(cAdd(2, 4));

Adding EMSCRIPTEN_KEEPALIVE when defining the function

Create a file add2.c:

#include <stdio.h>
#include <emscripten.h>

int EMSCRIPTEN_KEEPALIVE add(int a, int b) {
  return a + b;
}

int main() {
  printf("a + b: %d", add(1, 2));
  return 0;
}

Run the command:

emcc add2.c -o add2.html

Add the same code to add2.html:

<button onclick= "Nativeadd ()" >click</button>  <script type= ' text/javascript ' >    function Nativeadd () {      const result = Module.ccall (' Add ', ' number ', [' number ', ' number '], [2, 3]);      alert (result);    }  </script>

However, clicking the button now raises an error:

Assertion failed: the runtime was exited (use NO_EXIT_RUNTIME to keep it alive after main() exits)

You can fix this by calling emscripten_exit_with_live_runtime() in main():

#include <stdio.h>
#include <emscripten.h>

int EMSCRIPTEN_KEEPALIVE add(int a, int b) {
  return a + b;
}

int main() {
  printf("a + b: %d", add(1, 2));
  emscripten_exit_with_live_runtime();
  return 0;
}

Alternatively, you can add -s NO_EXIT_RUNTIME=1 directly on the command line:

emcc add2.c -o add2.js -s NO_EXIT_RUNTIME=1

But this prints a warning:

exit(0) implicitly called by end of main(), but noExitRuntime is set, so not exiting the runtime (you can use emscripten_force_exit if you want to force a true shutdown)

Therefore, the first method is recommended.

The code generated above is asm.js; you can generate wasm by adding -s WASM=1 to the compile flags and then use it in exactly the same way.

Running a time-consuming calculation with asm.js and WebAssembly

With all the preparation done, let's try to optimize the problem from the previous section with C code. The code is simple:

// file sum.c
#include <stdio.h>
// #include <emscripten.h>

long sum(long start, long end) {
  long total = 0;
  for (long i = start; i <= end; i += 1) {
    total += i;
  }
  for (long i = start; i <= end; i += 1) {
    total -= i;
  }
  return total;
}

int main() {
  printf("sum(0, 1000000000): %ld", sum(0, 1000000000));
  // emscripten_exit_with_live_runtime();
  return 0;
}

Note that when compiling with gcc you need to comment out the two Emscripten-related lines, otherwise it won't compile. Let's first compile native code directly with gcc and see how fast it runs.

➜ webasm-study gcc sum.c
➜ webasm-study time ./a.out
sum(0, 1000000000): 0
./a.out  5.70s user 0.02s system 99% cpu 5.746 total
➜ webasm-study gcc -O1 sum.c
➜ webasm-study time ./a.out
sum(0, 1000000000): 0
./a.out  0.00s user 0.00s system 64% cpu 0.003 total
➜ webasm-study gcc -O2 sum.c
➜ webasm-study time ./a.out
sum(0, 1000000000): 0
./a.out  0.00s user 0.00s system 64% cpu 0.003 total

You can see the difference with and without optimization is enormous: the optimized code runs in 3ms! Really? Think about it: each for loop runs 1 billion times, each iteration is roughly two additions, two assignments and one comparison, and there are two loops, so that's at least 10 billion operations. My Mac Pro has a 2.5 GHz Intel Core i7, so in 1s it should execute on the order of 2.5 billion instructions; it cannot possibly be that fast, so something must be wrong. I remembered a Rust benchmarking article I read earlier which said Rust computed the answer at compile time and wrote the result directly into the compiled binary; I don't know whether gcc does something similar. The Zhihu question on the principles behind gcc's -O1/-O2/-O3 optimizations does mention loop-invariant code motion (LICM), so I added some if branches to the code, hoping to "fool" gcc's optimizer.

#include <stdio.h>
// #include <emscripten.h>

// long EMSCRIPTEN_KEEPALIVE sum(long start, long end) {
long sum(long start, long end) {
  long total = 0;
  for (long i = start; i <= end; i += 1) {
    if (i % 2 == 0 || i % 3 == 1) {
      total += i;
    } else if (i % 5 == 0 || i % 7 == 1) {
      total += i / 2;
    }
  }
  for (long i = start; i <= end; i += 1) {
    if (i % 2 == 0 || i % 3 == 1) {
      total -= i;
    } else if (i % 5 == 0 || i % 7 == 1) {
      total -= i / 2;
    }
  }
  return total;
}

int main() {
  printf("sum(0, 1000000000): %ld", sum(0, 100000000));
  // emscripten_exit_with_live_runtime();
  return 0;
}

Now the results look a bit more normal.

➜ webasm-study gcc -O2 sum.c
➜ webasm-study time ./a.out
sum(0, 1000000000): 0
./a.out  0.32s user 0.00s system 99% cpu 0.324 total

OK, let's compile it into asm.js.

#include <stdio.h>
#include <emscripten.h>

long EMSCRIPTEN_KEEPALIVE sum(long start, long end) {
// long sum(long start, long end) {
  long total = 0;
  for (long i = start; i <= end; i += 1) {
    if (i % 2 == 0 || i % 3 == 1) {
      total += i;
    } else if (i % 5 == 0 || i % 7 == 1) {
      total += i / 2;
    }
  }
  for (long i = start; i <= end; i += 1) {
    if (i % 2 == 0 || i % 3 == 1) {
      total -= i;
    } else if (i % 5 == 0 || i % 7 == 1) {
      total -= i / 2;
    }
  }
  return total;
}

int main() {
  printf("sum(0, 1000000000): %ld", sum(0, 100000000));
  emscripten_exit_with_live_runtime();
  return 0;
}

Run:

emcc sum.c -o sum.html

Then add the following code to sum.html:

<button onclick= "Nativesum ()" >NativeSum</button> <button onclick= "Jssumcalc ()" >jssum</button      > <script type= ' text/javascript ' > Function nativesum () {t1 = Date.now ();      Const RESULT = Module.ccall (' Sum ', ' number ', [' number ', ' number '], [0, 100000000]);      T2 = Date.now ();    Console.log (' Result: ${result}, Cost time: ${t2-t1} ');      } </script> <script type= ' text/javascript ' > Function jssum (start, end) {Let total = 0;        for (Let i = start; I <= end; i + = 1) {if (i% 2 = = 0 | | I% 3 = = 1) {Total + = i;        } else if (i% 5 = = 0 | | I% 7 = = 1) {total + = I/2;        }} for (Let i = start; I <= end; i + = 1) {if (i% 2 = = 0 | | I% 3 = = 1) {total = i;        } else if (i% 5 = = 0 | | I% 7 = = 1) {total = I/2;    }} return total;      } function Jssumcalc () {Const N = 100000000;//Total number of times 100 million T1 = Date.now (); Result= Jssum (0, N);      T2 = Date.now ();    Console.log (' Result: ${result}, Cost time: ${t2-t1} '); } </script>

We also compile it to WebAssembly to compare:

emcc sum.c -o sum.js -s WASM=1

Browser      WebAssembly     asm.js   JS
Chrome 61    1300ms          600ms    3300ms
Firefox 55   600ms           800ms    700ms
Safari 9.1   not supported   2800ms   not tested (it doesn't support ES6 and I didn't bother rewriting)

Firefox seems almost unreasonable; its plain JS is just too fast. And WebAssembly doesn't look particularly impressive here. Then I suddenly realized I hadn't passed the -O2 optimization flag to emcc. One more round:

emcc -O2 sum.c -o sum.js            # for asm.js
emcc -O2 sum.c -o sum.js -s WASM=1  # for webassembly

Browser      WebAssembly -O2   asm.js -O2   JS
Chrome 61    1300ms            600ms        3300ms
Firefox 55   650ms             630ms        700ms

Basically no change; disappointing. The claim that asm.js can reach about 50% of native speed does seem to hold here. But this year's talk Compiling for the Web with WebAssembly (Google I/O '17) said WebAssembly is about 1.2x slower than native code, which doesn't match what I measured. Another advantage of asm.js is that it is still just JS, so even a browser without special support can run it as ordinary JS, only without the speedup. And since WebAssembly is unanimously backed by the major vendors as a new standard, it will surely improve; I look forward to better performance.


AI has been an absolute hot topic over the last two years, and one of the most important reasons for its revival is the increase in computing power provided by GPUs. Nvidia's share price soared several times over last year, and the better GPUs on the market were often out of stock because they were bought up for deep learning and Bitcoin mining.

That is the write-up on front-end high-performance computing; I hope it helps.
