Understanding single-threaded asynchronous JavaScript (illustrated)

Source: Internet
Author: User
Tags data structures
This article explains, with illustrations, how asynchronous code runs on JavaScript's single thread. It should be a useful reference for anyone who needs it; I hope it helps.

For most developers (especially those with a parallel computing or multithreading background), asynchronous processing in JS is really weird. And as it turns out, the weirdness is caused by one feature of JS: it is "single-threaded".

I tried to explain this in the textbook way, "define first, then expand", and found it extremely painful, because working out the details behind it and generalizing them to a higher point of view takes a lot of background knowledge. By the time I had laid all of that out clearly, I would have forced the reader through several chapters of sleep-inducing books on operating systems and computer networks, which is really dull.

More crucially, by that point the reader's energy would be exhausted, with nothing left for the very first question: why is JS's asynchronous processing so bizarre?

So I decided to go the other way. Let's start from nothing, like a beginner: begin the discussion with a "wrong idea", then use code to find out where that idea contradicts reality.

Then we amend the idea, look at more examples, and see whether anything is still unsatisfying or unclear and can be adjusted again. In this way, like detectives, we begin with a rather incorrect hypothesis, keep looking for evidence, keep revising our assumptions, and pursue them until we reach the complete truth.

I think this way of writing is closer to how a person really learns and researches, and it can teach you more about how to explore a problem. I think this way of thinking matters more than any ordinary piece of knowledge. It lets you become a hunter of knowledge, able to forage independently, instead of a baby waiting to be fed.

OK, let's start our exploration with a piece of JS code.

    console.log('No. 1');
    setTimeout(function () {
        console.log('setTimeout callback');
    }, 5000);
    console.log('No. 2');

The output is:

    No. 1
    No. 2
    setTimeout callback

There is hardly anything complicated in this code; it is all print statements. The only special function is setTimeout. According to a quick look at online references, it accepts two parameters:

    • The first argument is the callback function, the function that is called back once the timer is done.

    • The second is the time parameter, which specifies after how many milliseconds the callback is executed. Here we use 5000 milliseconds, which is 5 seconds.

Another important point is that setTimeout is an asynchronous function, which means the main program does not have to wait for setTimeout to complete: it throws that work "somewhere else" and then carries on. That is, the main program keeps one pace and setTimeout keeps another, which is the "asynchronous" way of running code.

If you have some background in parallel computing or multithreaded programming, the statement above sounds familiar. In a multithreaded environment it simply means starting another thread to run the print statement console.log('setTimeout callback'). The main thread carries on, and the new thread takes care of its print statement. Clear enough.

So, taken together, this code means that when the main thread reaches the setTimeout statement, it hands it over to the "other place" and lets the "other place" wait 5 seconds before running it. Meanwhile the main thread goes on to print "No. 2". Since the other part waits 5 seconds while the main thread prints "No. 2" immediately, the final output shows "No. 2" before "setTimeout callback".

Well, so far so good. Everything looks lovely.

What if we make a small change to the program? For example, can I get the "setTimeout callback" message printed first? In parallel computing, a problem we often run into is that we don't know which of several threads runs faster and which slower, so we cannot determine the final order of execution. Here we made "setTimeout callback" wait 5 seconds. Perhaps that is too long; how about something shorter?

    console.log('No. 1');
    setTimeout(function () {
        console.log('setTimeout callback');
    }, 1);
    console.log('No. 2');

We changed the delay passed to setTimeout to 1 millisecond. After many runs, the result has not changed?! That seems a bit unnatural. Maybe go even smaller? Change it to 0?

    console.log('No. 1');
    setTimeout(function () {
        console.log('setTimeout callback');
    }, 0);
    console.log('No. 2');

After many runs, the result still does not change. This is actually a little strange. With ordinary parallel, multithreaded programs, running several times usually produces a variety of unpredictable results. Here we magically get the same execution order every time. This is unnatural.

But we cannot draw a firm conclusion yet. Could it be that setTimeout simply takes too long to start up, so the "No. 2" statement gets executed first? To check, we can add a loop before the "No. 2" print statement to give setTimeout plenty of time to start.

    console.log('No. 1');
    setTimeout(function () {
        console.log('setTimeout callback');
    }, 0);
    for (let i = 0; i < 10e8; i++) {}
    console.log('No. 2');

Running this code, we find that "No. 1" appears in the browser console right away; then, after waiting a second or so, we get:

    No. 2
    setTimeout callback

Eh?! Isn't that even stranger? Wasn't setTimeout supposed to run immediately after waiting 0 seconds? Even if it is slow to start, surely it should not still fail to show up after a whole second? Besides, before we added this for loop, wasn't "setTimeout callback" displayed almost immediately?

Putting these observations together, we have reason to suspect that "setTimeout callback" must always be displayed after "No. 2"; that is, the setTimeout callback must be executed after console.log('No. 2'). To verify this, we can run a slightly dangerous test and change the for loop into an infinite while loop.

    console.log('No. 1');
    setTimeout(function () {
        console.log('setTimeout callback');
    }, 0);
    while (true) {}  // Dangerous test
    console.log('No. 2');

If the setTimeout callback really ran at its own pace, "setTimeout callback" could be printed at some point. If, as we suspect, "setTimeout callback" must come after "No. 2", then it will never appear in the browser console.

After running it, we find that "setTimeout callback" still has not been printed by the time the browser tab freezes and is close to crashing. This confirms our guess!
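If freezing the browser sounds too risky, the same point can be made with a bounded loop. The following is only a sketch: busyWait and the events array are names made up for this illustration, and the 200 ms figure is arbitrary. It records what has happened by the time the main code finishes, and the callback is not in that record, even though its 0 ms delay expired long before.

```javascript
// Hypothetical helper: keeps the single JS thread busy for `ms` milliseconds.
function busyWait(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // spin on purpose; nothing else can run on this thread meanwhile
  }
}

const events = [];

events.push('No. 1');
setTimeout(function () {
  events.push('setTimeout callback');
  console.log('after the main code:', events);
}, 0);
busyWait(200); // a bounded stand-in for the dangerous infinite loop
events.push('No. 2');

// Snapshot taken while the main code is still running.
const snapshot = events.slice();
console.log('end of main code:', snapshot);
```

The callback only gets its turn once the main code has run to completion, which is exactly what the infinite loop never allows.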

Here, for the first time, our idea and reality contradict each other. According to the usual parallel-computing picture, the setTimeout callback that was thrown "somewhere else" should run concurrently. But the truth is that this "other place" does not run concurrently with the final statement that prints "No. 2". At this point we have to go back to basics, back to how the JS language is implemented underneath, and trace upward from there to dig out what is fishy.

One of the features of JS is that it is "single-threaded": from beginning to end, JS runs on one and the same thread. Perhaps this is the point worth investigating in depth. Think about it: if JS were multithreaded, setTimeout should behave according to our original idea, but in fact it does not. The difference between the two cases is exactly single-threaded versus multithreaded.

Having found this difference, we can dig deeper into some details. The so-called "async" is supposed to open up "somewhere else" and let it run alongside your main line of execution. But if there is only a single thread, there is only one share of computing resources, so how can anything "run at the same time"?

It's like going to a service hall to pay for water, electricity, and gas. We can roughly divide it into a water counter, an electricity counter, and a gas counter. If we first do our business at the water counter, wait until the water bill details are printed and paid, then go to the electricity counter to print the details and pay the fee, and finally go to the gas counter to print the details and pay the fee, that is a synchronous process: you must wait for the previous step to finish before starting the next.

Async means we do not have to waste time waiting at any one step. For example, while the water bill is being printed, we can go to the electricity and gas counters and start the "print electricity details" and "print gas details" tasks ahead of schedule. Then we go back and pay for water, electricity, and gas in turn. This is in fact the example Hua Luogeng used when promoting his overall planning method: boiling the water, preparing the tea, pouring it, and how to order these steps to be efficient.

Obviously, doing tasks asynchronously is more efficient. But there is a premise: the resources that do the work, that is, the people or machines, must number more than one. Take the example above: there are three counters for water, electricity, and gas, but what if there is only one clerk behind all three? You start your water business, and while waiting for it you want to go to the electricity counter. On the surface you hand in an application form at the electricity counter, but you find that no clerk is there to serve you! Why? Because the only clerk is still busy with your water business! At that point, what is the meaning of your so-called "async"?!

Seen this way, when there is only one share of computing resources, doing "async" makes no sense. With only one resource doing the work, even something nominally "asynchronous" ends up, like the multi-counter, single-clerk hall above, completing tasks one after another at the execution level.

So isn't JS's combination of "single-threaded" + "asynchronous" exactly the "meaningless" situation we just described?! Why would anyone go to such lengths to do something that makes no sense?

Well... things are getting more and more interesting.

Generally speaking, when something looks magical and weird, it is basically because we have overlooked some detail, or misunderstood or misinterpreted one. To solve the puzzle, we must keep reviewing the material we already have and, through repeated inspection, find the things we ignored.

Let's look back at the usual sales pitch for JS async. To illustrate why JS async is necessary, people usually cite the conflict between resource loading and page rendering in the browser.

Rendering can be roughly understood as the process of "drawing the picture". For example, if a browser is to display a button or an image on a page, it must perform an action that draws it onto the page. Likewise, for an operating system to show the "desktop" graphical interface on a monitor, it must draw the corresponding picture on the screen. All of these "drawing" processes are called "rendering".

For example, you click a button on the page so that the browser fetches a data report from the backend database and displays the numbers on the page. If JS did not support async, the entire page would freeze at the mouse click and could not complete any subsequent rendering work. Only when the backend returned the data to the front end could the program flow continue.

So here, JS's "async" is really about letting the browser hand the "loading" task to "somewhere else", so that the loading process and the rendering process can proceed side by side.

Wait, this "other place" again?!

Damn. Didn't we just say that JS is single-threaded and has only one share of computing resources? How can you "load on one side and render on the other"?! WTF, are you kidding me?

Damn it, which of these statements is actually true?! Is it true that JS is single-threaded? Or is it true that the browser can "load on one side and render on the other"?

How can we solve this puzzle?! Obviously, we have to go inside the browser and see how it is designed.

With a few searches about browsers and JS, it is not hard to find some basic information. JS is not the whole browser; the browser is in charge of far too many things, and the part in charge of JS is just one browser component, called the JS engine. The most famous one, used in Chrome, is the V8 engine, which is responsible for parsing and running JS.

On the other hand, we also know that a big reason for using JS is that it can freely manipulate DOM elements, execute Ajax asynchronous requests, and schedule asynchronous tasks with setTimeout as in our first example. These are the celebrated features of JS.

Surprisingly, when we look into the V8 engine that runs JS, we find that it does not itself provide DOM manipulation, Ajax, or setTimeout.

According to Alexander Zlatkov, the structure is:

    1. JS Engine

      • Memory Heap

      • Call Stack

    2. Web APIs

      • DOM (Document)

      • Ajax (XMLHttpRequest)

      • Timeout (setTimeout)

    3. Callback Queue

    4. Event Loop

These are clearly hallmark features of JS, so why are they not controlled by the JS engine? Hmm, interesting.

Wait. Wasn't it "single-threaded", with the loading process thrown "somewhere else"?! JS being single-threaded means that JS inside the JS engine is single-threaded and gets only one share of computing resources. But Ajax, the feature that loads data, is not inside the JS engine at all!

Damn! What a sly old fox! We thought that only one of the two claims, "single-threaded" and "load on one side, render on the other", could be right, but it turns out both are! Why? Because the claim was only that JS is single-threaded, not that the browser itself is single-threaded! So the rendering-related JS part and the Ajax part that loads data can proceed at the same time, because they live in two modules, that is, two threads! So of course they can run in parallel! WTF!

Eh, wait. Let's take a closer look at the structure above. Ajax is not in the JS engine, but setTimeout is not in the JS engine either!! If the Web APIs part runs in a different thread from the JS engine, shouldn't the two achieve true parallelism?! Then why, in our opening example, can the message "setTimeout callback" not be printed in parallel and overtake "No. 2"?

Well... interesting. Things are not that simple.

Obviously, we need to look at more details, in particular the order in which each statement is moved around and executed.

When it comes to the execution order of statements, we need to turn our attention back to the JS engine. Looking at the structure above again, the JS engine contains two parts: the memory heap and the call stack. The former is about memory allocation, which we can set aside for now. The latter, the function call stack, is exactly what we need in order to understand the order of execution.

Why is the call stack called a "stack"? Why not a function queue or something else? This can actually be inferred from the order in which functions are executed.

Functions were introduced in the first place for code reuse and modularity. We want a piece of code that recurs to be pulled out on its own, so that a single function call is enough to insert the execution of that code.

So if we run into a function call while executing a piece of code, we expect to execute the body of that function first, and then jump back out to the main program flow and carry on.

If you think of each function as a node, the entire execution flow is a "depth-first" traversal of those nodes: starting from the main function, function calls are followed in depth-first order. From algorithms and data structures we know that a depth-first traversal is implemented either with recursion or with an explicit stack. The latter is undoubtedly the more economical of the two.

So, since function calls are expected to be traversed depth first, and depth-first traversal needs a stack to support it, the structure that keeps track of function calls naturally takes the form of a stack. Hence the name call stack.
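To make the "function calls are a depth-first traversal driven by a stack" idea concrete, here is a small sketch. The call tree (main, a, b, c, d) and the traceCalls helper are invented for this illustration; the code walks the tree with an explicit stack, without recursion, and records every push (a call) and every pop (a return).

```javascript
// Hypothetical call tree: main calls a and d; a calls b and c.
const callTree = { main: ['a', 'd'], a: ['b', 'c'], b: [], c: [], d: [] };

// Walk the call tree with an explicit stack instead of recursion.
function traceCalls(tree, entry) {
  const trace = [];
  const stack = [{ name: entry, next: 0 }]; // `next` = index of the next callee
  trace.push('push ' + entry);

  while (stack.length > 0) {
    const frame = stack[stack.length - 1]; // always work on the top frame
    const callees = tree[frame.name];
    if (frame.next < callees.length) {
      const callee = callees[frame.next];
      frame.next += 1;
      stack.push({ name: callee, next: 0 }); // a call pushes a new frame
      trace.push('push ' + callee);
    } else {
      stack.pop(); // nothing left to call: the function returns
      trace.push('pop ' + frame.name);
    }
  }
  return trace;
}

const trace = traceCalls(callTree, 'main');
console.log(trace.join('\n'));
```

The recorded order is push main, push a, push b, pop b, push c, pop c, pop a, push d, pop d, pop main: the function called last always returns first, which is the behavior of a stack.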

Of course, you can push the thought one step further: why is the part that keeps track of function calls maintained as a stack rather than "recursively"? It is actually very simple. The computer understands nothing, so how would it know about recursing and returning? It only executes instructions rigidly. With no auxiliary structure available, the only thing such rigid execution can manage is a stack, not the more complex concept of recursion.

On the other hand, going back to our very first question, the contradiction actually shows up in the setTimeout callback. The structure above also has a part called the "callback queue". Obviously, that is part of what we need to understand.

Searching with the two key terms "call stack" and "callback queue", it is not hard to find material discussing how these two parts combine with the execution of specific statements.

First, the process as a whole:

    • Ordinary statements are pushed onto the call stack one after another and executed; whatever they call is pushed onto the stack in turn.

    • When a statement involves the Web APIs, the corresponding work is handed over to the Web APIs.

    • The Web APIs side runs independently of the JS engine and processes what it was given in parallel, such as Ajax data loading or a setTimeout wait.

    • When the Web APIs side has finished that work, it puts the associated callback function into the "callback queue".

    • The event loop continuously monitors the call stack and the callback queue. When the call stack is empty, the event loop takes the first callback in the callback queue, pushes it onto the stack, and execution continues.

    • And so on, round and round.
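The steps above can be sketched as a toy model. This is an assumption-laden simulation, not how any real browser is implemented: simulate, setTimer, spin, and the virtual clock are all hypothetical names, and the Web APIs side is reduced to a plain list of timers.

```javascript
// Toy model of call stack + Web APIs timers + callback queue + event loop.
function simulate(script) {
  const output = [];
  const timers = [];        // stands in for the Web APIs timer area
  const callbackQueue = []; // callbacks whose wait is over, in FIFO order
  let now = 0;              // virtual clock, in milliseconds

  const api = {
    log: (message) => output.push(message),
    setTimer: (callback, delay) => timers.push({ callback, due: now + delay }),
    spin: (ms) => { now += ms; }, // pretend the main code was busy for ms
  };

  // Step 1: the main code runs to completion; nothing can interrupt it.
  script(api);

  // Step 2: the stack is empty now, so the event loop takes over.
  while (timers.length > 0 || callbackQueue.length > 0) {
    timers.sort((a, b) => a.due - b.due);
    if (callbackQueue.length === 0 && timers[0].due > now) {
      now = timers[0].due; // nothing is ready: idle until the next timer
    }
    while (timers.length > 0 && timers[0].due <= now) {
      callbackQueue.push(timers.shift().callback); // Web APIs -> queue
    }
    const callback = callbackQueue.shift(); // queue -> call stack
    callback();
  }
  return output;
}

const result = simulate(({ log, setTimer, spin }) => {
  log('No. 1');
  setTimer(() => log('setTimeout callback'), 0);
  spin(1000); // plays the part of the long for loop
  log('No. 2');
});
console.log(result.join(', ')); // No. 1, No. 2, setTimeout callback
```

Even in this crude model the callback cannot overtake "No. 2": the queue is only consulted after the main code has finished, however long that code spins.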

The description above is rather abstract, so let's illustrate it with a concrete example. This example also comes from Alexander Zlatkov. The reason for using it is simple: the illustrations in Zlatkov's blog post are quite clear, and I don't have the spare time right now to draw the corresponding diagrams myself, so I will just borrow his example.

Let's examine the following code snippet:

    console.log('Hi');
    setTimeout(function cb1() {
        console.log('cb1');
    }, 5000);
    console.log('Bye');

Haha, it is really the same code as before, only the printed content differs. Before it runs, nothing has been pushed yet: the call stack, the Web APIs area, and the callback queue are all empty.

Then the first statement, console.log('Hi'), is to be executed, so it is pushed onto the call stack.

The JS engine then executes the statement at the top of the stack, and the browser console prints the message "Hi".

Once the statement has been executed, it is popped off the stack.

Next, the second statement, setTimeout, is pushed.

setTimeout(function cb1() { console.log('cb1'); }, 5000); is executed.

Note that because setTimeout is not part of the JS engine, it is handed straight to the Timeout section of the Web APIs. The setTimeout frame on the stack (highlighted in blue in the original illustration) only serves to pass "timer, wait 5 seconds, callback function cb1" to the Web APIs. The statement can then be popped off the stack.

The next statement, console.log('Bye'), is pushed.

Note that at this point the Web APIs side is doing its work in parallel with the JS engine, namely waiting 5 seconds. Fine, the timer continues its wait, while the stack already holds a statement, which therefore has to be executed.

The browser console displays the message "Bye", and the executed statement is popped off the stack.

At this point the stack is empty. The event loop detects that the stack is empty and would like to push something from the callback queue onto the stack. But the callback queue is empty too, so the event loop just keeps polling.

Meanwhile, the timer on the Web APIs side has been carrying out its own work in parallel: doing nothing for 5 seconds. When the time is up, it puts its callback function cb1() into the callback queue.

The event loop, which has been polling all along, sees that the callback queue now has something in it, promptly takes it out, and pushes it onto the stack.

Now that the stack holds something, the callback function cb1() has to be executed. cb1() calls console.log('cb1'), so that statement is pushed onto the stack as well.

The stack keeps executing. console.log('cb1') is now on top, so it is executed first, and the browser console prints the message "cb1".

Having been executed, the console.log('cb1') statement is popped off the stack.

Execution continues with the rest of cb1(). cb1() has no other statements left, which means it has finished running, so it too is popped off the stack.

The whole process is over! Viewed from beginning to end, the whole sequence is summarized in the original animated GIF (not reproduced here).

It's pretty clear and intuitive, right!

If you want to play further with how JS statements relate to the call stack and the callback queue, I recommend Loupe, an open-source project on GitHub by Philip Roberts. There is an online version where you can try all sorts of things.

With this knowledge, let's look back at the puzzling code from the beginning:

    console.log('No. 1');
    setTimeout(function () {
        console.log('setTimeout callback');
    }, 0);
    console.log('No. 2');

Following the processing order described above, the first statement, console.log('No. 1'), is pushed onto the stack and executed, and then setTimeout is executed.

Based on what we learned above, setTimeout is handed to the Web APIs immediately. However, since the wait time we gave it is 0, its callback function, the one containing console.log('setTimeout callback'), is put into the "callback queue" right away. So the legendary "other place" turns out to be the callback queue.

So can we expect console.log('setTimeout callback') to be printed before "No. 2"?

In fact, that is impossible! Why? Because to be executed, the callback first has to be pushed onto the call stack. But at this point the call stack has not yet finished executing the statements of the main program, namely console.log('No. 2'). The event loop cannot push anything from the callback queue onto the stack while the stack is not empty. Therefore the "setTimeout callback" message is bound to be printed after the "No. 2" message!

This also matches the result of the infinite while loop we added earlier. Because the main program keeps looping in the while, the stack is never empty, and the callback that prints "setTimeout callback" has even less chance of being pushed onto the stack.

Having explored this far, the puzzle seems to be solved, as if we could put down the pen and leave. But the truth is, this is only where the real discussion, the generalization, begins!

In research and exploration, stopping here would be like a child handing homework to the teacher with no purpose other than completing the assignment. Here, the assignment is the confusing code presented at the beginning of the article. But solving that code is not our ultimate goal. We need to generalize what we have learned and ask, from a deeper perspective, why we were puzzled in the first place and why we could not see at the outset what lay beneath the surface. We need to keep digging for the misunderstandings about the most fundamental concepts that made it so hard for us to see the truth at the beginning.

Looking back on our journey, the first place we stumbled was the fixed assumption linking "async" and "multithreading". Multithreaded code is asynchronous, but must asynchronous code be multithreaded? Subconsciously we want to answer yes, because if something is asynchronous yet single-threaded, the whole async business seems meaningless (recall the multi-counter, single-clerk example). JS, however, uses the combination cleverly: the single thread asynchronously hands out tasks, while the real work, such as Ajax data loading or setTimeout timing, is thrown to the browser's other threads. So in essence, although JS is single-threaded, the actual work gets done by exploiting the multithreading of the browser itself. It is as if, in the multi-counter, single-clerk hall, the clerk outsourced the electricity and water tasks to other companies: there is still only one clerk, but thanks to the outsourced services the work can still proceed in parallel.
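The "outsourcing" idea can be tried out with plain timers. This is a minimal sketch, assuming a runtime that provides setTimeout (a browser, or Node.js, where the runtime plays the part of the Web APIs); runOutsourced and the 200 ms delays are made up for this illustration. Three waits are handed out at once, and the total time should come out close to one wait rather than the sum of three, because the waiting happens outside the JS thread.

```javascript
// Hand out several waits at once and report when the last one is done.
// Hypothetical helper for illustration; `delays` are in milliseconds.
function runOutsourced(delays) {
  const start = Date.now();
  const finished = [];
  return new Promise((resolve) => {
    delays.forEach((delay, index) => {
      // Each call returns immediately; the waiting itself is "outsourced".
      setTimeout(() => {
        finished.push(index);
        if (finished.length === delays.length) {
          resolve({ finished, elapsed: Date.now() - start });
        }
      }, delay);
    });
  });
}

const outsourced = runOutsourced([200, 200, 200]);
outsourced.then(({ finished, elapsed }) => {
  console.log('tasks finished: ' + finished.length + ', total time: ' + elapsed + ' ms');
});
```

The callbacks themselves still run one at a time on the single thread; only the waiting overlaps. If each callback did heavy computation instead of a quick push, the clerk would be busy again and the tasks would line up one after another.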

On the other hand, JS's asynchronous, single-threaded nature forces us to be clearer about the concepts of "synchronous/asynchronous" and "blocking/non-blocking" from parallel computing.

"Synchronization" corresponds to the English word "synchronize", but in a Chinese context it is easily associated with "simultaneous". So subconsciously one may read "synchronous" as "simultaneous", and understand a synchronous task as "doing A while also doing B". This subconscious impression is completely wrong (doing A while doing B is actually the "asynchronous" + "parallel" situation).

Yet all kinds of encyclopedias and dictionaries really do use "at the same time" to explain "synchronous". Why is that? The confusion comes from how "at the same time" is understood. If you consider its meaning carefully, there are two readings:

    • At the same moment, for example both A and B being done at 9:00 am.

    • Under the same time reference system, the proverbial "same clock on the wall".

The former is easy to understand, so I will focus on the latter. For example, I am in mainland China having a voice chat with a classmate in the United States; it is 22:00 on my side and 9:00 on his. We are chatting at the same moment, but not under the same time reference system (the same clock on the wall). When synchronization is discussed in computing, what is meant is the latter, "the same reference system": to synchronize is to unify our reference systems, to place everyone under one and the same system.

Another example: in everyday life we readily say sync your computer, sync your phone's address book, sync your photo album. What does that mean? It means making the content on your clients (PC, phone) and on the server consistent, that is, putting everyone into one consistent reference system. Otherwise you may have photo A on the PC but not on the phone, and then a person talking about what is on the PC and a person talking about what is on the phone are not on the same page. The reason is that we did not put everyone into the same reference system.

Therefore the "same time" in synchronize means adjusting the clocks on the wall to agree, adjusting everyone to the same pace, that is, sharing one time reference system. It does not mean making things happen at the same moment. Naturally, then, what is asynchronous? Asynchronous means that everyone has a different time reference system. For example, I am in mainland China and you are in the United States, so our time reference systems differ; that is asynchronous, not in step.

In fact, each individual person and each independent computing resource represents its own reference system. As soon as you hand a task to another person or another computing resource, there are two reference systems: that of the original main branch and that of the new computing resource. Parallel computing has a synchronization mechanism called a barrier, which makes all computing branches finish their computation at that node. Why is it a synchronization mechanism? By our understanding of unifying reference systems, it ensures that all other computing branches complete their work and disappear, leaving only the reference system of the main branch. Then we can all talk about the same thing in the same terms, without misunderstanding.
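In JS, the closest everyday analogue of such a join point is probably Promise.all, which continues only after every promise passed to it has been fulfilled. The sketch below is only an analogy to the barrier described above, not a real multithreaded barrier; branch, the branch names, and the delays are hypothetical.

```javascript
// Hypothetical branch of work: resolves with its name after `ms` milliseconds.
function branch(name, ms) {
  return new Promise((resolve) => setTimeout(() => resolve(name), ms));
}

const log = [];

const joined = Promise.all([
  branch('A', 30).then((name) => { log.push(name + ' done'); return name; }),
  branch('B', 10).then((name) => { log.push(name + ' done'); return name; }),
]).then((names) => {
  // Join point: this runs only after every branch has finished.
  log.push('after barrier: ' + names.join(', '));
  return log;
});

joined.then((entries) => console.log(entries.join('\n')));
```

Whichever branch finishes first, the "after barrier" entry always comes last: past that point there is only one line of execution again, and everyone is back in the same reference system.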

On the other hand, if we want to understand the design of JS more deeply, I think we need to go back to the early days of computing, for instance the era of single-core time-sharing systems. In that era the hardware constraints on an operating system were no less severe than the constraints on the JS engine inside a browser. Under similar constraints, how did those operating systems skillfully use extremely limited computing resources to give the illusion of a smooth and powerful system? I think the design of JS must be closely related to those early operating system designs. So at this level we come back to the fundamentals of operating systems once again. Whether you can thoroughly understand a modern technology depends very much on whether you thoroughly understand the history of its design, whether you understand how, in those resource-starved years, the masters cut paths through mountains and built bridges across rivers. No matter how abundant modern hardware resources become, a design is always constrained by which goals and which business needs come first. How to dance and create within constraints is a common problem that runs through the whole of history.
