The first two lectures of Week 7 are titled "Actors are Distributed". They cover a lot of Akka Cluster material, and it is not easy to follow.
Roland Kuhn does not go into much detail about how Akka Cluster works; instead he focuses on how to use Akka Cluster to distribute actors across different nodes, perhaps because there is too much to say about Akka Cluster and a Coursera course simply doesn't have the time. From the audience's point of view, though, these lectures only give a first impression of what Akka Cluster can do; to use it yourself, and especially to understand how it works, you still need to read a lot more. After reading the chapter on clustering in the Akka documentation, I realized how much I still don't know... All right, enough digressing!
Since there is so much I don't understand, I'll go through the material in course order, note down the parts that are unclear, and research them separately later. Below are the contents of the course, including the slides, some important points made in the lectures, and some of my own understanding.
At the beginning of the video, Roland Kuhn introduces what this week will cover:
In this last week of the reactive programming course, we'll focus first on something which is implicitly there already: namely, that actors as independent agents of computation are by default distributed. Normally you run them just on different CPUs in the same system, but there is nothing stopping you from running them on different network hosts, and we'll do this in practice. This means, if you picture for example people who live on different continents, then it takes some effort for them to agree on a common truth or a common decision. The same thing is true for actors, and we call this eventual consistency. After that, we'll talk about scalability and responsiveness and how to achieve these in an actor-based system, rounding off the four tenets of reactive programming: event-driven, scalable, resilient, and responsive.
At this point I felt I should understand the four tenets better, so I found an article about them and translated it.
The Impact of Network Communication
Compared to In-process communication:
- Data sharing only by value
- Lower bandwidth
- Higher latency
- Partial failure
- Data corruption
Multiple processes on the same machine are quantitatively less impacted, but qualitatively the issues are the same.
Comparing network communication with in-process communication helps us understand how far the actor model can take us toward distribution.
- When communication goes over the network, data is passed by value. Objects must be serialized, transmitted over the network, and deserialized; after these steps, the transferred object is no longer the very same object as the original (after all, it now lives in a different JVM). As discussed before, a stateful object's behavior depends on its history. Once a stateful object has been sent over the network, the two copies on either side may diverge over time. Therefore, only sharing immutable objects is meaningful (in other words, it is pointless to send a mutable, stateful object over the network and then expect both sides to share that object).
- Network bandwidth is much smaller than the bandwidth available for moving data inside a single machine.
- The latency of all communication is much higher. You can invoke a method within one nanosecond, but you cannot deliver a network packet in one nanosecond. Moreover, in network communication objects need to be serialized and deserialized, which adds even more latency compared with passing references within a process.
- Beyond these quantitative differences, something qualitatively new appears: when communication goes over the network, there is the risk of partial failure. Perhaps only some of your messages arrive, and you do not know which ones arrived until you receive a response.
- Another kind of failure is data corruption. From personal industry experience, roughly one corruption event occurs per terabyte of data sent over TCP.
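The pass-by-value point above can be demonstrated with a short sketch. It uses plain JDK serialization to simulate the serialize/transmit/deserialize round trip; the `Counter` class and the `roundTrip` helper are made up for illustration and are not part of Akka.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

// A hypothetical mutable state holder, just for illustration.
class Counter(var count: Int) extends Serializable

object PassByValueDemo {
  // Simulate "sending an object over the network": serialize, then deserialize.
  def roundTrip[A <: java.io.Serializable](obj: A): A = {
    val buf = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buf)
    out.writeObject(obj)
    out.close()
    new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray))
      .readObject().asInstanceOf[A]
  }

  def main(args: Array[String]): Unit = {
    val original = new Counter(1)
    val received = roundTrip(original)   // the "remote" side's copy
    original.count = 42                  // mutate the "sender's" copy
    // The received copy is a distinct object and does not see the change.
    println(s"original=${original.count}, received=${received.count}")
  }
}
```

Both sides start out with equal values, but after the round trip they are independent objects; mutating one has no effect on the other, which is exactly why sharing mutable state across the network is meaningless.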
Multiple processes on the same machine are quantitatively less impacted, but qualitatively the issues are the same.
Distributed computing breaks assumptions made by the synchronous programming model.
Running multiple processes on the same machine and letting them communicate, for example over the localhost interface, alleviates some of the problems above. But qualitatively the problems still exist.
The overarching description of the problem is that distributed computing breaks the core assumptions made by a synchronous programming model.
Actors are distributed
Actor communication is asynchronous, one-way and not guaranteed.
Actor encapsulation makes them look the same, regardless of where they live.
Actors are "location transparent", hidden behind an ActorRef.
So far we've talked a lot about actors, and you know that all actor communication is asynchronous, one-way, and without guaranteed delivery. So what actors actually model is precisely what the network gives us. One could argue that the actor model takes the opposite approach to "extending the local model out onto the network": it examines the network model and then applies it on the local machine as well.
Another feature of actors is that they run so independently that from the outside they all look the same: they are all ActorRefs. No matter where an actor actually lives, sending it a message works in exactly the same way; we call this location transparency.
The feature set of the current Akka actor model is deliberately restricted so that everything could, in principle, be remote; features that cannot be modeled in that setting were removed. As a result, writing a distributed program with actors takes about the same effort as writing a local one. The code itself hardly differs, as we'll see next.
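As a toy illustration of this idea (not Akka's actual implementation; all names here, such as `Ref`, `LocalRef`, and `RemoteRef`, are invented for the sketch), location transparency boils down to the sender only ever seeing an opaque reference with a tell operation, regardless of whether delivery happens in-process or over a serializing transport:

```scala
import scala.collection.mutable.ArrayBuffer

// An opaque reference with a fire-and-forget "tell" operation,
// loosely in the spirit of ActorRef.
trait Ref {
  def !(msg: String): Unit
}

// A local reference delivers the message with a direct in-process call.
class LocalRef(deliver: String => Unit) extends Ref {
  def !(msg: String): Unit = deliver(msg)
}

// A remote reference serializes the message and hands it to a transport.
class RemoteRef(transport: Array[Byte] => Unit) extends Ref {
  def !(msg: String): Unit = transport(msg.getBytes("UTF-8"))
}

object LocationTransparencyDemo {
  // The sender's code is identical no matter where the target lives:
  def greet(ref: Ref): Unit = ref ! "hello"

  def main(args: Array[String]): Unit = {
    val localLog = ArrayBuffer.empty[String]
    val wire     = ArrayBuffer.empty[Array[Byte]]

    greet(new LocalRef(localLog += _))   // delivered in-process
    greet(new RemoteRef(wire += _))      // went "over the wire" as bytes

    println(localLog)
    println(wire.map(new String(_, "UTF-8")))
  }
}
```

The point of the sketch is that `greet` cannot tell the two references apart, which is what lets the same actor code run locally or distributed.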
"Principles of Reactive Programming" <Actors are Distributed> (1)