See http://learnyousomeerlang.com/buckets-of-sockets
To understand the chapter better, I translated it myself, as follows. If I have misunderstood something or you have a better suggestion, please point it out :)
Buckets of sockets
So far, we have had some fun with Erlang itself but have barely interacted with the outside world, apart from reading and writing a file here and there.
As fun as a relationship with yourself may be, it is time to leave the comfort zone and start talking to the rest of the world.
This chapter covers three components of using sockets: IO lists, UDP sockets and TCP sockets. IO lists are not a very complex topic. They are just a clever way to efficiently build strings to be sent over sockets and other Erlang drivers.
IO lists
I mentioned earlier that we can process text either as a string (a list of integers) or as a binary (a binary data structure holding the data). Sending "Hello World" or <<"Hello World">> as a message gives similar results.
The difference lies in how the data is assembled. A string is a linked list of integers: each character is stored as one element plus a link to the rest of the list. If you want to add an element in the middle or at the end of the list, you have to traverse the list up to that position first. This is not the case when you prepend, however:
A = [a]
B = [b|A] = [b,a]
C = [c|B] = [c,b,a]
In the example above, whatever is held in A, B or C never needs to be rewritten (copied). C can be seen as [c,b,a], as [c|B], or as [c|[b|[a]]]. In the last form you can see that the [a] at the end has the same shape as A when it was declared, and in [c|B] the B is the one from the second line. Now let's see how this differs from appending:
A = [a]
B = A ++ [b] = [a] ++ [b] = [a|[b]]
C = B ++ [c] = [a|[b]] ++ [c] = [a|[b|[c]]]
Do you see all the rewriting that happens in these assignments?
When we create B, we have to rewrite A; when we create C, we have to rewrite B (including [a]). If we created D in a similar way, we would have to rewrite C, and so on.
Operating on long strings this way is therefore very inefficient, and it creates a lot of garbage for the Erlang VM to clean up.
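The difference between the two approaches can be sketched as a small module. The module name prepend_demo and its function names are made up for illustration; only lists:foldl/3, lists:seq/2 and lists:reverse/1 are standard library calls.

```erlang
-module(prepend_demo).
-export([build_append/1, build_prepend/1]).

%% Builds [1,2,...,N] by appending. Every ++ copies the whole
%% accumulator, so the total work grows quadratically with N.
build_append(N) ->
    lists:foldl(fun(X, Acc) -> Acc ++ [X] end, [], lists:seq(1, N)).

%% Builds the same list by prepending (nothing is copied) and
%% reversing once at the end.
build_prepend(N) ->
    lists:reverse(
      lists:foldl(fun(X, Acc) -> [X | Acc] end, [], lists:seq(1, N))).
```

Both functions return the same list; the second one is the usual idiom when a list has to be built element by element.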
Binaries, however, are not so bad:
A = <<"a">>
B = <<A/binary, "b">> = <<"ab">>
C = <<B/binary, "c">> = <<"abc">>
In this case, a binary knows the length of its own data, so data can be joined in constant time, which is much better than with lists. Binaries are also more compact and save space. For these two reasons, we will often stick to binaries when processing text from now on.
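As a minimal sketch of the same idea in a function, the hypothetical module binary_demo below joins a list of binaries by appending each one to an accumulator, using the same <<Acc/binary, ...>> syntax as above.

```erlang
-module(binary_demo).
-export([join/1]).

%% Appends every binary in Parts to an accumulator, left to right.
join(Parts) ->
    lists:foldl(fun(Part, Acc) -> <<Acc/binary, Part/binary>> end,
                <<>>,
                Parts).
```

For example, binary_demo:join([<<"a">>, <<"b">>, <<"c">>]) returns <<"abc">>.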
Of course, there are some downsides. Binaries were meant to be handled in certain ways, and there is still a cost to operating on them (modifying them, splitting them, and so on). Moreover, we will sometimes work with code that uses strings, binaries and individual characters interchangeably, and constantly converting between these types would be a hassle.
In these cases, IO lists are our savior. IO lists are a rather strange kind of data structure: they are lists of bytes (integers from 0 to 255), binaries, or other IO lists. This means that a function that accepts IO lists can accept something like
[$H, $e, [$l, <<"lo">>, " "], [[["W","o"], <<"rl">>]] | [<<"d">>]]
and the Erlang VM will flatten the list as needed to obtain the character sequence Hello World.
What functions can be used to process IO lists? Most of them:
1. any function from the io and file modules;
2. TCP and UDP sockets;
3. some library functions, such as those of the unicode module and all functions of the re (regular expressions) module.
You can try the IO list above in the shell; it should work without a problem:
IoList = [$H, $e, [$l, <<"lo">>, " "], [[["W","o"], <<"rl">>]] | [<<"d">>]].
io:format("~s~n", [IoList]).
In short: IO lists are a clever way of building strings for output that avoids the garbage created when immutable data structures are assembled dynamically.
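To see what the VM ends up with, here is a small sketch using the built-in functions iolist_to_binary/1 and iolist_size/1. The module name iolist_demo and its function names are made up for this example.

```erlang
-module(iolist_demo).
-export([greeting/0, flat/0, byte_count/0]).

%% A deeply nested IO list mixing characters, strings and binaries.
greeting() ->
    [$H, $e, [$l, <<"lo">>, " "], [[["W", "o"], <<"rl">>]] | [<<"d">>]].

%% Flattens the IO list into a single binary.
flat() ->
    iolist_to_binary(greeting()).

%% Counts the bytes without building the flat binary first.
byte_count() ->
    iolist_size(greeting()).
```

Here iolist_demo:flat() returns <<"Hello World">> and iolist_demo:byte_count() returns 11. In practice you rarely need to flatten by hand, since sockets and the io functions accept the nested form directly.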
TCP and UDP: Bro-tocols
The first kind of socket we will look at is based on the UDP protocol. UDP is built on top of the IP layer and adds a few abstractions to it, such as port numbers. UDP is said to be a stateless protocol. The data received on a UDP port is broken into small parts, untagged and without a session, and there is no guarantee that the fragments arrive in the same order in which they were sent.
In fact, when someone sends a packet there is no guarantee that the receiver gets it at all. For these reasons, people tend to use UDP when packets are small, when losing a few of them has little consequence, when there are not many complex exchanges taking place, or when low latency is absolutely necessary (streaming video is a typical example).
In contrast, there are stateful protocols such as TCP, which guarantee delivery. TCP takes care of handling lost packets, re-ordering them, and maintaining isolated sessions between multiple senders and receivers.
To make the exchange of information reliable, TCP gives up some efficiency: it risks being slower than UDP and is heavier to set up. UDP is fast but unreliable, so choose according to your specific scenario.
In any case, using UDP in Erlang is very easy. We only need to set up a socket on a given port, and that socket can then send and receive data.
For a bad analogy, this is like having a bunch of mailboxes on your house (each mailbox being a port) and receiving tiny slips of paper in each of them with small messages. They can have any content, from "I like how you look in these pants" down to "The slip is coming from inside the house!". When some messages are too large for a slip of paper, then many of them are dropped in the mailbox. It's your job to reassemble them in a way that makes sense, then drive up to some house, and drop slips after that as a reply. If the messages are purely informative ("hey there, your door is unlocked") or very tiny ("What are you wearing? -Ron"), it should be fine and you could use one mailbox for all of the queries. If they were to be complex, though, we might want to use one port per session, right? Ugh, no! Use TCP!
TCP is a stateful, connection-based protocol. Before messages can be sent, there has to be a handshake. This means that someone takes a mailbox (as in the UDP analogy) and sends a message saying "hey, this is IP 94.25.12.37 calling, wanna chat?", to which you reply something like "Sure. Tag your messages with number N, and increase the number with each message". From then on, when you and IP 94.25.12.37 talk to each other, it is possible to put the slips of paper in order, ask for the missing ones and reply to them in a meaningful way.
That way, we can use a single mailbox (port) and keep all our communications in order. That is the neat thing about TCP: it adds some overhead, but it makes sure everything is delivered, in order, to the right place.
If you are not a fan of these analogies, do not despair: we will now cut to the chase and see how to use TCP and UDP sockets in Erlang, which is much simpler.
UDP sockets
There are only a few basic operations with UDP:
1. setting up a socket;
2. sending messages;
3. receiving messages;
4. closing a connection.
The first operation, whatever you want to do afterwards, is to open a socket with gen_udp:open/1-2. The simplest form is:
{ok, Socket} = gen_udp:open(PortNumber).
PortNumber can be any integer from 1 to 65535:
0 to 1023 are system ports. Most of the time, your operating system will not let you listen on a system port unless you have the corresponding privileges.
1024 to 49151 are registered ports. They usually require no permissions and can be used freely, although some of them are registered to well known services.
49152 to 65535 are dynamic or private ports, often used for short-lived purposes (ephemeral ports).
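The three ranges can be written down as guards. This is only an illustrative sketch: the module port_class and the atoms it returns are invented here and are not part of any Erlang library.

```erlang
-module(port_class).
-export([classify/1]).

%% Maps a port number to the range it falls in, following the
%% ranges listed above. Anything else is reported as invalid.
classify(Port) when is_integer(Port), Port >= 0, Port =< 1023 ->
    system;
classify(Port) when is_integer(Port), Port >= 1024, Port =< 49151 ->
    registered;
classify(Port) when is_integer(Port), Port >= 49152, Port =< 65535 ->
    dynamic;
classify(_Other) ->
    invalid.
```

With this, port_class:classify(8789) returns registered, which is the range used for the experiments in this chapter.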
For our experiments we will use port numbers that are reasonably safe and unlikely to be taken, such as 8789.
But before that, what about gen_udp:open/2? Its second argument is a list of options. It specifies, for example, in which type we want to receive data (list or binary) and how we want to receive it: as messages ({active, true}) or as the result of a function call ({active, false}).
There are also options to choose between IPv4 and IPv6, to decide whether the UDP socket can be used for broadcasting ({broadcast, true | false}), to set buffer sizes, and so on.
Many more options are available, but we will stick to the simple ones for now; the rest is up to you to explore once you are comfortable with TCP and UDP.
So let's open a socket in the Erlang shell:
1> {ok, Socket} = gen_udp:open(8789, [binary, {active,true}]).
{ok,#Port<0.676>}
2> gen_udp:open(8789, [binary, {active,true}]).
{error,eaddrinuse}
The first command opens port 8789, receiving data as binaries and in active mode ({active, true}). The returned value #Port<0.676> represents the socket we have just opened. You can use it much like a pid, and you can even link it to processes so that a crash of the socket is propagated to the process that handles it.
The second command tries to open the same port 8789 again. Because the socket opened by the first command has not been released, it returns the error {error, eaddrinuse}: the address is already in use.
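In a program you would usually match on the error rather than let the shell print it. The sketch below is one way to do that; the module udp_open_demo and the atom port_taken are invented for the example. It also assumes two things not covered in the text above: that passing port 0 lets the operating system pick a free port, and that inet:port/1 reports which port was picked.

```erlang
-module(udp_open_demo).
-export([try_open/1, demo/0]).

%% Opens a UDP socket and turns the "address in use" error into
%% a friendlier (made-up) reason.
try_open(Port) ->
    case gen_udp:open(Port, [binary, {active, true}]) of
        {ok, Socket}        -> {ok, Socket};
        {error, eaddrinuse} -> {error, port_taken};
        {error, Reason}     -> {error, Reason}
    end.

%% Opens a socket on an OS-assigned port, then tries to open the
%% same port a second time while the first socket is still open.
demo() ->
    {ok, First} = gen_udp:open(0, [binary, {active, true}]),
    {ok, Port} = inet:port(First),
    Second = try_open(Port),
    ok = gen_udp:close(First),
    Second.
```

Using port 0 here only avoids clashing with whatever is already running on the machine; with a fixed port such as 8789 the logic is the same.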
Anyway, let's start a second Erlang shell and open a second UDP socket there, on a different port:
1> {ok, Socket} = gen_udp:open(8790).
{ok,#Port<0.587>}
2> gen_udp:send(Socket, {127,0,0,1}, 8789, "hey there!").
ok
Here you can see a new function, gen_udp:send/4. As the name suggests, it is used to send messages. Its arguments are:
gen_udp:send(OwnSocket, RemoteAddress, RemotePort, Message).
RemoteAddress can be a string or an atom containing a domain name (such as "example.org"), a 4-tuple describing an IPv4 address, or an 8-tuple describing an IPv6 address.
RemotePort is the port of the receiver.
Message is the content itself, which can be a string, a binary, or an IO list.
After sending the message, go back to your first shell and flush its mailbox:
3> flush().
Shell got {udp,#Port<0.676>,{127,0,0,1},8790,<<"hey there!">>}
ok
Wonderful! The two shells can talk to each other. The process that opened the socket receives messages of the following form:
{udp, Socket, FromIp, FromPort, Message}.
The message tells us where it came from and what it contains. This is how information is sent and received in active mode.
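The same exchange can be sketched inside a single process instead of two shells. The module name udp_active_demo is made up, and so is the use of port 0 with inet:port/1 to get a free port from the operating system. The example sends a string, a binary and an IO list, and waits for each one as a {udp, ...} message.

```erlang
-module(udp_active_demo).
-export([demo/0]).

demo() ->
    %% The receiver gets its data as messages, in binary form.
    {ok, Receiver} = gen_udp:open(0, [binary, {active, true}]),
    {ok, Sender} = gen_udp:open(0),
    {ok, Port} = inet:port(Receiver),
    Payloads = ["hey there!",
                <<"hey there!">>,
                ["hey", [$\s, <<"there">>], "!"]],
    Received = [send_and_wait(Sender, Receiver, Port, P) || P <- Payloads],
    ok = gen_udp:close(Sender),
    ok = gen_udp:close(Receiver),
    Received.

%% Sends one payload and waits for the matching active-mode message.
send_and_wait(Sender, Receiver, Port, Payload) ->
    ok = gen_udp:send(Sender, {127,0,0,1}, Port, Payload),
    receive
        {udp, Receiver, _FromIp, _FromPort, Data} -> Data
    after 2000 ->
        timeout
    end.
```

All three payloads should arrive as the same binary, <<"hey there!">>, because the receiving socket was opened with the binary option.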
So what about passive mode? To try it, we need to close the socket in the first shell and open a new one:
4> gen_udp:close(Socket).
ok
5> f(Socket).
ok
6> {ok, Socket} = gen_udp:open(8789, [binary, {active,false}]).
{ok,#Port<0.683>}
Here we closed the socket, unbound the Socket variable, and then bound it again to a socket opened in passive mode. Now try the following command:
7> gen_udp:recv(Socket, 0).
At this point your shell blocks. recv/2 is the function used to poll a passive socket for messages. The 0 is the length of the message we want to receive. Interestingly, this length is completely ignored by gen_udp; gen_tcp has a similar function where the parameter does have an effect.
In any case, if we never send a message to this socket, recv/2 will never return.
So let's go to the second shell and send a new message to the first one:
3> gen_udp:send(Socket, {127,0,0,1}, 8789, "hey there!").
ok
The first shell, which was blocked in recv/2, should now print:
{ok,{{127,0,0,1},8790,<<"hey there!">>}}
If you do not want to block forever, you can pass a timeout in milliseconds with gen_udp:recv/3:
8> gen_udp:recv(Socket, 0, 2000).
{error,timeout}
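Both outcomes of recv/3 can be shown in one process. This is a sketch under the same assumptions as before: the module name udp_passive_demo is invented, and port 0 together with inet:port/1 is used only to obtain a free port.

```erlang
-module(udp_passive_demo).
-export([demo/0]).

demo() ->
    {ok, Receiver} = gen_udp:open(0, [binary, {active, false}]),
    {ok, Sender} = gen_udp:open(0),
    {ok, Port} = inet:port(Receiver),
    %% Nothing has been sent yet, so this gives up after 100 ms.
    Empty = gen_udp:recv(Receiver, 0, 100),
    ok = gen_udp:send(Sender, {127,0,0,1}, Port, "hey there!"),
    %% Now a packet is waiting, so recv/3 returns it.
    {ok, {_FromIp, _FromPort, Data}} = gen_udp:recv(Receiver, 0, 2000),
    ok = gen_udp:close(Sender),
    ok = gen_udp:close(Receiver),
    {Empty, Data}.
```

The call udp_passive_demo:demo() should return {{error,timeout}, <<"hey there!">>}: the first recv/3 times out, the second one returns the packet.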
And that is pretty much all there is to UDP. No, really!
TCP sockets
Most of the TCP socket interface is similar to that of UDP sockets, but they work quite differently. The biggest difference is that clients and servers are two entirely different things. A client behaves as follows:
connect, then send and receive, then close
A server behaves as follows:
listen, then accept, then send and receive, then close
Looks weird, huh? The client acts a bit like what we had with gen_udp: you connect to a port, send and receive, and then close the socket.
On the server side, however, there is a new mode: listening. This is because TCP works by setting up sessions.
First, we open a new shell and start a listen socket with gen_tcp:listen(Port, Options):
1> {ok, ListenSocket} = gen_tcp:listen(8091, [{active,true}, binary]).
{ok,#Port<0.661>}
The listen socket is only in charge of waiting for connection requests. You can see that we used options similar to the gen_udp ones above; that is because most options are common to all IP sockets. TCP does have a few specific options, including a connection backlog ({backlog, N}), keepalive sockets ({keepalive, true | false}), packet packaging ({packet, N}, where N is the length of each packet's header, which is stripped and parsed for you), and so on.
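As a rough illustration of {packet, N}, the sketch below sets {packet, 2} on both ends of a loopback connection, so that every send is prefixed with a 2-byte length header and every recv returns exactly one message with the header stripped. The module name tcp_packet_demo is made up, port 0 with inet:port/1 is an assumption used to find a free port, and the connect and accept calls it relies on are the ones described in the rest of this section.

```erlang
-module(tcp_packet_demo).
-export([demo/0]).

demo() ->
    Opts = [binary, {active, false}, {packet, 2}],
    {ok, Listen} = gen_tcp:listen(0, Opts),
    {ok, Port} = inet:port(Listen),
    %% The connection waits in the backlog until accept picks it up,
    %% so one process can play both client and server here.
    {ok, Client} = gen_tcp:connect({127,0,0,1}, Port, Opts),
    {ok, Accepted} = gen_tcp:accept(Listen, 2000),
    ok = gen_tcp:send(Client, <<"first">>),
    ok = gen_tcp:send(Client, <<"second">>),
    %% Each recv returns one whole message, even though TCP itself
    %% is a byte stream without message boundaries.
    {ok, A} = gen_tcp:recv(Accepted, 0, 2000),
    {ok, B} = gen_tcp:recv(Accepted, 0, 2000),
    [gen_tcp:close(S) || S <- [Client, Accepted, Listen]],
    {A, B}.
```

The result should be {<<"first">>, <<"second">>}. Without the packet option, the two sends could just as well be delivered together as one chunk.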
Once the listen socket is open, any process (and more than one) can take it and enter an "accepting" state, blocked until some client asks to talk to it:
2> {ok, AcceptSocket} = gen_tcp:accept(ListenSocket, 2000).
** exception error: no match of right hand side value {error,timeout}
3> {ok, AcceptSocket} = gen_tcp:accept(ListenSocket).
** exception error: no match of right hand side value {error,closed}
Oops. We timed out and crashed, and the listen socket was closed along with the shell process that owned it, which is why the second accept returned {error, closed}. Let's start over, this time without the 2 second (2000 ms) timeout:
4> f().
ok
5> {ok, ListenSocket} = gen_tcp:listen(8091, [{active, true}, binary]).
{ok,#Port<0.728>}
6> {ok, AcceptSocket} = gen_tcp:accept(ListenSocket).
This process is blocked and waiting for communication. Very good! Then we open another shell terminal:
1> {ok, Socket} = gen_tcp:connect({127,0,0,1}, 8091, [binary, {active,true}]).
{ok,#Port<0.596>}
This takes the same options as gen_udp above, and you can add a timeout as the last argument if you do not want to wait forever. If you look back at the first shell, it should have returned {ok, SocketNumber}. From that point on, the accept socket and the client socket can communicate one-to-one, much like with gen_udp. Use the second shell to send a message to the first one:
3> gen_tcp:send(Socket, "Hey there first shell!").
ok
You can see on the first shell terminal:
7> flush().
Shell got {tcp,#Port<0.729>,<<"Hey there first shell!">>}
ok
The sockets on both sides can send messages in the same way, and either can be closed with gen_tcp:close(Socket). Note, however, that closing an accept socket closes that socket alone, and that closing a listen socket does not close any of the accept sockets already established through it; it only interrupts accept calls that are currently running, which then return {error, closed}.
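The whole life cycle, including what happens when the listen socket is closed, can be sketched in one process. The module name tcp_lifecycle_demo is invented, and port 0 with inet:port/1 is an assumption used to find a free port; the messages are small enough that each should arrive as a single {tcp, ...} message on the loopback interface.

```erlang
-module(tcp_lifecycle_demo).
-export([demo/0]).

demo() ->
    Opts = [binary, {active, true}],
    {ok, Listen} = gen_tcp:listen(0, Opts),
    {ok, Port} = inet:port(Listen),
    {ok, Client} = gen_tcp:connect({127,0,0,1}, Port, Opts),
    {ok, Accepted} = gen_tcp:accept(Listen, 2000),
    %% Client to server.
    ok = gen_tcp:send(Client, "Hey there first shell!"),
    Got = receive {tcp, Accepted, Data} -> Data after 2000 -> timeout end,
    %% Closing the listen socket leaves the established pair alone...
    ok = gen_tcp:close(Listen),
    ok = gen_tcp:send(Accepted, "Hey back!"),
    Reply = receive {tcp, Client, Data2} -> Data2 after 2000 -> timeout end,
    %% ...but no new connection can be accepted on it any more.
    AfterClose = gen_tcp:accept(Listen, 100),
    ok = gen_tcp:close(Client),
    ok = gen_tcp:close(Accepted),
    {Got, Reply, AfterClose}.
```

The expected result is {<<"Hey there first shell!">>, <<"Hey back!">>, {error,closed}}: the reply still gets through after the listen socket is gone, and only the accept call fails.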
That is most of what there is to TCP sockets in Erlang. Or is it?
Of course there is more. If you have experimented with these sockets yourself, you may have noticed that sockets have some kind of ownership.
By this, I mean that UDP sockets, TCP client sockets and TCP accept sockets can all have messages sent through them from any process in existence, but messages received can only be read by the process that started the socket.
That is not very practical, is it? It means we always have to keep the owner process alive to relay messages, even when it has nothing to do with what we need. Could we not be more flexible, and do something like this?
1.  Process A starts a socket
2.  Process A sends a request
3.  Process A spawns process B
    with a socket
4a. Gives ownership of the        4b. Process B handles the request
    socket to Process B
5a. Process A sends a request     5b. Process B keeps handling
                                      the request
6a. Process A spawns process C    6b. ...
    with a socket
    ...
Here, process A is in charge of running a bunch of queries, but each newly started process takes care of waiting for the reply and processing it. It is therefore sensible for A to delegate each task to a new process. The tricky part is how to transfer ownership of the socket to the new process.
Both gen_tcp and gen_udp provide a function called controlling_process(Socket, Pid) for this. It has to be called by the current owner of the socket. Calling it tells Erlang: "I don't want to deal with this socket anymore. Let Pid take care of it. I quit!"
From then on, the process identified by Pid is the one that receives the messages coming from the socket. That's it.
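A minimal sketch of such a hand-off with a UDP socket follows. The module name handoff_demo and the handled_by message are invented for the example, and port 0 with inet:port/1 is again an assumption used to find a free port. Process A opens the socket, gives it to process B, and then sends a packet through it; the packet is addressed to the socket's own port, so the incoming message is delivered to B, the new owner.

```erlang
-module(handoff_demo).
-export([demo/0]).

demo() ->
    A = self(),
    {ok, Socket} = gen_udp:open(0, [binary, {active, true}]),
    {ok, Port} = inet:port(Socket),
    %% Process B waits for one socket message and reports it to A.
    B = spawn(fun() ->
                  receive
                      {udp, Socket, _FromIp, _FromPort, Payload} ->
                          A ! {handled_by, self(), Payload}
                  end
              end),
    %% A is the current owner, so A is the one who gives it away.
    ok = gen_udp:controlling_process(Socket, B),
    %% A can still send through the socket; B receives the reply.
    ok = gen_udp:send(Socket, {127,0,0,1}, Port, "for B"),
    receive
        {handled_by, B, Data} -> {ok, Data}
    after 2000 ->
        timeout
    end.
```

The call handoff_demo:demo() should return {ok, <<"for B">>}. Since B now owns the socket, the socket is closed when B exits.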
Next, the original chapter goes on to talk about inet and more.
Reading this far, I kept thinking how lively and funny the original English is. My translation came out stiff and rough, and I did not have the heart to spoil the rest. I therefore strongly recommend reading the original article: http://learnyousomeerlang.com/buckets-of-sockets
Oh NO~~~~~~~~~~~~~