In many cases, a game server cannot avoid broadcasting messages to players. The problem I ran into was that the broadcast was handled by a single process that sent to the players one by one, and this process could easily get stuck (especially when the network was poor or there were too many players).
When I inspected the process, it was usually stuck in the function prim_inet:send/3. This is because the broadcast called gen_tcp:send/2 directly, which is in fact a synchronous call. You can see why by peeling back the code layer by layer.
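One way to make this kind of observation is to ask the runtime what a process is executing right now. The sketch below is only an illustration, not necessarily how the inspection above was done. The module name stuck_probe and the function where/1 are hypothetical. The only library call is erlang:process_info/2.

```erlang
-module(stuck_probe).
-export([where/1]).

%% Hypothetical helper: report what the local process Pid is doing right now.
%% Returns a list of {Item, Value} tuples in the order requested,
%% or undefined if the process is no longer alive.
where(Pid) when is_pid(Pid) ->
    erlang:process_info(Pid, [current_function, status, message_queue_len]).
```

For a broadcaster blocked in a send, one would expect current_function to point at something like {prim_inet, send, 3}, with the message queue growing while it waits.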
First, this is the source code of gen_tcp:send/2:
    send(S, Packet) when is_port(S) ->
        case inet_db:lookup_socket(S) of
            {ok, Mod} ->
                Mod:send(S, Packet);
            Error ->
                Error
        end.
For a TCP socket, this calls inet_tcp:send/2, so let's look at its source code:
    send(Socket, Packet, Opts) -> prim_inet:send(Socket, Packet, Opts).
    send(Socket, Packet) -> prim_inet:send(Socket, Packet, []).
So what actually gets called is prim_inet:send/3, the function the process gets stuck in. Let's look at its source code:
    send(S, Data, OptList) when is_port(S), is_list(OptList) ->
        ?DBG_FORMAT("prim_inet:send(~p, ~p)~n", [S,Data]),
        try erlang:port_command(S, Data, OptList) of
            false -> % Port busy and nosuspend option passed
                ?DBG_FORMAT("prim_inet:send() -> {error,busy}~n", []),
                {error,busy};
            true ->
                receive
                    {inet_reply,S,Status} ->
                        ?DBG_FORMAT("prim_inet:send() -> ~p~n", [Status]),
                        Status
                end
        catch
            error:_Error ->
                ?DBG_FORMAT("prim_inet:send() -> {error,einval}~n", []),
                {error,einval}
        end.
As you can see, after calling erlang:port_command/3 it enters a receive and waits for the inet_reply message. We can also see that if the nosuspend option is passed to erlang:port_command/3, the receive is skipped when the port is busy and {error,busy} is returned instead. It seems that the problem can be solved this way, but the process can still get stuck, although the chances are much lower. The remaining place it can get stuck is erlang:port_command/3 itself. Here is its source code:
    port_command(Port, Data, Flags) ->
        case case erts_internal:port_command(Port, Data, Flags) of
                 Ref when erlang:is_reference(Ref) -> receive {Ref, Res} -> Res end;
                 Res -> Res
             end of
            Bool when Bool == true; Bool == false -> Bool;
            Error -> erlang:error(Error, [Port, Data, Flags])
        end.
It is not hard to see that there is a synchronous call here as well: when erts_internal:port_command/3 returns a reference, a receive is entered to wait for {Ref, Res}. So can erts_internal:port_command/3 itself still get stuck? That function is implemented in C:
    BIF_RETTYPE erts_internal_port_command_3(BIF_ALIST_3)
    {
        BIF_RETTYPE res;
        Port *prt;
        int flags = 0;
        Eterm ref;

        if (is_not_nil(BIF_ARG_3)) {
            Eterm l = BIF_ARG_3;
            while (is_list(l)) {
                Eterm* cons = list_val(l);
                Eterm car = CAR(cons);
                if (car == am_force)
                    flags |= ERTS_PORT_SIG_FLG_FORCE;
                else if (car == am_nosuspend)
                    flags |= ERTS_PORT_SIG_FLG_NOSUSPEND;
                else
                    BIF_RET(am_badarg);
                l = CDR(cons);
            }
            if (!is_nil(l))
                BIF_RET(am_badarg);
        }

        prt = sig_lookup_port(BIF_P, BIF_ARG_1);
        if (!prt)
            BIF_RET(am_badarg);

        if (flags & ERTS_PORT_SIG_FLG_FORCE) {
            if (!(prt->drv_ptr->flags & ERL_DRV_FLAG_SOFT_BUSY))
                BIF_RET(am_notsup);
        }

    #ifdef DEBUG
        ref = NIL;
    #endif

        switch (erts_port_output(BIF_P, flags, prt, prt->common.id, BIF_ARG_2, &ref)) {
        case ERTS_PORT_OP_CALLER_EXIT:
        case ERTS_PORT_OP_BADARG:
        case ERTS_PORT_OP_DROPPED:
            ERTS_BIF_PREP_RET(res, am_badarg);
            break;
        case ERTS_PORT_OP_BUSY:
            ASSERT(!(flags & ERTS_PORT_SIG_FLG_FORCE));
            if (flags & ERTS_PORT_SIG_FLG_NOSUSPEND)
                ERTS_BIF_PREP_RET(res, am_false);
            else {
                erts_suspend(BIF_P, ERTS_PROC_LOCK_MAIN, prt);
                ERTS_BIF_PREP_YIELD3(res, bif_export[BIF_erts_internal_port_command_3],
                                     BIF_P, BIF_ARG_1, BIF_ARG_2, BIF_ARG_3);
            }
            break;
        case ERTS_PORT_OP_BUSY_SCHEDULED:
            ASSERT(!(flags & ERTS_PORT_SIG_FLG_FORCE));
            /* Fall through... */
        case ERTS_PORT_OP_SCHEDULED:
            ASSERT(is_internal_ref(ref));
            ERTS_BIF_PREP_RET(res, ref);
            break;
        case ERTS_PORT_OP_DONE:
            ERTS_BIF_PREP_RET(res, am_true);
            break;
        default:
            ERTS_INTERNAL_ERROR("Unexpected erts_port_output() result");
            break;
        }

        if (ERTS_PROC_IS_EXITING(BIF_P)) {
            KILL_CATCHES(BIF_P); /* Must exit */
            ERTS_BIF_PREP_ERROR(res, BIF_P, EXC_ERROR);
        }

        return res;
    }
From this code, we can see that if the nosuspend option is passed to erts_internal:port_command/3, the function does not get stuck: on a busy port it returns false instead of suspending the calling process.
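The possible return values in the C source above can be summarized as a small pure function. This is only a reading aid under the assumption that the listing is complete. The module name port_cmd_result and the result atoms done, busy, and scheduled are made up for illustration.

```erlang
-module(port_cmd_result).
-export([classify/1]).

%% Hypothetical helper: map a raw return value of
%% erts_internal:port_command/3, as read from the C source above,
%% to a descriptive atom.
classify(true) -> done;        % ERTS_PORT_OP_DONE: the command was executed
classify(false) -> busy;       % ERTS_PORT_OP_BUSY with nosuspend set
classify(Ref) when is_reference(Ref) ->
    scheduled;                 % a {Ref, Res} message is delivered later
classify(badarg) -> badarg;    % bad option list, unknown port, or dropped
classify(notsup) -> notsup.    % force requested but driver lacks soft-busy
```

Only the scheduled case leaves something behind in the caller's mailbox, which is why erlang:port_command/3 wraps it in a receive.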
Let's look back at broadcast messages in a game. They often do not need strong reliability guarantees: if a message occasionally fails to reach a player, that is still better than having my process hang. Therefore, when broadcasting, you can replace gen_tcp:send/2 with the lower-level erts_internal:port_command/3 and pass the nosuspend option, which keeps the broadcast running smoothly. However, calling erts_internal:port_command/3 directly may cause the process to receive messages such as {inet_reply, S, Status} and {Ref, Res}, and these need to be handled properly.
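A minimal sketch of a broadcast that skips busy sockets instead of suspending on them. It deliberately stays one layer higher than the proposal above: it uses the nosuspend variant of prim_inet:send/3 discussed earlier, not erts_internal:port_command/3. That way the inet_reply message is consumed by prim_inet itself, at the cost of the small remaining chance of waiting that was described earlier. Both functions are internal and undocumented, so their behavior may differ between OTP releases. The sketch also assumes classic port-based gen_tcp sockets. The module name bcast_sketch and the {Sent, Skipped} return shape are hypothetical.

```erlang
-module(bcast_sketch).
-export([broadcast/2]).

%% Hypothetical helper: send Packet to every socket in Sockets without
%% suspending on a busy port. Returns {Sent, Skipped}, where Skipped is a
%% list of {Socket, Reason} pairs for sockets that were busy or failed.
broadcast(Sockets, Packet) ->
    lists:foldl(
      fun(S, {Sent, Skipped}) ->
              %% prim_inet:send/3 is internal and undocumented (assumption:
              %% it behaves like the source listing shown earlier).
              case prim_inet:send(S, Packet, [nosuspend]) of
                  ok ->
                      {[S | Sent], Skipped};
                  {error, Reason} ->
                      %% Reason is busy when the port queue is full.
                      {Sent, [{S, Reason} | Skipped]}
              end
      end,
      {[], []},
      Sockets).
```

The caller can decide what to do with the skipped sockets, for example dropping the message for those players or retrying later.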
Erlang tidbits: when the broadcast gets stuck