The socket server used in the cloud platform is a socket service for which we defined our own communication protocol and which we implemented in C#.
The service currently runs inside the IIS container together with the web service and listens on its port from a new thread that never exits.
In the early stages of development, the listener thread sometimes exited unexpectedly because exceptions raised while handling messages were not caught, for example a malformed message sent by a client or an attempt to close a connection that had already been released.
As the system was used, these problems were fixed one by one and the socket service became much more stable. However, after running for more than a week, the socket service would still occasionally go down, and the system log showed no exceptions.
Some research on IIS showed that it has an automatic process recycling mechanism. To keep the server performing well, IIS recycles the worker process from time to time, and everything held in that process's memory (sessions, caches, and running threads) is lost when this happens. If you use IIS as the server and need sessions, caches, and similar resources to stay available for a long time, store them in a database or on another server.
After the process is recycled, IIS starts a new worker process and listens again on the ports of the sites deployed in it, but threads that the application itself started before the recycle are not restarted.
One solution suggested online is to change the application pool settings in IIS 7:
Recycling: Regular Time Interval (minutes) changed to 0
Recycling: Virtual Memory Limit (KB) and Private Memory Limit (KB) changed to 0
Process Model: Idle Time-out (minutes) changed to 0
This approach disables process recycling in IIS, but it can degrade the performance of a long-running server. Also, after trying this configuration many times, we found that IIS still recycled the process after running for a long time.
Since IIS listens again on the ports of its sites after the process is recycled, the web service itself remains reachable. We can therefore run a separate service that checks whether the socket server thread is running normally and restarts the socket service if it is not. This service must run outside IIS.
The specific approach is:
1. The web service provides an interface that returns the status of the socket server thread:
/socketserver.ashx?action=getthreadstatus
2. The web service also provides an interface that restarts the socket service:
/socketserver.ashx?action=startsocketserver
3. A service started outside IIS calls the status interface at a fixed interval (the script below polls every 20 seconds) and, if the status is not normal, calls the interface that restarts the socket service.
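The decision the external service makes on each poll can be sketched as a small pure function. This is only an illustration: `needsRestart` is a hypothetical name, the `socketserver` field and the values 'Hung' and 'Stopped' are taken from the monitoring script in this article, and ignoring an unreadable response until the next poll is an assumption rather than documented behavior.

```javascript
// Hypothetical helper: decides from the body returned by the status
// interface whether the restart interface should be called.
// Assumption: the body is JSON with a `socketserver` field, as read by
// the monitoring script in this article.
function needsRestart(body) {
    var status;
    try {
        status = JSON.parse(body);
    } catch (e) {
        // Unreadable response (for example an HTML error page):
        // do nothing now and let the next poll decide.
        return false;
    }
    if (status === null || typeof status !== 'object') {
        return false;
    }
    return status.socketserver === 'Hung' || status.socketserver === 'Stopped';
}
```

Keeping the decision separate from the HTTP calls makes it easy to check the rule on its own, for instance that `needsRestart('{"socketserver":"Hung"}')` asks for a restart while a malformed body does not.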
The current practice is to start a Node.js service:
// This service monitors the socket service process of the cloud platform.
// If the process crashes or restarts, it restarts the socket service,
// the WS service, and task timeout detection.
var http = require('http');
var moment = require('moment');

// var host = "http://xxx";   // local debugging
var host = "http://xxxxxx";   // intranet service
// var host = "http://xxxx";  // public network service
var statusCheck = "XXX";
var startSocket = "XXX";
var startWS = "XXX";
var taskTimeout = "XXX";
var interval;

// Timestamp prefix for log lines
function now() {
    return moment().format('YYYY-MM-DD HH:mm:ss');
}

function start() {
    interval = setInterval(checkStatus, 20000);
}

function end() {
    clearInterval(interval);
}

start();

function checkStatus() {
    http.get(host + statusCheck, function (res) {
        var body = '';
        // Collect the whole response: a single 'data' chunk may hold only part of the JSON
        res.on('data', function (chunk) { body += chunk; });
        res.on('end', function () {
            try {
                var socketStatus = JSON.parse(body);
                if (socketStatus.socketserver == 'Hung' || socketStatus.socketserver == 'Stopped') {
                    console.log(now() + " socket service unavailable, restarting");
                    // Restart service
                    restartService();
                }
            } catch (e) {
                console.log(now() + " error: " + e.message);
            }
        });
    }).on('error', function (e) {
        console.log(now() + " error: " + e.message);
    });
}

function restartService() {
    // end();
    var status = { startSocket: false, startWS: false, taskTimeout: false };

    function statusCode(code, name) {
        if (code == 200) {
            status[name] = true;
        }
        if (status.startSocket && status.startWS && status.taskTimeout) {
            // start();
        }
    }

    // Call one restart interface and log the HTTP status code it returns
    function restart(path, name, label) {
        http.get(host + path, function (res) {
            statusCode(res.statusCode, name);
            console.log(now() + " " + label + " " + res.statusCode);
            res.resume();
        }).on('error', function (e) {
            // Without this handler a failed request would crash the monitor
            console.log(now() + " error: " + e.message);
        });
    }

    restart(startSocket, 'startSocket', "Restart SocketServer");
    restart(startWS, 'startWS', "Restart WSServer");
    restart(taskTimeout, 'taskTimeout', "Restart task status monitoring");
}
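The commented-out end() and start() calls in the script suggest that polling was meant to pause while a restart is in flight, so that a second poll does not trigger another restart before the first one finishes. A minimal sketch of that idea follows. It is not the author's implementation: `createRestartTracker` and its methods are hypothetical names, and the three step names are only illustrative.

```javascript
// Hypothetical tracker that allows only one restart at a time.
// `names` lists the restart steps that must report back before a new
// restart may begin (for example 'startSocket', 'startWS', 'taskTimeout').
function createRestartTracker(names) {
    var pending = {};
    var inProgress = false;
    return {
        // Returns true if a new restart may start, false if one is running.
        begin: function () {
            if (inProgress) {
                return false;
            }
            names.forEach(function (name) { pending[name] = true; });
            inProgress = names.length > 0;
            return true;
        },
        // Marks one step as finished; returns true once every step is done.
        done: function (name) {
            delete pending[name];
            if (Object.keys(pending).length === 0) {
                inProgress = false;
            }
            return !inProgress;
        },
        isRestarting: function () {
            return inProgress;
        }
    };
}
```

In a monitor like the one above, the restart function would call begin() first and return if it yields false, and every restart request would call done(name) in both its response callback and its error handler. If a failed request never reports back, the tracker stays in the restarting state, so a real implementation would probably also need a timeout.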
This approach currently has two drawbacks:
1. Each time the IIS process is recycled, the socket service is unavailable for a few seconds.
2. The socket service runs inside the web server, which makes it hard to scale out either the web server or the socket server in the future: a device connected to server A cannot be reached from server B.
The future direction of improvement is:
Separate the socket server from the web server and redesign the communication between the two.
This makes the socket service independent of the IIS configuration and allows the web server and the socket server to be scaled separately.
Ensure stable operation of the socket server by monitoring the thread status.
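The article does not say how the thread status is determined. One common way to monitor a long-running loop is a heartbeat: the loop records a timestamp each time it wakes up, and the monitor compares that timestamp with the current time. The sketch below only illustrates the idea and rests on several assumptions: the listener loop wakes up periodically (for example through an accept timeout), `classifyThread` is a hypothetical name, and 'Running' is an assumed label alongside the 'Hung' and 'Stopped' values used by the monitoring script.

```javascript
// Hypothetical heartbeat check.
// lastHeartbeatMs: time of the last recorded heartbeat in milliseconds,
//                  or null/undefined if the loop never reported.
// nowMs:           current time in milliseconds.
// maxSilenceMs:    longest gap between heartbeats still considered healthy.
function classifyThread(lastHeartbeatMs, nowMs, maxSilenceMs) {
    if (lastHeartbeatMs === null || lastHeartbeatMs === undefined) {
        return 'Stopped'; // the loop has never reported
    }
    if (nowMs - lastHeartbeatMs > maxSilenceMs) {
        return 'Hung';    // the loop reported once but has gone silent
    }
    return 'Running';
}
```

Passing the current time in as a parameter, rather than reading the clock inside the function, keeps the check deterministic and easy to test.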