Socket-based chat room implementation principle

Source: Internet
Author: User

Socket-based chat rooms are still rare; well-known domestic examples include the NetEase and Bihai Yinsha chat rooms. This kind of chat room has a distinctive behavior. Unlike a CGI chat room, which refreshes at regular intervals whether or not anyone has spoken, it displays new content only when someone speaks, and the chat text keeps scrolling upward. If you watch the browser's status bar, the progress bar shows that the page is always still downloading. Such a chat room can hold many users without a noticeable drop in performance; for example, NetEase chat rooms often have hundreds of people chatting on one server. The difference from a CGI chat room is that the client browser does not request the chat content periodically; instead, the chat server software pushes messages to the client browser.
The basic principle of the socket chat room is to imitate the response of a WWW server: after receiving the browser's request, it sends the chat content back to the browser as HTML. From the browser's point of view, it stays connected as if it were loading one huge page. The chat server is in effect a dedicated, simplified WWW server.
Compared with CGI, the advantages of a socket chat room are clear:
1. No separate WWW server is required. All the necessary work is done in the chat server, which avoids the time-consuming CGI process.
2. With a single-process server, no new process has to be created for each request.
3. Data is exchanged entirely in memory, with no file reads or writes.
4. No periodic refresh is needed, which reduces screen flicker and the number of requests sent to the server.

Before discussing the specific process, let's first discuss some related technologies:

HTTP request and Response Process
HTTP is the standard protocol for communication between browsers and WWW servers. As a simplified WWW server, the socket chat server should follow this protocol, although only a small part of it actually needs to be implemented.
HTTP uses the client-server model, in which the browser is the HTTP client. When you browse a page, the browser opens a connection and sends a request to the WWW server. The server sends back a response for the requested resource and then closes the connection. Requests and responses must follow a fixed format. As long as the server receives requests and sends responses in this format, the browser can be "fooled" into believing that it is talking to a WWW server.
Requests and responses have a similar structure:
· An initial line
· Zero or more header lines
· A blank line
· An optional message body
Let's look at the request sent by a browser.
When we browse the page /path/file.html on www.somehost.com, the browser sends:
GET /path/file.html HTTP/1.0
From: someuser@somehost.com
User-Agent: Mozilla/4.0 (compatible; MSIE 5.0; Windows NT 4.0; DigExt)
[blank line]

The first line, GET /path/file.html HTTP/1.0, is the part we mainly need to process. It consists of three parts separated by spaces: the method (GET), the requested resource (/path/file.html), and the HTTP version (HTTP/1.0).
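The three-part split described above can be sketched in a few lines of Perl. This is a minimal illustration, not the original program's code: the function name parse_request_line is hypothetical, and the sketch assumes a request line that always carries an HTTP version.

```perl
use strict;
use warnings;

# Split an HTTP request line into method, resource, and version.
# Returns an empty list if the line is malformed.
sub parse_request_line {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;    # drop the line terminator
    my ($method, $resource, $version) = split / +/, $line;
    return unless defined $version && $version =~ m{\AHTTP/\d+\.\d+\z};
    return ($method, $resource, $version);
}

my ($method, $resource, $version) =
    parse_request_line("GET /path/file.html HTTP/1.0\r\n");
print "$method $resource $version\n";
```

For a chat server only the method and the resource matter; the header lines that follow can be read and discarded up to the blank line.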

The server will respond with the following information through the same socket:
HTTP/1.0 200 OK
Date: Fri, 31 Dec 1999 23:59:59 GMT
Content-Type: text/html
Content-Length: 1354

<html>
<body>
<h1>Hello world!</h1>
(Other content)
.
.
.
</body>
</html>
The first line also contains three parts: the HTTP version, the status code, and a description of the status code. Status code 200 indicates that the request succeeded.
After sending the response, the server closes the socket.
The above process can be reproduced by hand with telnet www.somehost.com 80.
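As a rough sketch of the response format shown above, the following Perl builds a complete response string. The function name build_response is an assumption made for illustration, and the Content-Length calculation assumes an ASCII body (one byte per character).

```perl
use strict;
use warnings;

# Build a complete HTTP/1.0 response for an HTML body.
# Header lines end in CRLF, and a blank line separates headers from the body.
sub build_response {
    my ($body) = @_;
    return join "\r\n",
        "HTTP/1.0 200 OK",
        "Content-Type: text/html",
        "Content-Length: " . length($body),   # assumes an ASCII body
        "",
        $body;
}

my $response = build_response("<html><body><h1>Hello world!</h1></body></html>");
print $response, "\n";
```

A chat frame that never finishes loading cannot state its length in advance, so a streaming response would leave out the Content-Length header.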

Cyclic server and Concurrent Server
A cyclic (iterative) server processes only one request at a time. When several requests arrive at once, they wait in a request queue.
A concurrent server creates a new process to handle each request as it arrives.
The concurrent server has an inherent advantage in response time, and a communication deadlock between one user and the server does not affect the other processes. However, the processes must communicate with each other to exchange information, and the cost of forking new processes grows with the number of users, so a concurrent server is not always the best choice.
Although a cyclic server appears to introduce latency, handling each request in a system such as a chat room takes very little time, so clients see much the same responsiveness as with a concurrent server. Because it is a single process, it needs no inter-process communication and no forking, is simple to program, and consumes very few system resources. The drawback is that a deadlock between one client and the server deadlocks the whole system.

POST and GET
Form data is commonly submitted in one of two ways: POST and GET. Because POST places no limit on length, it is used for most form submissions. GET sends the submitted data in the URL. Because URL length is limited (1024 bytes is the figure assumed here), GET cannot be used when the submitted data is long. Chat messages are limited in length and never get very long, and ordinary page requests already use GET, so submitting the form with GET simplifies the processing.
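Since GET carries the form fields in the URL, the server has to split the requested resource into a path and its parameters. The following is a minimal sketch of that step; the function names url_decode and parse_resource and the sample parameter values are illustrative assumptions, not part of the original program.

```perl
use strict;
use warnings;

# Decode one URL-encoded component: '+' means space, %XX is a byte in hex.
sub url_decode {
    my ($text) = @_;
    $text =~ tr/+/ /;
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $text;
}

# Split a resource such as "/dotalk?sid=abc&message=Hello" into
# its path and a hash reference of decoded parameters.
sub parse_resource {
    my ($resource) = @_;
    my ($path, $query) = split /\?/, $resource, 2;
    my %param;
    for my $pair (split /&/, defined $query ? $query : '') {
        my ($key, $value) = split /=/, $pair, 2;
        next unless defined $key && length $key;
        $param{ url_decode($key) } = url_decode(defined $value ? $value : '');
    }
    return ($path, \%param);
}

my ($path, $param) = parse_resource('/dotalk?sid=abc123&message=Hello+world%21');
print "$path: $param->{message}\n";
```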

Use the Perl module to implement socket communication
This section assumes that you have some knowledge of socket programming. If you have done socket programming in C, socket programming in Perl is easy to understand. If you are not familiar with sockets, see the socket references listed at the end of this article.
A socket program in Perl can use either Socket or IO::Socket. The former resembles the C interface, while the latter wraps sockets in objects, which makes programs easier to write and maintain.

To provide concurrent service with a single-process cyclic server, the basic idea is to let many clients hold open socket connections to the server at once. The server monitors the sockets to find out which ones have data waiting and then processes those connections. The key question is how the server learns that data has arrived. If you know C socket programming, you know that the select system call does this. In Perl, calling select directly is awkward because of the bit operations involved, but the IO::Select module makes it easy.
Here is the example from the IO::Select documentation:
use IO::Select;
use IO::Socket;

$lsn = IO::Socket::INET->new(Listen => 1, LocalPort => 8080);
# Create a socket and listen on port 8080; equivalent to calling the
# system functions socket(), bind(), and listen()

$sel = IO::Select->new($lsn);
# Create a select object and add the listening socket to it

while (@ready = $sel->can_read) {    # process each readable socket
    foreach $fh (@ready) {
        if ($fh == $lsn) {
            # The listening socket is readable: a new connection has arrived.
            # Accept it and add the new socket to the select object
            $new = $lsn->accept;
            $sel->add($new);
        }
        else {
            # For any other socket, read and process the data
            ......
            # When processing is complete, remove the socket from the
            # select object and close it
            $sel->remove($fh);
            $fh->close;
        }
    }
}
Basic IO::Socket operations:
Create a socket object: $socket = IO::Socket::INET->new();
Accept a client connection request: $new_socket = $socket->accept;
Send data through a socket: $socket->send($message);
Receive data from a socket: $socket->recv($buf, $length);
Close a socket connection: $socket->close;
Check whether a socket is open: $socket->opened;

Basic IO::Select operations:
Create a select object: $select = IO::Select->new();
Add a socket to the select object: $select->add($new_socket);
Remove a socket from the select object: $select->remove($old_socket);
Find the readable sockets in the select object: @readable = $select->can_read;
List all sockets in the select object: @sockets = $select->handles;

Daemon Implementation Method
To run as a background process (daemon), a program must perform a series of tasks:
· Close all file descriptors
· Change the current working directory
· Reset the file creation mask (umask)
· Run in the background
· Leave its process group
· Ignore terminal I/O signals
· Detach from the controlling terminal
The Proc::Daemon Perl module does all of this in two lines:
use Proc::Daemon;
Proc::Daemon::Init;

PIPE Signal Handling
If the client closes the socket and the server keeps sending data, the server receives a PIPE signal. If the signal is not handled, the server terminates unexpectedly. To avoid this, the signal must be handled; usually it is enough to ignore it:
$SIG{'PIPE'} = 'IGNORE';
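Once the signal is ignored, a write to a vanished client fails with an error instead of killing the process, so the server can notice and drop that client. The sketch below illustrates this on a Unix-like system; it uses a pipe with a closed reading end as a stand-in for a socket whose browser has gone away, which is an assumption made purely to keep the example self-contained.

```perl
use strict;
use warnings;
use Errno qw(EPIPE);

$SIG{PIPE} = 'IGNORE';    # without this, the write below would kill the process

# A pipe whose reading end is closed stands in for a socket whose
# peer (the browser) has gone away.
pipe(my $reader, my $writer) or die "pipe failed: $!";
close $reader;

# With SIGPIPE ignored, the failed write returns undef and sets $! to EPIPE.
my $written = syswrite($writer, "hello\n");
my $broken  = !defined $written && $! == EPIPE;
print "peer is gone; drop this client\n" if $broken;
```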

Exception handling
Errors can occur during socket communication. If data is sent without any error handling, the program may exit unexpectedly. Perl's eval function can be used to trap such errors. For example:
if (!defined(eval { operation statement; })) {
    handle errors;
}
This way, when a statement inside the eval fails, for example by calling die, only the eval block is aborted and the main program keeps running.
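One caveat with testing the eval result using defined: an operation that succeeds but returns undef looks like a failure. A common Perl idiom avoids this by ending the eval block with a true value. The sketch below shows that variant; the helper name try_operation and the sample error message are hypothetical.

```perl
use strict;
use warnings;

# Run a code reference; return 1 on success, 0 if it died.
# The trailing "1" makes the eval block true even when the
# operation itself returns undef or 0.
sub try_operation {
    my ($operation, $on_error) = @_;
    my $ok = eval { $operation->(); 1 };
    return 1 if $ok;
    $on_error->($@);    # $@ holds the message passed to die
    return 0;
}

my @errors;
try_operation(sub { die "socket write failed\n" }, sub { push @errors, $_[0] });
try_operation(sub { undef },                       sub { push @errors, $_[0] });
print "caught: $_" for @errors;
```

Only the first call records an error; the second operation returns undef without dying, so it counts as a success.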

User disconnection judgment and handling
In many cases users leave the chat room without clicking the "leave" button, so the server must detect that the user has disconnected. It works like this: when the user closes the browser, clicks the browser's Stop button, or navigates to another page, the corresponding socket becomes readable, and reading from it returns an empty string.
So whenever a socket is reported as readable but the read returns no data, we can conclude that the user on that socket has disconnected.
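The readable-but-empty condition can be demonstrated without a browser. In this sketch, which assumes a Unix-like system, a socketpair stands in for the browser and server ends of one connection; closing the browser end makes the server end readable, and sysread then returns 0.

```perl
use strict;
use warnings;
use Socket;
use IO::Select;

# A connected socket pair stands in for browser <-> server.
socketpair(my $server_side, my $browser_side, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair failed: $!";

close $browser_side;    # the "browser" goes away

my $select       = IO::Select->new($server_side);
my $disconnected = 0;
for my $fh ($select->can_read(1)) {
    my $bytes = sysread($fh, my $buffer, 1024);
    if (defined $bytes && $bytes == 0) {    # readable, but nothing to read: EOF
        $disconnected = 1;
        $select->remove($fh);
        close $fh;
    }
}
print "client disconnected\n" if $disconnected;
```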

Prevent user disconnection
If the browser receives no data for some time, it reports a timeout error. To avoid this, the server must send some data at regular intervals. In our application, HTML comments serve this purpose, and they can be sent at the same time as the online list is refreshed.
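One possible way to time the keep-alive is the timeout argument of can_read, which returns an empty list when nothing arrives within the interval. The sketch below is an assumption about how this could be wired up, not the original program's mechanism; the interval is shortened so the example finishes quickly, and the comment text is arbitrary.

```perl
use strict;
use warnings;
use Socket;
use IO::Select;

# A connected socket pair stands in for browser <-> server.
socketpair(my $server_side, my $browser_side, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair failed: $!";

my $select   = IO::Select->new($server_side);
my $interval = 0.2;    # seconds; a real server would use a much longer interval

# can_read returns an empty list when the timeout expires with no input.
my @ready = $select->can_read($interval);
if (!@ready) {
    # Nothing happened during the interval: send a comment the browser ignores.
    syswrite($server_side, "<!-- keepalive -->\n");
}

sysread($browser_side, my $received, 1024);
print $received;
```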

Let's look at the implementation in detail.
Chat Server implementation process
· Server side
The following NS (Nassi-Shneiderman) chart shows the server program flow:
[Figure: NS chart of the server program flow]

The "process user input" step in the chart can be subdivided:
[Figure: NS chart of the "process user input" step]

User input is passed in the URL. The URLs below are examples; together with the client-side flow described later, they help clarify the structure of the system.
They are the series of requests made when a chat user whose username and password are both 'aaa' logs in, says "hello", and then leaves. The password is encrypted with the system function crypt:
/login?name=aaa&passwd=pjhiieleipsee
/chat?sid=zuyphh3twhenksicnjov&passwd=pjhiieleipsee
/talk?sid=zuyphh3twhenksicnjov&passwd=pjhiieleipsee
/names?sid=zuyphh3twhenksicnjov
/dotalk?sid=zuyphh3twhenksicnjov&passwd=pjhiieleipsee&message=Hello
/leave?sid=zuyphh3twhenksicnjov&passwd=pjhiieleipsee
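Because each request is identified by its path, the "process user input" step can be organized as a table that maps paths to handler routines. The following is only a sketch of that idea: the handler bodies are hypothetical placeholders that return strings, whereas a real server would write HTML to the client's socket.

```perl
use strict;
use warnings;

# Hypothetical handlers; each receives the decoded parameters as a hash ref.
my %handler = (
    '/login'  => sub { "login as $_[0]{name}" },
    '/dotalk' => sub { "session $_[0]{sid} says: $_[0]{message}" },
    '/leave'  => sub { "session $_[0]{sid} leaves" },
);

# Look up the handler for a path; unknown paths get a fallback result.
sub dispatch {
    my ($path, $param) = @_;
    my $code = $handler{$path} or return "unknown request: $path";
    return $code->($param);
}

print dispatch('/dotalk', { sid => 'abc123', message => 'Hello' }), "\n";
print dispatch('/bogus',  {}), "\n";
```

Adding a new request type then only requires a new table entry, and the main loop stays unchanged.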

The above is the server program flow. Next we look at the login process from the client side.
First, the chat interface:
[Figure: chat interface]
The chat interface consists of three frames. The chat frame displays the chat content. The talk frame takes user input: chat text, actions, filtering, and management functions are all entered there. The names frame displays the online list and is refreshed at regular intervals.

Let's follow the process of entering the chat room from the browser's point of view.
· The browser requests the page
http://host:9148/login?name=NAME&passwd=PWD
This opens a socket connection to the chat port on the server and sends one line of data:
GET /login?name=NAME&passwd=PWD HTTP/1.1
· The server generates a session ID, verifies the password, and sends back:
HTTP/1.1 200 OK
Content-Type: text/html

......
[HTML markup not preserved in the source]
......

The server then closes the socket connection.

· After receiving this HTML file, the browser opens three connections in turn ($sid and $encrypt_pass are variables):
/chat?sid=$sid&passwd=$encrypt_pass
/talk?sid=$sid&passwd=$encrypt_pass
/names?sid=$sid
The first of the three, the chat connection, stays open for the whole chat session. To the browser it is one huge page that never finishes downloading, so the chat content scrolls upward continuously instead of being refreshed. If you view the HTML source, you can see that the opening tags are followed only by the ever-growing chat content, with no closing tags.
The other two connections are closed as soon as their pages have been sent.
So entering the chat room involves four socket connections in all, but once login is complete only the chat frame's socket remains connected, and it is used to receive chat messages from the server. This is the key to the socket chat room.
The server stores the chat socket of every chat client. When someone speaks, the server sends the chat content to all of the chat sockets.
The HTML of the talk and names frames is the same as an ordinary form page.
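The broadcast step, sending one line of chat to every stored chat socket, can be sketched as follows. The function name broadcast, the session IDs, and the HTML line are illustrative assumptions; socket pairs stand in for real browser connections, so the example assumes a Unix-like system.

```perl
use strict;
use warnings;
use Socket;

# Write one chat line to every stored chat socket.
# Returns the session IDs whose socket could not be written.
sub broadcast {
    my ($chat_sockets, $html_line) = @_;
    my @failed;
    for my $sid (sort keys %$chat_sockets) {
        my $written = syswrite($chat_sockets->{$sid}, $html_line);
        push @failed, $sid unless defined $written;
    }
    return @failed;
}

# Two socket pairs stand in for two browsers' chat-frame connections.
my (%chat_sockets, %browser);
for my $sid (qw(sid_a sid_b)) {
    socketpair(my $server_side, my $browser_side, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
        or die "socketpair failed: $!";
    $chat_sockets{$sid} = $server_side;
    $browser{$sid}      = $browser_side;
}

my @failed = broadcast(\%chat_sockets, "aaa: Hello<br>\n");
sysread($browser{sid_b}, my $seen, 1024);
print $seen;
```

The list of failed session IDs gives the caller a natural place to clean up users whose browsers have gone away.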

· After a user logs in, the server keeps a table of user information.
In the Perl implementation, a hash stores this information, keyed by session ID. This structure makes data access and space reclamation easy. Each user's record is an array:
[socket, name, passwd, privilige, filter, login_time, color]
socket: the socket connection of the chat frame
name: user name
passwd: password
privilige: permission level
filter: reference to the user's filter list
login_time: login time, recorded so that timed-out connections can be cleared later
color: the user's chat color
Most of this user data is filled in at the login stage, after the user passes password verification. Only the chat socket is obtained later, once the chat frame is displayed. If the socket has still not been filled in after a certain period, the connection was broken after the browser fetched the main frameset, and the user's record must be deleted.
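The table and its cleanup rule can be sketched as a hash of array references. This is an illustration under stated assumptions: the constant names, the purge_stale function, the sample records, and the 60-second limit are all hypothetical, and the string 'SOCK' merely stands in for a real socket handle.

```perl
use strict;
use warnings;

# Field positions in each user record, following the layout in the text.
use constant { SOCKET => 0, NAME => 1, PASSWD => 2, PRIVILEGE => 3,
               FILTER => 4, LOGIN_TIME => 5, COLOR => 6 };

my %users;    # session ID => array reference

# Delete records that never received a chat socket within $limit seconds.
sub purge_stale {
    my ($users, $now, $limit) = @_;
    my @stale = grep {
        !defined($users->{$_}[SOCKET])
            && ($now - $users->{$_}[LOGIN_TIME]) > $limit
    } keys %$users;
    delete @{$users}{@stale};
    my @sorted = sort @stale;
    return @sorted;
}

$users{sid_a} = [undef,  'aaa', 'x', 0, [], 1000, 'black'];  # no chat socket yet
$users{sid_b} = ['SOCK', 'bbb', 'y', 0, [], 1000, 'blue'];   # placeholder socket
my @removed = purge_stale(\%users, 1100, 60);
print "removed: @removed\n";
```

Keying the hash by session ID means deleting a user is a single delete call, which is the space reclamation benefit the text mentions.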

The above is the core of the chat room. Other parts, such as user registration and password changes, can reuse the code of a CGI chat room.

Areas for improvement
The system currently provides basic chat functions such as chat, whisper, and action, plus additional functions such as filtering by user name. The management functions cover kicking users, looking up IP addresses, and appointing room owners. Areas to improve in the future include:
Stability: the chat room has not yet been tested with a large number of users, so its stability cannot be fully guaranteed. Because it is a single-process cyclic server, one user's communication deadlock blocks everyone. A concurrent multi-process server would improve stability, but it would also consume far more server resources.
Features: user-created chat rooms and similar functions are not yet finished. These peripheral functions can be added easily once stability is assured.

[References]
1. The original structure of the chat room described in this article comes from Entropy Chat 2.0 (http://missinglink.darkorb.net/pub/entropychat/). Without its inspiration this system would have been hard to complete; many thanks to its authors for their hard work. If you would like to improve this program, you can download the source code at http://tucows.qz.fj.cn/chat.

2. For the basic HTTP interaction process, see HTTP Made Really Easy (http://www.jmarshall.com/easy/http/) and RFC 1945: Hypertext Transfer Protocol -- HTTP/1.0.

3. The Perl modules mentioned in this article can all be found at http://tucows.qz.fj.cn; use the search function at the top of the page.
IO::Socket and IO::Select are part of the standard Perl distribution. They can also be obtained by installing IO-1.20.tar.gz.
Proc::Daemon must be installed separately; the file is Proc-Daemon-0.02.tar.gz.
The version numbers of these modules may differ. When searching, a keyword such as "daemon" is enough.

4. To speed up development, part of the program's interface was modeled on the NetEase chat room (http://chat.163.net/), and many of the program's ideas also come from their work.

5. How to Write a Chat Server is a good reference:
http://hotwired.lycos.com/webmonkey/97/18/index2a.html

6. To try out the chat room, go to http://tucows.qz.fj.cn/chat

7. Socket programming references
· UNIX Socket FAQ (http://www.ntua.gr/sock-faq)
· Beej's Guide to Network Programming (http://www.ecst.csuchico.edu/~beej/guide/net/)

 
