1. Protocol
A. Overview of TCP/IP architecture
TCP/IP does not map exactly onto the seven-layer OSI reference model. The traditional Open Systems Interconnection (OSI) reference model is an abstract seven-layer model of communication protocols in which each layer performs a specific task, so that different kinds of hardware can communicate with each other at the same layer. The seven layers are: physical layer, data link layer, network layer, transport layer, session layer, presentation layer and application layer. TCP/IP uses a four-layer hierarchy instead. Each layer relies on the services provided by the layer below it to meet its own needs. The four layers are:
I. Application Layer: the layer for communication between applications, such as the Hypertext Transfer Protocol (HTTP), the Simple Mail Transfer Protocol (SMTP), the File Transfer Protocol (FTP), and the remote terminal protocol (Telnet).
Ii. Transport Layer: provides data transfer services between nodes, such as the Transmission Control Protocol (TCP) and the User Datagram Protocol (UDP). TCP and UDP wrap the data being sent into packets and hand them to the next layer down. This layer is responsible for transmitting data and, in the case of TCP, for confirming that the data has been delivered and received.
Iii. Internet Layer: provides the basic packet delivery function that lets each packet reach the target host (without checking whether it was received correctly), such as the Internet Protocol (IP).
Iv. Network Interface Layer: manages the actual network media and defines how data is transmitted over the physical network, such as Ethernet or a serial line.
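To make the layering a little more concrete, here is a minimal sketch of how data gets wrapped on its way down the four layers. It is illustrative only: real headers are binary structures, and the class name LayerEncapsulationSketch and the bracketed text format are my own assumptions, not part of any protocol.

```java
// Illustrative sketch: each layer wraps what it receives from the layer above.
// The text labels stand in for real (binary) protocol headers.
class LayerEncapsulationSketch {
    // The four TCP/IP layers from top to bottom, as listed above.
    static final String[] LAYERS = {"Application", "Transport", "Internet", "NetworkInterface"};

    // Wraps the payload once per layer, top to bottom,
    // so the lowest layer's label ends up outermost.
    static String encapsulate(String payload) {
        String unit = payload;
        for (String layer : LAYERS) {
            unit = "[" + layer + "|" + unit + "]";
        }
        return unit;
    }

    public static void main(String[] args) {
        System.out.println(encapsulate("GET / HTTP/1.1"));
    }
}
```

The receiving host does the reverse: each layer strips its own wrapper and passes the rest upward.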
B. HTTP Protocol Introduction:
I. HTTP is the HyperText Transfer Protocol, a set of rules for computers to communicate over the network. In the TCP/IP architecture, HTTP is an application layer protocol, so it sits at the top layer of the TCP/IP stack.
Ii. HTTP is a stateless protocol: the server does not retain information about a client between requests, and no persistent connection is required between the Web browser (the client) and the Web server. The whole process is: the client sends a request to the server, the Web server returns a response, and the connection is then closed without any connection information being kept on the server.
Iii. HTTP follows the request/response model: every interaction consists of a request and its response.
Iv. When browsing the Web, the browser exchanges information with the Web server over HTTP. Every file the Web server returns to the browser has a type, and the format of these type names is defined by MIME.
C. Java implementation of the protocol
For both TCP/IP and HTTP, Java uses sockets (java.net.Socket). For details, you can refer to my other technical blog post: a project to see java TCP/IP Socket programming (1.3)
2. HTTP message format and how the client and server interact
A. An HTTP transaction consists of the following four steps:
I. Establish a connection:
For example, if I enter http://cuishen.iteye.com in a browser, the client opens a socket to the Web server's HTTP port when it requests this address. The physical medium that carries the data across the network is the network cable, and the data is essentially read and written through I/O streams. This explains why we write import java.io.*; when writing a Servlet: we need the println() method of a PrintWriter object to send the result back to the client. The requested address also implicitly includes port 80, because 80 is the default port for HTTP and the browser uses it when no port is given.
In the underlying Java code this is implemented as follows, although the libraries already do it for us:
- Socket socket = new Socket("cuishen.iteye.com",80);
- InputStream in = socket.getInputStream();
- OutputStream out = socket.getOutputStream();
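As a small sketch of the default-port rule mentioned above, the following helper works out which host and port such a socket would connect to. It assumes a plain http:// URL; the class name DefaultPortSketch and the method effectivePort are hypothetical names of my own. It uses java.net.URI, whose getPort() returns -1 when the URL names no port.

```java
import java.net.URI;

// Sketch: decide which port the client socket should connect to.
class DefaultPortSketch {
    static final int HTTP_DEFAULT_PORT = 80;

    // Returns the port named in the URL, or 80 when none is given.
    static int effectivePort(String url) {
        int port = URI.create(url).getPort(); // -1 when the URL names no port
        return port == -1 ? HTTP_DEFAULT_PORT : port;
    }

    public static void main(String[] args) {
        String url = "http://cuishen.iteye.com";
        String host = URI.create(url).getHost();
        // A real client would now call: new Socket(host, effectivePort(url))
        System.out.println(host + ":" + effectivePort(url));
    }
}
```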
Ii. The client sends an HTTP request
Once the TCP connection is established, the Web browser sends a request to the Web server. The request is ASCII text: a request line, followed by zero or more HTTP headers, a blank line, and any data that accompanies the request.
The message is divided into four parts: request line, request headers, blank line and request data.
1) Request Line
A request line consists of three tokens separated by spaces: the request method, the request URL, and the HTTP version.
For example: GET /blog/242842 HTTP/1.1
The HTTP/1.1 specification defines eight request methods (the most common are GET and POST):
- GET -- a simple request to retrieve the resource identified by the URI
- HEAD -- same as GET, except that the server returns only the status line and headers, not the requested document
- POST -- a request for the server to accept the data the client writes into its output stream
- PUT -- a request for the server to store the request data as the new content of the specified URI
- DELETE -- a request for the server to delete the resource named by the URI
- OPTIONS -- a request for information about the request methods the server supports
- TRACE -- a request for the Web server to echo back the HTTP request and its headers
- CONNECT -- a method name reserved for proxies that can switch to acting as a tunnel
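The structure of the request line can be sketched as a small parser. This is a simplified illustration, not how any particular server does it; the class name RequestLineSketch and its parse method are hypothetical, and it only accepts the eight methods listed above.

```java
import java.util.Arrays;
import java.util.List;

// Sketch: split a request line into method, URL and version.
class RequestLineSketch {
    static final List<String> METHODS = Arrays.asList(
            "GET", "HEAD", "POST", "PUT", "DELETE", "OPTIONS", "TRACE", "CONNECT");

    // Returns {method, url, version}, or throws IllegalArgumentException
    // when the line does not have the expected shape.
    static String[] parse(String line) {
        String[] parts = line.split(" ");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected three tokens: " + line);
        }
        if (!METHODS.contains(parts[0])) {
            throw new IllegalArgumentException("unknown method: " + parts[0]);
        }
        if (!parts[2].startsWith("HTTP/")) {
            throw new IllegalArgumentException("bad version: " + parts[2]);
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("GET /blog/242842 HTTP/1.1")));
    }
}
```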
2) Request Header
Request headers consist of key: value pairs, one pair per line. They tell the server about the client's capabilities and identity.
Host -- the host name of the server being requested, for example my technical blog: cuishen.iteye.com
User-Agent -- identifies the client software the user is using, usually a browser, such as Mozilla/4.0
Accept -- a list of MIME types the client can accept, such as image/gif, text/html, and application/msword
Content-Length -- the size of the request data in bytes; applies only to requests that carry data, such as POST
3) Blank line
A carriage return and line feed (CRLF) are sent, telling the server that no more headers follow.
4) Request data
Requests that send data with POST most often use the Content-Type and Content-Length headers.
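The four parts described above can be assembled into one message roughly like this. It is a minimal sketch under my own assumptions: the class name RequestMessageSketch and the build method are hypothetical, the version is fixed to HTTP/1.1, and the form content type application/x-www-form-urlencoded is used for the key=value data.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: assemble request line, headers, blank line and data.
class RequestMessageSketch {
    static final String CRLF = "\r\n";

    static String build(String method, String path, Map<String, String> headers, String body) {
        StringBuilder sb = new StringBuilder();
        // 1) request line
        sb.append(method).append(' ').append(path).append(" HTTP/1.1").append(CRLF);
        // 2) request headers, one key: value pair per line
        for (Map.Entry<String, String> h : headers.entrySet()) {
            sb.append(h.getKey()).append(": ").append(h.getValue()).append(CRLF);
        }
        // 3) blank line marks the end of the headers
        sb.append(CRLF);
        // 4) request data (empty for a typical GET)
        sb.append(body);
        return sb.toString();
    }

    public static void main(String[] args) {
        String body = "key=value";
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Host", "cuishen.iteye.com");
        headers.put("Content-Type", "application/x-www-form-urlencoded");
        // Content-Length counts bytes, not characters.
        headers.put("Content-Length", String.valueOf(body.getBytes(StandardCharsets.UTF_8).length));
        System.out.print(build("POST", "/blog/242842", headers, body));
    }
}
```

A LinkedHashMap is used so the headers come out in the order they were added, which makes the output easier to compare with the examples in this post.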
Summary of request packets:
We can write a standard HTTP request as follows:
POST /blog/242842 HTTP/1.1
Host: cuishen.iteye.com
User-Agent: Mozilla/4.0
Accept: image/gif, text/html, application/pdf, image/png...

key=value&key=value...... (the POST request data)
The preceding example indicates:
The server I want to access is cuishen.iteye.com and the resource is /blog/242842, so the full address is cuishen.iteye.com/blog/242842. The request uses the HTTP/1.1 specification. My browser identifies itself as Mozilla/4.0, and the MIME formats it accepts include image/gif, text/html, application/pdf, image/png...
The MIME format of a response is set in a servlet with: response.setContentType("text/html; charset=gb2312");
or in a JSP with: <%@ page contentType="text/html; charset=gb2312" %>
or in HTML with: <meta http-equiv="Content-Type" content="text/html; charset=gb2312">
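A value such as text/html; charset=gb2312 carries two pieces of information, the MIME type and the character set. Here is a rough sketch of how a client might pull them apart. The class name ContentTypeSketch and its methods are hypothetical, and the sketch deliberately ignores quoted parameter values.

```java
import java.util.Locale;

// Sketch: split a Content-Type value into MIME type and charset.
class ContentTypeSketch {
    // Everything before the first ';' is the MIME type.
    static String mimeType(String contentType) {
        int semi = contentType.indexOf(';');
        return (semi < 0 ? contentType : contentType.substring(0, semi)).trim();
    }

    // Returns the charset parameter, or the fallback when none is present.
    static String charset(String contentType, String fallback) {
        for (String param : contentType.split(";")) {
            String p = param.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                return p.substring("charset=".length()).trim();
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        String value = "text/html; charset=gb2312";
        System.out.println(mimeType(value) + " / " + charset(value, "ISO-8859-1"));
    }
}
```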
The most intuitive difference between GET and POST is that GET appends the data to the requested URL, so the request line looks like this:
- GET /blog/242842?key=value&key=value&key=value...... HTTP/1.1
In practice, data sent with GET looks like this:
- http://cuishen.iteye.com/?page=2......
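A sketch of how the key=value pairs end up on the URL for a GET request follows. It uses java.net.URLEncoder so that special characters in keys and values are escaped; the class name QueryStringSketch and the method withQuery are hypothetical names of my own.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: append key=value pairs to a path as a GET query string.
class QueryStringSketch {
    static String withQuery(String path, Map<String, String> params) {
        StringBuilder sb = new StringBuilder(path);
        char separator = '?'; // first pair follows '?', later pairs follow '&'
        for (Map.Entry<String, String> e : params.entrySet()) {
            sb.append(separator).append(encode(e.getKey()))
              .append('=').append(encode(e.getValue()));
            separator = '&';
        }
        return sb.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on the Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("page", "2");
        System.out.println("GET " + withQuery("/", params) + " HTTP/1.1");
    }
}
```

With POST, the same encoded pairs would go into the request data after the blank line instead of onto the URL.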
Iii. The server handles the request, generates the result and returns the response
The Web server parses the request and locates the specified resource, http://cuishen.iteye.com/blog/242842
1) Depending on whether the method is GET or POST, the request is handled by the doGet() or doPost() method of the servlet. (This may involve business logic or validation, and data may be queried, submitted, and so on.) The input data comes from key=value&key=value......, and from other data encapsulated in the request object.
2) After the request is processed, the response object provides a java.io.PrintWriter output stream object, out. Calls to out.println() write the data to the output stream in the specified format, for example the one set by response.setContentType("text/html; charset=gb2312");
The response message is very similar to the request message. The difference is that the request line is replaced by a status line. Let's look at the response message:
3) A response message consists of four parts: status line, response headers, blank line, and response data:
(A). Status line:
A status line consists of three tokens: the HTTP version, the response code, and the response description.
HTTP/1.1 100 Continue // continue sending the rest of the request
HTTP/1.1 200 OK // everything is normal
HTTP/1.1 301 Moved Permanently // the requested document has moved elsewhere and the browser follows the new address automatically
HTTP/1.1 400 Bad Request // bad syntax in the client request
HTTP/1.1 403 Forbidden // access to this resource is refused, regardless of authorization
HTTP/1.1 404 Not Found // the most common one: the resource cannot be found
HTTP response code classes:
1xx: informational, the client should respond with some further action
2xx: success, the request succeeded
3xx: redirection, further action must be taken to complete the request
4xx: client error
5xx: server error
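Since the class of a response code is decided by its first digit, a client can classify any code with one integer division. A minimal sketch, with StatusClassSketch and describe as hypothetical names:

```java
// Sketch: map a response code to its class by its first digit.
class StatusClassSketch {
    static String describe(int code) {
        switch (code / 100) {
            case 1: return "informational";
            case 2: return "success";
            case 3: return "redirection";
            case 4: return "client error";
            case 5: return "server error";
            default: return "unknown";
        }
    }

    public static void main(String[] args) {
        int[] codes = {100, 200, 301, 404, 500};
        for (int code : codes) {
            System.out.println(code + " is a " + describe(code) + " code");
        }
    }
}
```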
(B). Response headers: like the request headers, they describe the server's capabilities and give details about the response data.
Date: Sat, 31 Dec 2005 23:59:59 GMT -- date and time the response was generated
Content-Type: text/html; charset=gb2312
Content-Length: 122 -- number of bytes in the response data; required only when the browser uses a persistent (Keep-Alive) HTTP connection
(C). Blank line: the last response header is followed by a blank line. A carriage return and line feed are sent, indicating that no more headers follow.
(D). Response data: HTML documents, images and so on, that is, the content itself. For example, the HTML written with out.println():
- <html>
- <head>
- <title>Welcome to cuishen's IT blog</title>
- </head>
- <body>
- <!-- Here is the specific content.
- I believe you know the working principle of HTTP and the interaction process between the client and the server.
- -->
- </body>
- </html>
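Putting the four parts of the response together can be sketched with nothing but the standard library. This is not what a servlet container actually does internally; the class name ResponseMessageSketch and the build method are hypothetical, and I assume UTF-8 for the body instead of gb2312 so the sketch runs on any JVM.

```java
import java.nio.charset.StandardCharsets;

// Sketch: assemble status line, headers, blank line and data as bytes.
class ResponseMessageSketch {
    static byte[] build(int code, String reason, String html) {
        // Response data; Content-Length must count bytes, not characters.
        byte[] body = html.getBytes(StandardCharsets.UTF_8);
        String head = "HTTP/1.1 " + code + " " + reason + "\r\n"   // status line
                + "Content-Type: text/html; charset=UTF-8\r\n"     // response headers
                + "Content-Length: " + body.length + "\r\n"
                + "\r\n";                                          // blank line
        byte[] headBytes = head.getBytes(StandardCharsets.US_ASCII);
        byte[] message = new byte[headBytes.length + body.length];
        System.arraycopy(headBytes, 0, message, 0, headBytes.length);
        System.arraycopy(body, 0, message, headBytes.length, body.length);
        return message;
    }

    public static void main(String[] args) {
        byte[] message = build(200, "OK", "<html><body>Welcome</body></html>");
        System.out.print(new String(message, StandardCharsets.UTF_8));
    }
}
```

A real server would write these bytes to the socket's OutputStream.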
Iv. The server closes the connection; the client parses the returned response message and renders the page
1) The browser first parses the status line to check whether the request succeeded, using the HTTP response code: 404, 400, 200, ...
2) It parses each response header, such as:
Content-Type: text/html; charset=gb2312
Content-Length: 122 -- the number of bytes in the response data; required only when the browser uses a persistent (Keep-Alive) HTTP connection
3) It reads the response data (the HTML) and renders a standard HTML page or other page according to the tags in the content.
4) An HTML document may reference other resources that need to be loaded. The browser identifies them and makes additional requests for them. This process repeats until all the data has been rendered on the page in the format specified by the response headers.
5) After the data transfer is complete, the server closes the connection and keeps no information about it, which is why HTTP is called a stateless protocol.
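The first three parsing steps above can be sketched as follows. This is a simplified reader, not a browser implementation: the class name ResponseParseSketch and its members are hypothetical, it assumes the response carries a Content-Length header, and it decodes the body as UTF-8. In real use the InputStream would come from socket.getInputStream().

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;

// Sketch: parse status line, headers and body from a response stream.
class ResponseParseSketch {
    int statusCode;
    // Header names are case-insensitive, so use a case-insensitive map.
    final Map<String, String> headers = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
    String body = "";

    // Reads one line of text up to the line feed, dropping the CR and LF.
    static String readLine(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int b;
        while ((b = in.read()) != -1 && b != '\n') {
            if (b != '\r') sb.append((char) b);
        }
        return sb.toString();
    }

    static ResponseParseSketch parse(InputStream in) throws IOException {
        ResponseParseSketch r = new ResponseParseSketch();
        // 1) status line: version, code, description
        r.statusCode = Integer.parseInt(readLine(in).split(" ", 3)[1]);
        // 2) headers, until the blank line
        String line;
        while (!(line = readLine(in)).isEmpty()) {
            int colon = line.indexOf(':');
            if (colon < 0) continue; // skip malformed lines
            r.headers.put(line.substring(0, colon).trim(), line.substring(colon + 1).trim());
        }
        // 3) response data: exactly Content-Length bytes
        String declared = r.headers.get("Content-Length");
        int length = declared == null ? 0 : Integer.parseInt(declared);
        byte[] data = new byte[length];
        int off = 0;
        while (off < length) {
            int n = in.read(data, off, length - off);
            if (n == -1) break; // stream ended early
            off += n;
        }
        r.body = new String(data, 0, off, StandardCharsets.UTF_8);
        return r;
    }

    // Convenience overload for an in-memory message.
    static ResponseParseSketch parse(byte[] raw) {
        try {
            return parse(new ByteArrayInputStream(raw));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String raw = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\nContent-Length: 5\r\n\r\nhello";
        ResponseParseSketch r = parse(raw.getBytes(StandardCharsets.UTF_8));
        System.out.println(r.statusCode + " " + r.headers + " " + r.body);
    }
}
```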
3. Summary
Do not be intimidated by the terminology and theory. The interaction between an HTTP client and server is simple: the browser opens a short-lived, stateless socket connection to the server, the two sides exchange messages over I/O streams in strict accordance with the HTTP message format, and the connection is closed when the exchange ends. Java and the browser already encapsulate these underlying protocols and the packing and unpacking of messages, so programmers only need to focus on implementing the business logic and do not have to worry about the rest!