Web and network basics, HTTP messages and how they work, and HTTP communication in Java
I. Web and network basics
1. HTTP history
1.1 Concept of HTTP:
HTTP (HyperText Transfer Protocol) is a communication protocol used to transfer Hypertext Markup Language (HTML) documents from a Web server to a client browser.
It is an application-layer protocol carried over TCP and follows the standard client-server model of requests and responses.
1.2 HTTP development history:
HTTP, the protocol used to transfer Web files, has been revised slowly. The three versions covered here are:
HTTP/0.9: 1990, never published as a formal standard
HTTP/1.0: May 1996
HTTP/1.1: January 1997, the mainstream HTTP version at the time of writing
1.3 Importance of HTTP in Web applications:
HTTP is the foundation of the entire Web, and many kinds of applications require an understanding of the protocol:
WebService = HTTP + XML
REST = HTTP + JSON
Implementation of various APIs: HTTP + XML/JSON
Web scraping and content-harvesting sites
Desktop applications such as QQ and Xunlei (Thunder)
2. Network basics: the TCP/IP protocol family
2.1 Concept of protocols:
A protocol is the set of rules that two computers must follow when they communicate over a computer network.
Computer networks use many protocols, such as TCP, IP, HTTP, and FTP; together they are known as the TCP/IP protocol family.
2.2 Layering in TCP/IP
Application layer: handles specific applications, such as FTP, DNS, and HTTP.
Transport layer: provides the application layer above it with data transfer between two computers on the network. This layer has two protocols: TCP and UDP.
Network layer: handles the movement of data packets across the network. Its protocols include IP, ICMP, and IGMP.
Data link layer: handles the hardware side of the network connection, including operating system device drivers, the network interface card (NIC), and physical media such as optical fiber.
2.3 TCP/IP communication and transmission process
3. The close relationship between IP, TCP, DNS, and HTTP
3.1 Concept and role of IP
IP (Internet Protocol) sits at the network layer.
Its job is to deliver data packets to the other party. The two most important pieces of information for delivery are the IP address and the MAC address. An IP address can change, whereas a MAC address is normally fixed to the network interface, although it can be overridden in software.
ARP (Address Resolution Protocol): looks up the MAC address that corresponds to an IP address.
3.2 Concept and role of TCP
TCP (Transmission Control Protocol) sits at the transport layer.
It provides a reliable byte-stream service and establishes connections with a three-way handshake.
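3.3 TCP three-way handshake
The client sends a SYN segment, the server answers with SYN+ACK, and the client confirms with ACK; only then is data exchanged.
3.4 Concept and role of DNS
DNS (Domain Name System) sits at the application layer.
It provides resolution between domain names and IP addresses.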
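As a small illustration of name resolution, the sketch below asks the platform resolver for the addresses of a host through the JDK class java.net.InetAddress. The class name DnsLookup and the helper resolve are hypothetical names chosen for this example, and www.example.com is only a sample host; the addresses returned for a real domain depend on your resolver and network.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

class DnsLookup {
    // Returns the textual IP addresses that the platform resolver reports for a host.
    // An IP literal such as "127.0.0.1" is returned as-is, without a DNS query.
    static List<String> resolve(String host) {
        List<String> addresses = new ArrayList<>();
        try {
            for (InetAddress address : InetAddress.getAllByName(host)) {
                addresses.add(address.getHostAddress());
            }
        } catch (UnknownHostException e) {
            // The name could not be resolved; report an empty result.
        }
        return addresses;
    }

    public static void main(String[] args) {
        // Sample host name only; the output varies by network.
        System.out.println(resolve("www.example.com"));
    }
}
```

An empty list here simply means the name did not resolve, for instance when the machine is offline.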
3.5 Relationship between HTTP and TCP, IP, and DNS
Before an HTTP request can be sent, the client uses DNS to resolve the server's domain name to an IP address; the request is then carried over TCP, which in turn relies on IP to deliver the packets.
4. Concepts of URI and URL
4.1 Concept of URI:
A URI (Uniform Resource Identifier) is a string that identifies an Internet resource.
4.2 Concept of URL:
A URL (Uniform Resource Locator) specifies the location of a particular resource on a particular server.
4.3 Relationship between URI and URL:
A URI identifies an Internet resource with a string, while a URL gives the address of the resource, so URLs are a subset of URIs.
URIs have two major subsets: URLs and URNs.
4.4 URI format and its components
- Example URI: https://user:pass@www.example.com:80/home/index.html?age=11#mask
https: the scheme (protocol) name, which specifies the protocol used to obtain the resource.
user:pass: login information (authentication), giving the user name and password. Optional.
www.example.com: the server address.
80: the port number. Optional.
/home/index.html: the file path.
age=11: the query string, which passes arbitrary parameters to the resource at the given file path and gives the application the additional information it needs. Optional.
mask: the fragment identifier, which marks a sub-resource within the retrieved resource (for example, a position in the document). Optional.
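The components listed above can be pulled apart with the JDK class java.net.URI. The following is a minimal sketch; the class name UriParts and the helper describe are hypothetical names used only for this illustration.

```java
import java.net.URI;

class UriParts {
    // Splits a URI string into its components and lists them in one line.
    static String describe(String text) {
        URI uri = URI.create(text);
        return "scheme=" + uri.getScheme()
                + ", userInfo=" + uri.getUserInfo()
                + ", host=" + uri.getHost()
                + ", port=" + uri.getPort()
                + ", path=" + uri.getPath()
                + ", query=" + uri.getQuery()
                + ", fragment=" + uri.getFragment();
    }

    public static void main(String[] args) {
        System.out.println(describe(
                "https://user:pass@www.example.com:80/home/index.html?age=11#mask"));
    }
}
```

URI.create throws an unchecked IllegalArgumentException for malformed input, which keeps the sketch short; code that handles untrusted input may prefer the URI constructor and its checked URISyntaxException.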
II. HTTP message format and workflow in detail
1. Formats of HTTP request and response messages
1.1 Concepts of HTTP transactions and message streams
HTTP transaction = request command + response result
1.2 Request message format (viewable with a packet capture tool such as Fiddler)
Request line: request method (uppercase), request URL, protocol version
Request headers: Name: Value
Blank line:
Message body:
Request headers are also called message headers. Field names are case-insensitive and are conventionally written with each word capitalized (for example, Content-Type). Fields may appear in any order; some fields accept multiple values, and some may appear more than once.
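To make the four-part layout concrete, here is a minimal sketch that splits a raw request into request line, headers, and body. The class RequestMessage and the sample request are hypothetical, made up for this illustration, and the parser skips many details of real HTTP (for example, it simply joins repeated fields with a comma).

```java
import java.util.Map;
import java.util.TreeMap;

class RequestMessage {
    String method;
    String target;
    String version;
    // Header names are case-insensitive, so look them up without regard to case.
    Map<String, String> headers = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
    String body = "";

    static RequestMessage parse(String raw) {
        RequestMessage message = new RequestMessage();
        // The blank line (CRLF CRLF) separates the head from the body.
        int split = raw.indexOf("\r\n\r\n");
        String head = split >= 0 ? raw.substring(0, split) : raw;
        if (split >= 0) {
            message.body = raw.substring(split + 4);
        }
        String[] lines = head.split("\r\n");
        // Request line: method, request URL, protocol version.
        String[] requestLine = lines[0].split(" ", 3);
        message.method = requestLine[0];
        message.target = requestLine[1];
        message.version = requestLine[2];
        for (int i = 1; i < lines.length; i++) {
            int colon = lines[i].indexOf(':');
            if (colon > 0) {
                String name = lines[i].substring(0, colon).trim();
                String value = lines[i].substring(colon + 1).trim();
                // A field that appears more than once is joined with a comma.
                message.headers.merge(name, value, (a, b) -> a + ", " + b);
            }
        }
        return message;
    }

    public static void main(String[] args) {
        String raw = "POST /login HTTP/1.1\r\n"
                + "Host: www.example.com\r\n"
                + "Content-Type: application/x-www-form-urlencoded\r\n"
                + "Content-Length: 6\r\n"
                + "\r\n"
                + "age=11";
        RequestMessage m = parse(raw);
        System.out.println(m.method + " " + m.target + " " + m.version);
        System.out.println(m.headers.get("content-length"));
        System.out.println(m.body);
    }
}
```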
1.2.1 Request methods: GET, POST, HEAD, OPTIONS, DELETE, PUT
Browsers send GET requests by default, for example when you type an address into the address bar or click a hyperlink. To send a POST request instead, change the form's submission method to post. GET and POST differ in two main ways: how parameters are passed (in the URL for GET, in the message body for POST) and how much data can be transferred (GET is limited by the practical length limit on URLs).
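The difference in how parameters travel can be sketched by building both kinds of request as text. The class FormParams, its methods, the path, and the host www.example.com are all hypothetical names for this illustration; only URLEncoder comes from the JDK.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

class FormParams {
    // Encodes name/value pairs as application/x-www-form-urlencoded text.
    static String encode(Map<String, String> params) {
        StringBuilder out = new StringBuilder();
        try {
            for (Map.Entry<String, String> e : params.entrySet()) {
                if (out.length() > 0) {
                    out.append('&');
                }
                out.append(URLEncoder.encode(e.getKey(), "UTF-8"))
                   .append('=')
                   .append(URLEncoder.encode(e.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException(ex); // UTF-8 is always supported
        }
        return out.toString();
    }

    // GET: the parameters ride in the request line as a query string.
    static String getRequest(String path, Map<String, String> params) {
        return "GET " + path + "?" + encode(params) + " HTTP/1.1\r\n"
                + "Host: www.example.com\r\n"
                + "\r\n";
    }

    // POST: the same parameters ride in the message body instead.
    static String postRequest(String path, Map<String, String> params) {
        String body = encode(params);
        // The encoded body is pure ASCII, so its length equals its byte count.
        return "POST " + path + " HTTP/1.1\r\n"
                + "Host: www.example.com\r\n"
                + "Content-Type: application/x-www-form-urlencoded\r\n"
                + "Content-Length: " + body.length() + "\r\n"
                + "\r\n"
                + body;
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("age", "11");
        params.put("name", "a b");
        System.out.println(getRequest("/home/index.html", params));
        System.out.println(postRequest("/home/index.html", params));
    }
}
```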
1.2.2 Common request headers:
Accept: the MIME types the browser accepts, written as type/subtype; */* means any type.
Accept-Charset: tells the server which character sets the browser supports.
Accept-Encoding: the content encodings the browser can decode, such as gzip.
Accept-Language: the languages the browser prefers.
Host: the host and port from the original URL.
Referer: the URL of the page from which the current request originated.
Content-Type: the media type of the request body.
If-Modified-Since: a time in GMT. If the requested file has not changed since this time, the server tells the browser that it can read the file directly from its cache.
User-Agent: the user's browser type, operating system, and related information.
Content-Length: the length of the request message body.
Connection: the value Keep-Alive requests a persistent connection. HTTP/1.1 uses persistent connections by default.
Cookie: one of the most important request headers. Because of security risks and other concerns, state is often kept in a server-side session rather than stored directly in cookies.
Date: the time of the request, in GMT.
1.3 Response message format
Status line: protocol version, status code, reason phrase
Response headers:
Blank line:
Message body:
1.3.1 Status code: a three-digit decimal number that indicates how the server processed the request. Status codes fall into five categories by their first digit:
1xx: informational
2xx: success
3xx: redirection
4xx: client error
5xx: server error
1.3.2 Common status codes:
200: OK, the request succeeded
301: permanent redirect
302/307: temporary redirect
304: not modified; the client can use its cached copy
404: resource not found
500: internal server error
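Since the first digit of a status code determines its category (1xx informational, 2xx success, 3xx redirection, 4xx client error, 5xx server error), a client can classify any response with a single integer division. The sketch below is illustrative only; the class StatusLine and its methods are hypothetical names.

```java
class StatusLine {
    // Maps a three-digit status code to its category using the first digit.
    static String category(int code) {
        switch (code / 100) {
            case 1: return "Informational";
            case 2: return "Success";
            case 3: return "Redirection";
            case 4: return "Client error";
            case 5: return "Server error";
            default: return "Unknown";
        }
    }

    // Extracts the status code from a line such as "HTTP/1.1 404 Not Found".
    static int parseCode(String statusLine) {
        String[] parts = statusLine.split(" ", 3);
        return Integer.parseInt(parts[1]);
    }

    public static void main(String[] args) {
        String line = "HTTP/1.1 404 Not Found";
        System.out.println(parseCode(line) + " -> " + category(parseCode(line)));
    }
}
```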
1.3.3 Common response headers:
Location: https://cn.bing.com/ indicates the new location of the resource.
Server: apache tomcat indicates the server type.
Content-Encoding: gzip indicates the encoding the server applied to the data it sends.
Content-Length: 80 indicates the length of the response body.
Content-Language: zh-cn indicates the language of the text sent by the server.
Content-Type: text/html; charset=GB2312 indicates the MIME type of the content sent by the server.
Last-Modified: a GMT time giving when the file was last modified.
Refresh: 1; url=https://www.baidu.com tells the client how long to wait, in seconds, before refreshing to the given URL.
Content-Disposition: attachment; filename=aaa.zip instructs the client to download the content as a file.
Set-Cookie: SS=Q0=5Lb_nQ; path=/ is a cookie sent by the server.
Expires: a GMT time giving when the content expires. A value of 0 or -1 indicates that caching is disabled.
Cache-Control: no-cache (HTTP/1.1) indicates that caching is disabled.
Connection: close or Keep-Alive
Date: a GMT time giving when the response was sent.
2. HTTP workflow
2.1 Process steps
2.2 Domain name resolution
2.3 TCP three-way handshake
2.4 The browser initiates an HTTP request
2.5 The server responds to the HTTP request and the browser receives the HTML code
2.6 The browser parses the HTML code
2.7 The browser renders the page and presents it to the user
3. Differences between HTTP/1.0 and HTTP/1.1
3.1 Basic operating mode of HTTP/1.0:
A transaction has four stages: establishing a connection, the browser sending a request, the server sending a response, and closing the connection. Each connection handles only one request and response, so the browser must open a separate connection to the server for every file it fetches.
3.2 Features of HTTP/1.1
Multiple HTTP requests and responses can be sent over a single TCP connection.
Requests and responses can overlap: the client can send the next request without waiting for the previous response (pipelining).
More request and response headers were added, such as the Host and If-Unmodified-Since request headers.
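The connection-handling difference between the two versions comes down to a default: HTTP/1.0 closes the connection after each response unless the client asks for Keep-Alive, while HTTP/1.1 keeps it open unless either side sends Connection: close. A simplified sketch of that rule follows; the class ConnectionPolicy is a hypothetical name, and the code assumes the Connection header carries a single value, which real messages do not guarantee.

```java
import java.util.Locale;

class ConnectionPolicy {
    // Decides whether the connection stays open after a response,
    // following the default of each protocol version.
    static boolean isPersistent(String version, String connectionHeader) {
        String option = connectionHeader == null
                ? ""
                : connectionHeader.trim().toLowerCase(Locale.ROOT);
        if (option.equals("close")) {
            return false;                       // an explicit close always wins
        }
        if (version.equals("HTTP/1.1")) {
            return true;                        // persistent by default
        }
        return option.equals("keep-alive");     // HTTP/1.0: only when requested
    }

    public static void main(String[] args) {
        System.out.println(isPersistent("HTTP/1.0", null));
        System.out.println(isPersistent("HTTP/1.0", "Keep-Alive"));
        System.out.println(isPersistent("HTTP/1.1", null));
        System.out.println(isPersistent("HTTP/1.1", "close"));
    }
}
```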
4. Example: connecting with Telnet
Telnet is a member of the TCP/IP protocol family and is the standard protocol and main method for remote login on the Internet. It lets users work on a remote host from their local computer. Running the telnet program in a terminal opens a connection to the server, after which an HTTP request can be typed in by hand.
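What a Telnet session does by hand can be approximated in Java with a plain TCP socket: open a connection, write the request text, and read whatever comes back. The sketch below is an illustration under stated assumptions, not a full HTTP client. The class RawHttpGet and its methods are hypothetical names, and so that it runs without Internet access, demo starts a throwaway server on the loopback interface using the JDK's com.sun.net.httpserver.HttpServer.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.nio.charset.StandardCharsets;

class RawHttpGet {
    // Sends a hand-written GET request over a TCP socket, much as one would
    // type it into Telnet, and returns the raw response text.
    static String get(String host, int port, String path) throws IOException {
        try (Socket socket = new Socket(host, port)) {
            socket.setSoTimeout(5000); // stop waiting after 5 seconds of silence
            String request = "GET " + path + " HTTP/1.1\r\n"
                    + "Host: " + host + "\r\n"
                    + "Connection: close\r\n"
                    + "\r\n";
            OutputStream out = socket.getOutputStream();
            out.write(request.getBytes(StandardCharsets.US_ASCII));
            out.flush();

            // "Connection: close" asks the server to end the stream after the response.
            ByteArrayOutputStream response = new ByteArrayOutputStream();
            InputStream in = socket.getInputStream();
            byte[] buffer = new byte[4096];
            try {
                int n;
                while ((n = in.read(buffer)) != -1) {
                    response.write(buffer, 0, n);
                }
            } catch (SocketTimeoutException e) {
                // The server kept the connection open; keep what has arrived so far.
            }
            return new String(response.toByteArray(), StandardCharsets.ISO_8859_1);
        }
    }

    // Serves "hello" from a local test server and fetches it with get().
    static String demo() {
        try {
            HttpServer server = HttpServer.create(
                    new InetSocketAddress(InetAddress.getLoopbackAddress(), 0), 0);
            server.createContext("/", exchange -> {
                byte[] body = "hello".getBytes(StandardCharsets.US_ASCII);
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream os = exchange.getResponseBody()) {
                    os.write(body);
                }
            });
            server.start();
            try {
                String host = InetAddress.getLoopbackAddress().getHostAddress();
                return get(host, server.getAddress().getPort(), "/");
            } finally {
                server.stop(0);
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The printed text shows the status line, the response headers, a blank line, and then the body, which is the same layout described in the response message format.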
III. HTTP communication in Java
1. Use HTTP GET to read network data
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;

class ReadByGet extends Thread {
    @Override
    public void run() {
        try {
            // "url" is a placeholder; any parameters go in the URL's query string.
            URL url = new URL("url");
            URLConnection conn = url.openConnection();
            // try-with-resources closes the reader and the underlying streams.
            try (BufferedReader br = new BufferedReader(
                    new InputStreamReader(conn.getInputStream()))) {
                String line;
                StringBuilder builder = new StringBuilder();
                while ((line = br.readLine()) != null) {
                    builder.append(line);
                }
                System.out.println(builder.toString());
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        new ReadByGet().start();
    }
}
2. Use HTTP POST to communicate with the network
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.net.HttpURLConnection;
import java.net.URL;

class ReadByPost extends Thread {
    @Override
    public void run() {
        try {
            URL url = new URL("url"); // "url" is a placeholder
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.addRequestProperty("encoding", "UTF-8");
            conn.setDoInput(true);
            conn.setDoOutput(true);
            conn.setRequestMethod("POST");
            // Write the parameters into the request body.
            try (BufferedWriter bw = new BufferedWriter(
                    new OutputStreamWriter(conn.getOutputStream()))) {
                bw.write("parameters passed to the server");
                bw.flush();
            }
            // Read the response; try-with-resources closes the streams.
            try (BufferedReader br = new BufferedReader(
                    new InputStreamReader(conn.getInputStream()))) {
                String line;
                StringBuilder builder = new StringBuilder();
                while ((line = br.readLine()) != null) {
                    builder.append(line);
                }
                System.out.println(builder.toString());
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        new ReadByPost().start();
    }
}
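3. Use HttpClient for GET communication
Apache provides the HttpClient library.
// The imports below assume Apache HttpClient 4.x (org.apache.http packages).
import java.io.IOException;
import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

class Get extends Thread {
    HttpClient client = HttpClients.createDefault();

    @Override
    public void run() {
        try {
            HttpGet get = new HttpGet("https://www.baidu.com");
            HttpResponse response = client.execute(get);
            HttpEntity entity = response.getEntity();
            // Read the whole response body as UTF-8 text.
            String result = EntityUtils.toString(entity, "UTF-8");
            System.out.println(result);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        new Get().start();
    }
}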
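4. Use HttpClient for POST communication
// The imports below assume Apache HttpClient 4.x (org.apache.http packages).
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.NameValuePair;
import org.apache.http.client.HttpClient;
import org.apache.http.client.entity.UrlEncodedFormEntity;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.message.BasicNameValuePair;
import org.apache.http.util.EntityUtils;

class Post extends Thread {
    HttpClient client = HttpClients.createDefault();

    @Override
    public void run() {
        try {
            HttpPost post = new HttpPost("url"); // "url" is a placeholder
            // Build the list of form parameters to upload.
            List<NameValuePair> parameters = new ArrayList<>();
            parameters.add(new BasicNameValuePair("key", "value"));
            post.setEntity(new UrlEncodedFormEntity(parameters, "UTF-8"));
            HttpResponse response = client.execute(post);
            HttpEntity entity = response.getEntity();
            String result = EntityUtils.toString(entity, "UTF-8");
            System.out.println(result);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        new Post().start();
    }
}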