Objective
As a programmer, you cannot avoid dealing with networks. It is no exaggeration to say that a phone or computer cut off from the network is little more than scrap, because most of its usefulness disappears. This article is written mainly for developers who are not networking specialists, with the goal of conveying the most networking knowledge in the shortest time.
Contents
- Overview
- Physical Layer
- Data Link Layer
- Network layer
- Transport Layer
- Application Layer
Overview
Let's start with the technical terms that we have all heard of but don't really understand.
Internet
The Internet is the world's largest network, a "network of networks". That is, the Internet is a huge network formed by interconnecting many networks.
The composition of the Internet:
- Edge part: Host
- Core part: a large number of networks and the routers that connect them (these are not the routers we have at home)
Ethernet
Ethernet is now the most commonly used local area network technology; data travels over it as MAC frames. Because classic shared-bus Ethernet allows only one computer to send at a time, a mechanism is needed to coordinate access, which is the CSMA/CD protocol:
- Multiple access: many computers are attached to a single bus
- Carrier sense: every station must continuously sense the channel, whether or not it is sending
- Collision detection: a station keeps listening while it sends
OSI
The Open Systems Interconnection basic reference model: any two systems can communicate as long as both follow the OSI standard. OSI is a seven-layer protocol architecture and TCP/IP is a four-layer one, so a compromise is usually taken when learning the principles of computer networks: a five-layer architecture consisting of the physical layer, data link layer, network layer, transport layer, and application layer.
[Figure: protocol architecture]
Physical Layer
Computers know only 0 and 1. The text of this article, as stored in the computer, is also just a long string of 0s and 1s. But these digits cannot travel over a real physical medium as they are; they must be converted into optical or electrical signals. This layer is responsible for converting the bitstream (0101) into those signals.
Without the physical layer there would be no Internet and no data sharing, because data could not flow through the network.
Data Link Layer
At this layer, data is no longer transmitted as a raw bitstream; it is divided into frames, which are then transmitted one by one.
MAC address
Also known as the hardware address, it is a 48-bit address burned into the ROM of the adapter (network card). A MAC address can be used to identify a computer uniquely, because it is meant to be unique worldwide.
Packet switching
Different links specify different maximum frame lengths, known as the MTU (maximum transmission unit), so data that exceeds a link's MTU must be split into smaller pieces. For example, if a lorry can carry 5 tons of cargo at a time but a highway has a limit of 2 tons, the cargo has to be carried in 3 trips.
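The lorry arithmetic can be sketched in a few lines of Python. The function names `trips_needed` and `split_payload` are made up for this illustration, and the sketch ignores per-piece header overhead, which a real link would have to account for.

```python
import math

def trips_needed(total_load, limit_per_trip):
    """Number of trips when each trip may carry at most limit_per_trip."""
    return math.ceil(total_load / limit_per_trip)

def split_payload(payload, mtu):
    """Cut a byte string into pieces of at most mtu bytes each."""
    return [payload[i:i + mtu] for i in range(0, len(payload), mtu)]

# 5 tons of cargo on a road limited to 2 tons per trip
print(trips_needed(5, 2))
# The same idea applied to bytes: 5 bytes over a link with a 2-byte limit
print([len(piece) for piece in split_payload(b"abcde", 2)])
```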
Network Bridge
A bridge works at the data link layer, forwarding and filtering received frames according to the destination address of the MAC frame.
Ethernet Switch
An Ethernet switch is essentially a multi-interface bridge. Each interface is connected directly to a single host or to another hub, and VLANs (virtual local area networks) can be implemented easily.
MAC Frames for Ethernet
The format of a MAC frame is:
[Figure: MAC frame format]
- Destination address: 48-bit MAC address of the receiver
- Source address: 48-bit MAC address of the sender
- Type field: indicates which protocol the upper layer uses; 0x0800 means an IP datagram
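As a rough illustration of the three fields listed above, the following sketch unpacks the first 14 bytes of a frame with Python's `struct` module. The sample frame bytes and the helper name `parse_ethernet_header` are invented for the example.

```python
import struct

def parse_ethernet_header(frame):
    """Return (destination MAC, source MAC, type) from the first 14 bytes."""
    dst, src, ether_type = struct.unpack("!6s6sH", frame[:14])

    def fmt(mac):
        return ":".join(f"{b:02x}" for b in mac)

    return fmt(dst), fmt(src), ether_type

# Hypothetical frame: broadcast destination, made-up source, type 0x0800 (IP)
frame = bytes.fromhex("ffffffffffff" "001122334455" "0800") + b"payload"
dst, src, ether_type = parse_ethernet_header(frame)
print(dst, src, hex(ether_type))
```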
Network layer
With only the data link layer and no network layer, data could travel only within a single link and could not cross links. With the network layer, data can be transmitted across different data links.
IP Address
The IP address, also known as the software address, is stored in the computer's memory. An IPv4 address is 32 bits long and an IPv6 address is 128 bits long.
IP Address and MAC address
- IP addresses are used at the network layer and above; MAC addresses are used at the data link layer and below
- The IP address is the logical address, and the MAC address is the physical address
- The source and destination addresses in the IP datagram header do not change in transit, while the source and destination addresses in the MAC frame header change at every router
IP Address Classification
IP address = {<network number>, <host number>}
Class A network numbers (first byte 0 to 127): 0.0.0.0 ~ 127.0.0.0
Class B network numbers (first byte 128 to 191): 128.0.0.0 ~ 191.255.0.0
Class C network numbers (first byte 192 to 223): 192.0.0.0 ~ 223.255.255.0
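Since the class is decided by the first byte alone, a small sketch can make the ranges concrete. The function name `address_class` is hypothetical, and anything above 223 is simply reported as "other" because the text does not cover those ranges.

```python
def address_class(ip):
    """Classify a dotted-decimal IPv4 address by its first byte."""
    first = int(ip.split(".")[0])
    if first <= 127:
        return "A"
    if first <= 191:
        return "B"
    if first <= 223:
        return "C"
    return "other"  # ranges above class C are not covered in this article

for ip in ("10.1.2.3", "145.13.3.10", "192.168.1.1"):
    print(ip, address_class(ip))
```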
IP address after subnetting
IP address = {<network number>, <subnet number>, <host number>}
For example, an organization has the class B network 145.13.0.0, and any datagram whose destination address is 145.13.x.x is sent to router R on that network. Suppose the network is divided into the subnets 145.13.3.0, 145.13.7.0, and 145.13.21.0. To the outside it still appears as a single network, 145.13.0.0. When router R receives a datagram, it forwards it to the corresponding subnet according to the destination address.
Subnet mask
A subnet mask generally consists of a run of 1s followed by a run of 0s. Whether or not a network is divided into subnets, a bitwise AND of the subnet mask and an IP address gives the network address.
Every network must use a subnet mask, and the routing table must have a subnet mask column. If a network is not divided into subnets, it uses the default subnet mask.
The default subnet mask for Class A addresses is 255.0.0.0
The default subnet mask for Class B addresses is 255.255.0.0
The default subnet mask for Class C addresses is 255.255.255.0
Although dividing subnets adds flexibility, it reduces the total number of hosts that can be connected to the network.
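The bitwise AND can be tried out directly. This is a minimal sketch using the standard `ipaddress` module; the helper name `network_address` and the host address 145.13.3.10 are assumptions chosen to match the 145.13.0.0 example.

```python
import ipaddress

def network_address(ip, mask):
    """Bitwise AND of an IPv4 address and a subnet mask."""
    ip_int = int(ipaddress.IPv4Address(ip))
    mask_int = int(ipaddress.IPv4Address(mask))
    return str(ipaddress.IPv4Address(ip_int & mask_int))

# Default class B mask: the address belongs to network 145.13.0.0
print(network_address("145.13.3.10", "255.255.0.0"))
# A longer mask inside the organization: the address falls in subnet 145.13.3.0
print(network_address("145.13.3.10", "255.255.255.0"))
```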
IP addresses that make up a supernet
IP address = {<network prefix>, <host number>}
Addresses with a network prefix are used by Classless Inter-Domain Routing (CIDR).
For example, 128.14.35.7/20 means that the first 20 bits are the network prefix and the last 12 bits are the host number. In addition, CIDR groups consecutive IP addresses with the same network prefix into a "CIDR address block".
Address Mask: CIDR uses a 32-bit address mask, similar to a subnet mask.
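To see what the /20 example works out to, the standard `ipaddress` module can do the arithmetic. This is only an illustration of the notation; the variable names are arbitrary.

```python
import ipaddress

iface = ipaddress.ip_interface("128.14.35.7/20")
block = iface.network  # the CIDR address block that contains the address

print(block)                  # the block in prefix notation
print(block.netmask)          # the 32-bit address mask
print(block.num_addresses)    # 2 ** 12 addresses share the 20-bit prefix
print(block[0], block[-1])    # first and last address in the block
```

The third byte, 35, is 00100011 in binary; keeping only its top four bits gives 00100000, which is why the block starts at 128.14.32.0.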
IP datagram
At the network layer, the data is transmitted in the form of an IP datagram (IP packet)
Format of IP datagrams
The first 20 bytes of the header have a fixed length and are present in every IP datagram. The fixed part is followed by optional fields whose length is variable.
IP datagram header fixed fields:
- Version: the IP protocol version, IPv4 or IPv6
- Header length: the length of the header in 32-bit words. The maximum value is 1111 (binary), i.e. 15 words, which is 60 bytes. When the header length is not an integer multiple of 4 bytes, it is padded using the final padding field.
- Type of service: generally unused
- Total length: the length of header plus data. The maximum is 2^16 - 1 = 65535 bytes. However, the data link layer specifies a maximum data length (MTU) for each frame; Ethernet specifies an MTU of 1500 bytes, so a datagram that exceeds it must be fragmented
- Identification: each time an IP datagram is generated, a counter is incremented by 1 and its value is placed in the identification field. When a datagram has to be fragmented, all fragments carry the same identification, which marks them as belonging to the same datagram
- Flags: 3 bits. The lowest bit is MF (More Fragments): MF = 1 means more fragments follow; MF = 0 means this is the last fragment. The middle bit is DF (Don't Fragment): fragmentation is allowed only when DF = 0.
- Fragment offset: the position at which this fragment starts, relative to the start of the user data field of the original datagram. The offset is counted in units of 8 bytes, so the length of every fragment except the last must be an integer multiple of 8 bytes.
- Time to live: TTL. The maximum number of routers a datagram can pass through in the Internet is 255; every router decrements the TTL by 1 and discards the datagram when it reaches 0.
- Protocol: records which protocol the data carried by the datagram belongs to.
- Header checksum: covers only the header of the datagram, not the data. If the result the receiver computes is not 0, the datagram is discarded.
- Source address and destination address: self-explanatory
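The fixed fields can be pulled out of a raw header with `struct`, and the header checksum rule ("result is 0 means OK") can be checked at the same time. This is a sketch under assumptions: the 20 sample bytes are a hand-made header, and the function names are invented for the example.

```python
import struct

def ipv4_checksum(header):
    """One's-complement sum of the header's 16-bit words, then inverted."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:                      # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def parse_ipv4_header(header):
    """Decode the 20-byte fixed part of an IPv4 header."""
    (ver_ihl, tos, total_len, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", header[:20])
    return {
        "version": ver_ihl >> 4,
        "header_bytes": (ver_ihl & 0x0F) * 4,        # counted in 32-bit words
        "total_length": total_len,
        "identification": ident,
        "DF": (flags_frag >> 14) & 1,
        "MF": (flags_frag >> 13) & 1,
        "fragment_offset_bytes": (flags_frag & 0x1FFF) * 8,  # units of 8 bytes
        "ttl": ttl,
        "protocol": proto,
        "src": ".".join(str(b) for b in src),
        "dst": ".".join(str(b) for b in dst),
    }

# Hypothetical sample header (checksum field already filled in)
sample = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")
fields = parse_ipv4_header(sample)
print(fields)
print("checksum verifies:", ipv4_checksum(sample) == 0)
```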
The process of forwarding packets in IP layer
Each router maintains a routing table whose entries contain (destination network address, next-hop address).
When subnets are used, each routing table entry must contain three items: destination network address, subnet mask, and next-hop address.
Host-specific route: a route specified for one particular destination host
Default route: the route used when the table does not say which router a packet should go to. A default route is a good fit when a network has very few connections to the outside.
Packet forwarding algorithm for routers
- (1) Extract the destination IP address D from the datagram header and derive the destination network address N
- (2) If N is a network directly connected to this router, deliver the datagram directly to the destination host (no further router is needed); otherwise go to (3)
- (3) If the routing table has a host-specific route for D, pass the datagram to the next-hop router given by that route; otherwise go to (4)
- (4) If the routing table has a route to network N, pass the datagram to the next-hop router given by that route; otherwise go to (5)
- (5) If the routing table has a default route, pass the datagram to the default router given by that route; otherwise go to (6)
- (6) Report a forwarding error
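The order of checks in the algorithm above can be sketched as a single function. Everything here is illustrative: the function name, the router labels R2 to R4, and the sample tables are made up. The sketch returns the first matching network route, whereas real routers using CIDR pick the longest matching prefix.

```python
import ipaddress

def next_hop(dest, direct_networks, host_routes, network_routes, default_route=None):
    """Follow the checks in order: direct, host route, network route, default."""
    d = ipaddress.ip_address(dest)
    for net in direct_networks:                 # directly connected network
        if d in ipaddress.ip_network(net):
            return "direct delivery"
    if dest in host_routes:                     # host-specific route
        return host_routes[dest]
    for net, hop in network_routes.items():     # route to the destination network
        if d in ipaddress.ip_network(net):
            return hop
    if default_route is not None:               # default route
        return default_route
    raise LookupError("no route to " + dest)    # forwarding error

# Hypothetical tables for a router attached to subnet 145.13.3.0/24
direct = ["145.13.3.0/24"]
hosts = {"145.13.7.9": "R2"}
networks = {"145.13.21.0/24": "R3"}

for dest in ("145.13.3.10", "145.13.7.9", "145.13.21.5", "8.8.8.8"):
    print(dest, "->", next_hop(dest, direct, hosts, networks, default_route="R4"))
```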
Virtual Private Network VPN
Routers in the Internet never forward a datagram whose destination address is a private address. There are 3 blocks of private addresses:
- 10.0.0.0 ~ 10.255.255.255
- 172.16.0.0 ~ 172.31.255.255
- 192.168.0.0 ~ 192.168.255.255
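A quick membership test for these three blocks can be written with the standard `ipaddress` module. The sketch checks the three ranges explicitly instead of relying on `is_private`, which also covers other special ranges; the name `in_private_block` is made up.

```python
import ipaddress

# The three blocks listed above, written in prefix notation
PRIVATE_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),       # 10.0.0.0 ~ 10.255.255.255
    ipaddress.ip_network("172.16.0.0/12"),    # 172.16.0.0 ~ 172.31.255.255
    ipaddress.ip_network("192.168.0.0/16"),   # 192.168.0.0 ~ 192.168.255.255
]

def in_private_block(ip):
    """True if the address falls inside one of the three private blocks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in block for block in PRIVATE_BLOCKS)

for ip in ("10.1.1.1", "172.31.255.255", "172.32.0.1", "8.8.8.8"):
    print(ip, in_private_block(ip))
```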
Suppose company A has one department in Guangzhou and another in Shanghai, each with its own local private network. How can the two private networks be connected?
- Lease dedicated communication lines from a telecom carrier for this organization alone, but that is too expensive.
- Use the public Internet as the communication carrier, which is a virtual private network (VPN)
Network Address translation NAT
The hosts in a private network share the global IP address of a NAT router. Whenever such a host sends or receives IP datagrams to or from the outside, the NAT router must translate the addresses.
[Figure: how a NAT router works]
In addition, NAT can also use port numbers, which gives network address and port translation (NAPT).
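One way to picture NAPT is as a two-way lookup table kept by the router. The following is a toy sketch, not how any particular router implements it: the class name `Napt`, the starting port 40000, and the addresses are all assumptions made for the example.

```python
class Napt:
    """Toy NAPT table: many private (ip, port) pairs share one public IP."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.outgoing = {}   # (private_ip, private_port) -> public_port
        self.incoming = {}   # public_port -> (private_ip, private_port)

    def translate_out(self, private_ip, private_port):
        """Rewrite the source of an outgoing packet."""
        key = (private_ip, private_port)
        if key not in self.outgoing:
            self.outgoing[key] = self.next_port
            self.incoming[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.outgoing[key]

    def translate_in(self, public_port):
        """Find the private host for a reply; None if there is no mapping."""
        return self.incoming.get(public_port)

nat = Napt("203.0.113.5")
print(nat.translate_out("192.168.0.3", 5000))
print(nat.translate_out("192.168.0.4", 5000))
print(nat.translate_in(40001))
```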
ARP protocol
ARP solves the problem of mapping the IP address of a host or router on the same LAN to its MAC address, that is: IP address -> ARP -> MAC address
Each host has an ARP cache holding a table that maps the IP addresses of hosts and routers on the local network to their MAC addresses. Here is how ARP works:
[Figure: how ARP works]
What if ARP is needed across networks?
- (1) Broadcast on the local network
- (2) If the host is not found, hand over to the router
- (3) The router helps forward (broadcasts on another network)
- (4) If the host is found, the ARP request completes; if not, go back to (2)
Transport Layer
This layer is the most important one, because data transmission at the data link and network layers is unreliable: they only deliver on a best-effort basis. What does that mean? They take no responsibility for whether the data handed to you is correct. The TCP protocol at this layer, however, provides reliable transmission.
The main focus of this layer is two protocols: UDP and TCP
User Datagram Protocol UDP
UDP Main Features:
- Connectionless
- Best-effort delivery
- Message-oriented: a message handed down by the application layer gets a UDP header and goes straight to the IP layer, without being merged or split
- No congestion control
- Supports one-to-one, one-to-many, many-to-one, and many-to-many communication
- Small header overhead: only 8 bytes
UDP header
[Figure: UDP header format]
- Source port: the source port number. Used when a reply is needed; may be all 0s otherwise
- Destination port: the destination port number. Required when the message is delivered at the destination
- Length: the length of the UDP datagram; the minimum value is 8 (header only)
- Checksum: unlike the IP datagram checksum, which covers only the header, the UDP checksum covers both the header and the data
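The four 16-bit fields fit in 8 bytes, which can be shown with `struct`. This is a sketch with invented helper names and port numbers; the checksum is left at 0 here purely to keep the example short, so it does not demonstrate the checksum calculation.

```python
import struct

def build_udp_datagram(src_port, dst_port, payload, checksum=0):
    """Prefix the payload with an 8-byte UDP header (checksum not computed)."""
    length = 8 + len(payload)               # header plus data
    return struct.pack("!HHHH", src_port, dst_port, length, checksum) + payload

def parse_udp_header(datagram):
    """Decode the four header fields from the first 8 bytes."""
    src_port, dst_port, length, checksum = struct.unpack("!HHHH", datagram[:8])
    return {"src_port": src_port, "dst_port": dst_port,
            "length": length, "checksum": checksum}

datagram = build_udp_datagram(50000, 53, b"hello")
print(parse_udp_header(datagram))
print(parse_udp_header(build_udp_datagram(50000, 53, b"")))  # header only
```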
Transmission Control Protocol TCP
TCP Main Features:
- Connection-oriented Transport layer protocol
- Each TCP connection has exactly 2 endpoints; TCP is point-to-point
- Provides reliable delivery
- Full Duplex communication
- byte stream oriented
Workflow for TCP
[Figure: TCP byte stream]
TCP connections
The endpoint of a TCP connection is called a socket
socket = (IP address : port number)
Each TCP connection is uniquely determined by the two endpoints (sockets) at its two ends. That is:
TCP connection ::= {socket1, socket2} = {(IP1 : port1), (IP2 : port2)}
Header of the TCP message segment
- Source port and destination port: same function as the UDP ports
- Sequence number: the sequence number of the first byte of data in this segment
- Acknowledgment number: the sequence number of the first data byte expected in the next segment from the other side
If the acknowledgment number = N, all data up to sequence number N-1 has been received correctly
- Data offset: the header length of the TCP segment
- Reserved: for future use, currently 0
- Urgent URG: URG = 1 means the urgent pointer field is valid and tells the system that this segment carries urgent data that should be transmitted as soon as possible, for example to interrupt a transfer suddenly
- Acknowledgment ACK: the acknowledgment number is valid when ACK = 1 and invalid when ACK = 0. TCP requires ACK to be set to 1 in every segment sent after the connection is established
- Push PSH: if PSH = 1, the receiver delivers the segment to the application as soon as it arrives instead of waiting until the whole buffer is full
- Reset RST: RST = 1 indicates a serious error in the TCP connection; the connection must be released and then re-established
- Synchronize SYN: used to synchronize sequence numbers when a connection is established. SYN = 1 with ACK = 0 indicates a connection request segment; if the other side agrees to establish the connection, its response segment has SYN = 1 and ACK = 1.
- Finish FIN: FIN = 1 indicates that the sender of this segment has finished sending data and asks for the connection to be released
- Window: tells the other side how much data, counting from the acknowledgment number in this segment, the receiver currently allows it to send. The receiver uses this to make the sender set its sending window
- Checksum: as with UDP, covers both the header and the data
- Urgent pointer: valid only when URG = 1; indicates the position of the end of the urgent data in the segment
- Options: variable length, up to 40 bytes; 0 bytes when no options are used
Maximum Segment Size (MSS): the maximum length of the data field in each TCP segment; defaults to 536 bytes if not specified.
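The flag bits and the data offset share one 16-bit word in the fixed 20-byte header, which the following sketch decodes. The helper names, port numbers, and sequence numbers are invented for the example, and the checksum is left at 0 rather than computed.

```python
import struct

# Bit positions of the six flags described above, lowest bit first
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}

def build_tcp_header(src_port, dst_port, seq, ack, flags, window=65535):
    """Pack a 20-byte header without options (checksum left at 0)."""
    bits = 0
    for name in flags:
        bits |= FLAG_BITS[name]
    offset_and_flags = (5 << 12) | bits      # data offset = 5 words = 20 bytes
    return struct.pack("!HHIIHHHH", src_port, dst_port, seq, ack,
                       offset_and_flags, window, 0, 0)

def parse_tcp_header(segment):
    """Decode the fixed part of a TCP header."""
    (src_port, dst_port, seq, ack, offset_and_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    return {
        "src_port": src_port, "dst_port": dst_port,
        "seq": seq, "ack": ack,
        "header_bytes": (offset_and_flags >> 12) * 4,
        "flags": sorted(n for n, b in FLAG_BITS.items() if offset_and_flags & b),
        "window": window,
    }

syn = build_tcp_header(50000, 80, seq=1000, ack=0, flags=["SYN"])
syn_ack = build_tcp_header(80, 50000, seq=7000, ack=1001, flags=["SYN", "ACK"])
print(parse_tcp_header(syn))
print(parse_tcp_header(syn_ack))
```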
Window
A very important concept in TCP is the window (the sending window and the receiving window).
The window concept arose because the stop-and-wait protocol is very inefficient. The sender maintains a sending window; with a sending window of 5 packets, for example, all 5 packets in the window can be sent without waiting for an acknowledgment from the other side. Each time an acknowledgment arrives, the sending window slides forward by one packet. This greatly improves channel utilization.
Instead of acknowledging every packet separately, the receiver uses cumulative acknowledgment: it sends an acknowledgment for the last packet that arrived in order.
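Cumulative acknowledgment is easy to get wrong by one, so here is a small sketch of "the last packet that arrived in order". It numbers whole packets starting at 1 for simplicity (real TCP numbers bytes), and the function name `cumulative_ack` is made up.

```python
def cumulative_ack(received, first=1):
    """Number of the last packet that arrived in order, or first - 1 if none."""
    got = set(received)
    expected = first
    while expected in got:
        expected += 1
    return expected - 1

# Packets 1 and 2 arrived in order, 3 is missing, 4 and 5 arrived early
print(cumulative_ack([1, 2, 4, 5]))
# Once packet 3 shows up, everything up to 5 can be acknowledged at once
print(cumulative_ack([1, 2, 4, 5, 3]))
```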
Time-out retransmission
If the sender has waited for a certain time without receiving an ACK, it starts a timeout retransmission. This waiting time is the retransmission timeout (RTO).
The RTO is not fixed, however; it is always slightly greater than the connection's round-trip time (RTT). Suppose a segment takes 5 seconds to reach the other side and the acknowledgment takes another 5 seconds to come back: the RTT is 10 seconds, and the RTO will be slightly more than 10 seconds. If no acknowledgment has arrived when the RTO expires, the segment is considered lost and must be retransmitted.
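The article does not say how the RTO is derived from the RTT. One common approach, the estimator described in RFC 6298, keeps a smoothed RTT and an RTT variation and sets RTO = SRTT + 4 * RTTVAR. The sketch below follows that formula but leaves out the clock granularity term and the minimum RTO the RFC also specifies, so treat it as an illustration only; the class name is made up.

```python
class RtoEstimator:
    """Smoothed RTT estimator in the style of RFC 6298 (simplified)."""

    ALPHA = 1 / 8   # weight of a new sample in the smoothed RTT
    BETA = 1 / 4    # weight of a new sample in the RTT variation

    def __init__(self):
        self.srtt = None
        self.rttvar = None

    def update(self, sample):
        """Feed one RTT measurement (seconds) and return the new RTO."""
        if self.srtt is None:               # first measurement
            self.srtt = sample
            self.rttvar = sample / 2
        else:
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - sample)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * sample
        return self.srtt + 4 * self.rttvar

est = RtoEstimator()
for _ in range(5):
    print(round(est.update(10.0), 2))
```

With a steady 10-second RTT the RTO starts out conservative and settles toward a value just above 10 seconds as the variation term shrinks.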
Flow control
Flow control is implemented using the sliding window and the timing of segment transmission.
Congestion control
The sender maintains a congestion window cwnd, and the sending window = the congestion window.
Slow start: cwnd = 1, then doubles after every transmission round
Congestion avoidance: cwnd grows slowly, +1 per transmission round
Slow start threshold ssthresh:
- When cwnd < ssthresh, use the slow start algorithm
- When cwnd > ssthresh, use the congestion avoidance algorithm
- When cwnd = ssthresh, either may be used
[Figure: congestion control]
Whenever network congestion is detected, ssthresh is set to half of the current congestion window (but not less than 2), cwnd is set to 1, and the slow start algorithm is run again.
In addition to slow start and congestion avoidance, there is another pair of algorithms, fast retransmit and fast recovery:
Fast retransmit: the receiver sends acknowledgments promptly; once the sender receives three duplicate acknowledgments in a row, it retransmits immediately
Fast recovery: when the sender receives three duplicate acknowledgments in a row, ssthresh is halved and cwnd is set to ssthresh.
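The window rules above can be traced round by round with a few small functions. This is a sketch under assumptions: the window is counted in whole segments, the initial ssthresh of 16 and the loss at cwnd = 24 are example values, slow start is capped so it does not jump past ssthresh, and the function names are invented.

```python
def next_cwnd(cwnd, ssthresh):
    """Window for the next transmission round when no loss occurred."""
    if cwnd < ssthresh:
        return min(cwnd * 2, ssthresh)   # slow start, capped at the threshold
    return cwnd + 1                      # congestion avoidance

def on_timeout(cwnd):
    """Congestion detected: return (new cwnd, new ssthresh)."""
    return 1, max(cwnd // 2, 2)

def on_three_duplicate_acks(cwnd):
    """Fast recovery: return (new cwnd, new ssthresh)."""
    ssthresh = max(cwnd // 2, 2)
    return ssthresh, ssthresh

cwnd, ssthresh = 1, 16
trace = [cwnd]
for _ in range(12):
    cwnd = next_cwnd(cwnd, ssthresh)
    trace.append(cwnd)
print(trace)

print(on_timeout(cwnd))                # restart from slow start
print(on_three_duplicate_acks(cwnd))   # continue from the halved window
```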
TCP three-way handshake
Establishing a TCP connection with the three-way handshake and closing it with the four-way wave are topics that interviewers love to ask about.
Q: Why must the handshake take three steps? Why not two?
A: Imagine that A sends a connection request, but it gets held up at some network node. A times out and retransmits; this time everything goes well, and A and B happily exchange data. After that connection has been released, the delayed connection request suddenly arrives at B. With a two-way handshake, B would send an acknowledgment and consider the connection established. A, in fact, ignores this acknowledgment because it has no data to send, but B naively waits for data to arrive. The result is wasted resources.
A more down-to-earth explanation: A phones B.
- First handshake: Hello, I'm A, can you hear me?
- Second handshake: Yes, I'm B, can you hear me?
- Third handshake: Yes, we can start chatting.
The three-way handshake really just checks that both sides can send and receive properly.
TCP four-way wave
Q: Why four waves rather than two or three?
A:
First, because TCP is full-duplex, both sides can act as data senders. If A wants to close the connection, it must wait until its data has been sent and then send FIN to B. (A is now half-closed.)
Then B sends an acknowledgment ACK. At this point B can still send any data it has left (for example, to do some processing before release).
Next, after sending its remaining data, B sends FIN to A. (B is now half-closed.)
Then A sends an ACK and enters the TIME_WAIT state.
Finally, if A receives nothing more from B within 2MSL, it concludes that B received the ACK. (A and B are now fully closed.)
PS: Analyze the steps above carefully and you will see why no fewer than four waves are needed.
Q: Why wait 2MSL (Maximum Segment Lifetime) to go from TIME_WAIT to CLOSED?
A: The client sends the final ACK, but that ACK may be lost. If the server does not receive it, the server resends its FIN. So the client cannot close immediately; it must make sure the server received the ACK. The client enters the TIME_WAIT state after sending the ACK and sets a timer for 2MSL. If it receives FIN again within that time, it resends the ACK and waits another 2MSL. The MSL is the maximum time a segment can survive in the network, and 2MSL is the maximum time needed for one send and one reply. If the client receives no further FIN before 2MSL elapses, it infers that the ACK was received successfully and ends the TCP connection.
A more down-to-earth explanation:
- First wave: A tells B, I have no more data to send and am ready to close; do you still have data to send?
- Second wave: B sends its last data
- Third wave: B tells A, I want to close too
- Fourth wave: A tells B, you can close; my side is closed.
Application Layer
The most famous application layer protocols are HTTP and FTP, plus the important DNS
Domain Name System (DNS)
DNS resolves a domain name (for example, www.jianshu.com) into an IP address.
Domain name server classification
- Root name server: the highest-level name server
- Top-level domain name server: as the name suggests
- Authoritative name server: the server responsible for a zone
- Local name server: the server to which a host sends its DNS queries
DNS queries
- Queries from a host to the local name server generally use recursive queries
- Queries from the local name server to the root name server typically use iterative queries
Recursive query: B asks A how to get to Guangzhou. A doesn't know, so A asks C; C doesn't know, so C asks D ... until someone knows, and the answer is passed back layer by layer to B.
Iterative query: B asks A how to get to Guangzhou. A doesn't know but tells B to ask C, so B asks C; C doesn't know but tells B to ask D, so B asks D ... until B knows.
DNS query example: the host x.tom.com wants to know the IP address of y.jerry.com
- Host x.tom.com first sends a recursive query to the local name server dns.tom.com
- The local name server uses iterative queries. It first asks a root name server
- The root name server tells it to ask the top-level domain name server dns.com
- The local name server asks the top-level domain name server dns.com
- The top-level domain name server tells it to ask the authoritative name server dns.jerry.com
- The local name server asks the authoritative name server dns.jerry.com
- The authoritative name server dns.jerry.com tells it the IP address of the host being queried
- The local name server tells host x.tom.com the query result
PS: These queries use UDP, and every name server uses caching to improve DNS query efficiency.
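The referral chain in the example can be imitated with plain dictionaries, which also shows what the cache saves. No real DNS traffic is involved: the lookup tables, the answer 192.0.2.10, and the function name `resolve_iteratively` are all made up for this sketch.

```python
# Hypothetical data held by each server in the example
ROOT = {"com": "dns.com"}                                  # TLD -> TLD server
TLD_SERVERS = {"dns.com": {"jerry.com": "dns.jerry.com"}}  # zone -> authoritative server
AUTHORITATIVE = {"dns.jerry.com": {"y.jerry.com": "192.0.2.10"}}

def resolve_iteratively(name, cache):
    """What the local name server does: follow referrals, then cache the answer."""
    if name in cache:
        return cache[name], []               # answered from cache, no queries sent
    steps = []
    tld = name.rsplit(".", 1)[-1]
    tld_server = ROOT[tld]                   # root refers to the TLD server
    steps.append(("root", tld_server))
    zone = ".".join(name.split(".")[-2:])
    auth_server = TLD_SERVERS[tld_server][zone]   # TLD refers to the zone's server
    steps.append((tld_server, auth_server))
    ip = AUTHORITATIVE[auth_server][name]    # authoritative server answers
    steps.append((auth_server, ip))
    cache[name] = ip
    return ip, steps

local_cache = {}
print(resolve_iteratively("y.jerry.com", local_cache))   # three queries
print(resolve_iteratively("y.jerry.com", local_cache))   # served from cache
```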
URL
Format of a URL: <protocol>://<host>:<port>/<path>; the port and path can sometimes be omitted.
URL using the HTTP protocol: http://<host>:<port>/<path>; the default HTTP port number is 80
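The parts of a URL can be pulled apart with `urllib.parse.urlsplit` from the standard library. The helper `describe_url` and the sample paths are assumptions for this sketch; it fills in port 80 when an HTTP URL omits the port.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80}

def describe_url(url):
    """Return (protocol, host, port, path), applying the default HTTP port."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return parts.scheme, parts.hostname, port, parts.path

print(describe_url("http://www.jianshu.com/p/example"))
print(describe_url("http://www.jianshu.com:8080"))
```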
HTTP protocol
HTTP is transaction-oriented, meaning that the data it transfers is treated as a whole: either all of it is received or none of it is.
The working process of the World Wide Web
Each HTTP request needs a TCP connection to be established and then released.
HTTP is connectionless and stateless. Each request is a new request.
HTTP/1.0 disadvantage: no persistent connection. Every request re-establishes a TCP connection, so each HTTP request costs twice the RTT (one RTT for the TCP connection, one for the HTTP request)
HTTP/1.1: uses persistent connections, that is, the TCP connection is kept open for some time.
HTTP/1.1 has two ways of working, non-pipelined and pipelined:
- Non-pipelined: the next request is sent only after the response to the previous one has been received; inefficient and wasteful
- Pipelined: multiple requests can be sent without waiting for responses; efficient
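A back-of-the-envelope comparison shows why persistent connections help. This is a highly idealized sketch: it counts only round trips, ignores transmission and server processing time, and assumes all requests are known up front and at least one request is made. The function names are invented.

```python
def http10_rtts(num_requests):
    """HTTP/1.0: every request pays one RTT for TCP setup plus one for HTTP."""
    return 2 * num_requests

def http11_non_pipelined_rtts(num_requests):
    """HTTP/1.1 persistent: one TCP setup, then one RTT per request in turn."""
    return 1 + num_requests

def http11_pipelined_rtts(num_requests):
    """HTTP/1.1 pipelined (idealized): one TCP setup, all requests sent together."""
    return 1 + 1

for n in (1, 3, 10):
    print(n, http10_rtts(n), http11_non_pipelined_rtts(n), http11_pipelined_rtts(n))
```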
GET and POST in HTTP
GET requests are typically used for querying and fetching data, and POST requests for sending data
The parameters of a GET request are in the URL, so sensitive data must never be sent with GET. The parameters of a POST request are in the request body, which is slightly safer than GET
PS: the data of a POST request is still carried in clear text, so it is not secure either
Cookies
The World Wide Web uses cookies to track users; a cookie represents state information passed between the HTTP server and the user.
How cookies work:
1. A user browses a website. The server generates a unique identifier for the user and, using it as an index, creates an entry in the server's backend database
2. The server adds a "Set-Cookie" line to the HTTP response returned to the user, whose value is that identifier, such as 123
3. The user's browser saves the cookie, and every HTTP request sent while the user continues browsing the site carries a line Cookie: 123
The site thus knows that the request comes from the user with cookie 123 and can maintain a separate list for that user (such as a shopping cart).
Of course, cookies are a double-edged sword: convenient but risky, for example through privacy leaks. Users can decide for themselves whether to allow cookies.
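The two header lines involved can be produced and read back with the standard `http.cookies` module. Real cookies are name=value pairs, so this sketch assumes a cookie named `id` carrying the identifier 123; that name is not from the article.

```python
from http.cookies import SimpleCookie

# Server side: build the header line that goes into the HTTP response
response_cookie = SimpleCookie()
response_cookie["id"] = "123"            # "id" is a hypothetical cookie name
set_cookie_line = response_cookie.output()
print(set_cookie_line)

# Server side, later request: read the identifier back from the Cookie header
request_cookie = SimpleCookie()
request_cookie.load("id=123")            # value of the browser's Cookie header
user_id = request_cookie["id"].value
print(user_id)
```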
Session
The cookie is stored on the client, and the session is stored on the server. When the server receives a cookie from the user, it looks up the corresponding session by the session ID in the cookie; if there is none, it generates a new session ID and returns it to the user.
All in all, cookies and sessions serve the same purpose of keeping user state; they are just stored in different places.
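The lookup-or-create behaviour described for sessions can be sketched with an in-memory dictionary. The class name `SessionStore` and the shopping-cart data are assumptions; a real server would also expire sessions and persist them somewhere more durable.

```python
import secrets

class SessionStore:
    """Server-side sessions kept in memory, keyed by session ID."""

    def __init__(self):
        self._sessions = {}

    def get_or_create(self, session_id=None):
        """Return (session ID, session data); make a new session if unknown."""
        if session_id in self._sessions:
            return session_id, self._sessions[session_id]
        new_id = secrets.token_hex(16)       # random, hard-to-guess identifier
        self._sessions[new_id] = {}
        return new_id, self._sessions[new_id]

store = SessionStore()
sid, data = store.get_or_create()            # first visit: no cookie yet
data["cart"] = ["book"]
same_sid, same_data = store.get_or_create(sid)   # later visit with the cookie
print(same_sid == sid, same_data)
```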
HTTPS
[Figure: HTTPS protocol]
HTTPS builds on HTTP by adding an SSL/TLS encryption layer between HTTP and TCP, which addresses the three major risks of insecure HTTP: impersonation, tampering, and eavesdropping.
For those interested in how HTTPS achieves security, encryption, and so on, see the following articles:
- Overview of the operating mechanism of the SSL/TLS protocol
- HTTPS Popular Literacy Posts
Original article: You should know the computer network knowledge