Week 13 Summary of the Foundations of Information Security System Design


Chapter 11: Network Programming

1. Network applications are ubiquitous. Any time you browse the Web, send an email, or pop up an X window, you are using a network application. Interestingly, all network applications are based on the same basic programming model, have a similar overall logical structure, and rely on the same programming interface.

2. Network applications rely on many of the concepts we have already learned in our study of systems, such as processes, signals, memory mappings, and dynamic memory allocation; all of them play important roles. There are also new concepts to master: we need to understand the basic client-server programming model and how to write client-server programs that use the services provided by the Internet. Finally, we combine all of these concepts to develop a small but fully functional Web server that can serve both static and dynamic text and graphics content to real Web browsers.

11.1 Client-Server programming model

1. Every network application is based on the client-server model. With this model, an application consists of a server process and one or more client processes. The server manages a resource and provides some service to its clients by manipulating that resource. For example, an FTP server manages a set of disk files that it stores and retrieves for clients. Similarly, an e-mail server manages a set of files that it reads and updates for clients.

2. The basic operation in the client-server model is the transaction.

3. A transaction consists of four steps:

1) When a client needs a service, it sends a request to the server to initiate a transaction. For example, when a Web browser needs a file, it sends a request to the Web server

2) The server receives the request, interprets it, and manipulates its resources in the appropriate way. For example, when a Web server receives a request from a browser, it reads the requested file from disk;

3) The server sends a response to the client and waits for the next request. For example, the Web server sends the file back to the client;

4) The client receives a response and processes it. For example, when a Web browser receives a page from the server, it displays the page on the screen.

11.2 Networks

Clients and servers typically run on different hosts and communicate using the hardware and software resources of a computer network. Networks are complex systems, and here we only scratch the surface. Our goal is to give you a workable mental model from a programmer's point of view. To a host, a network is just another I/O device that serves as a source and sink of data. An adapter plugged into an expansion slot on the I/O bus provides the physical interface to the network. Data received from the network is copied from the adapter across the I/O and memory buses into memory, typically via DMA (direct memory access). Similarly, data can be copied from memory to the network.

1. An Ethernet segment consists of cables and a hub. Each cable has the same maximum bandwidth, and the hub indiscriminately copies every bit it receives on one port to all of the other ports. Thus, every host sees every bit.

2. Each Ethernet adapter has a globally unique 48-bit address, stored in non-volatile memory on the adapter. A frame sent on the segment is visible to every host adapter, but only the destination host actually reads it.

3. A bridged Ethernet connects multiple Ethernet segments with cables and bridges, forming a larger LAN. The cables connecting the bridges can run at different rates (e.g., 1 Gb/s between bridges, 100 Mb/s between a bridge and a hub).

4. Bridge function: connecting different network segments. When host A sends data to host B on the same segment, the frame arrives at the bridge's input port and the bridge discards it without forwarding. When A sends data to host C on another segment, the bridge copies the frame to the port connected to that segment. In this way the bridge conserves segment bandwidth.

(1) Basic capabilities of the Protocol software:

The naming mechanism assigns at least one internet address to each host, smoothing over the differences between the native address formats of different hosts so that every host can be identified uniformly. The delivery mechanism defines a uniform way to bundle data into packets, smoothing over the different frame formats of the underlying networks.

11.3 Global IP Internet

The global IP Internet is the most famous and successful internet implementation. It has existed in one form or another since 1969. Although the internal architecture of the Internet is complex and constantly changing, the organization of client-server applications has remained remarkably stable since the early 1980s. Each Internet host runs software that implements the TCP/IP protocol suite, which is supported by almost every modern computer system. Internet clients and servers communicate using a mix of socket interface functions and Unix I/O functions. The socket functions are typically implemented as system calls that trap into the kernel and invoke various kernel-mode TCP/IP functions.

11.3.1 IP Address

1. An IP address is a 32-bit unsigned integer.

2. Network programs store IP addresses in the IP address structure, struct in_addr.

11.3.2 Internet domain name

1. Internet clients and servers use IP addresses to communicate with each other. However, large integers are hard for people to remember, so the Internet also defines more human-friendly domain names, along with a mechanism that maps domain names to IP addresses. A domain name is a sequence of words (letters, digits, and dashes) separated by periods.

2. The set of domain names forms a hierarchy, and each domain name encodes its position in that hierarchy. An example makes this easy to understand. The hierarchy can be represented as a tree: the nodes of the tree represent domain names, formed by the path back to the root, and a subtree is called a subdomain. The first level of the hierarchy is an unnamed root node. The next level is a collection of first-level domain names defined by a non-profit organization, ICANN (the Internet Corporation for Assigned Names and Numbers). Common first-level domain names include com, edu, gov, org, and net; second-level names under them are assigned on a first-come, first-served basis by various authorized agents of ICANN. Once an organization has received a second-level domain name, it can create any new domain name within its subdomain.

11.3.3 Internet connection

1. Internet clients and servers communicate by sending and receiving streams of bytes over connections. A connection is point-to-point in the sense that it connects a pair of processes. It is full-duplex in that data can flow in both directions at the same time. And it is reliable in the sense that, barring some catastrophic failure (such as a careless backhoe operator cutting the cable), the stream of bytes sent by the source process is eventually received by the destination process in the order in which it was sent.

2. A socket is an endpoint of a connection. Each socket has a corresponding socket address consisting of an Internet address and a 16-bit integer port, denoted "address:port". When a client initiates a connection request, the port in the client's socket address is assigned automatically by the kernel and is known as an ephemeral port. In contrast, the port in the server's socket address is typically a well-known port associated with the service. For example, Web servers typically use port 80, and e-mail servers use port 25.

11.4 Socket Interface

11.4.1 Socket Address Structure

From the Unix kernel's point of view, a socket is an endpoint of communication.

11.4.2 Socket function

Clients and servers use the socket function to create a socket descriptor.

Here, AF_INET indicates that we are using the Internet (32-bit IPv4 addresses), and SOCK_STREAM says that the socket will be an endpoint of an Internet connection. The clientfd descriptor returned by socket is only partially opened and cannot yet be used for reading and writing. How we finish opening the socket depends on whether we are a client or a server.

11.4.3 Connect function

The client establishes a connection to the server by calling the connect function.

11.4.4 open_clientfd function

The open_clientfd function wraps socket and connect into a convenient helper that a client can use to establish a connection with a server.

11.4.5 bind function

The bind function asks the kernel to associate the server's socket address with the socket descriptor.

11.4.6 Listen function

The listen function converts sockfd from an active socket into a listening socket that can accept connection requests from clients. The backlog argument is a hint about the number of outstanding connection requests the kernel should queue up before it starts refusing requests.

11.4.7 open_listenfd function

The open_listenfd function wraps socket, bind, and listen into a helper that a server can use to create a listening descriptor.

11.4.8 Accept function

The accept function waits for a connection request from a client to arrive on the listening descriptor, then returns a connected descriptor that can be used to communicate with the client via Unix I/O functions.

11.5 Web Servers

11.5.1 Web Basics

1. Web clients and servers interact using a text-based application-level protocol called HTTP.

2. HTTP is a simple protocol. A Web client (i.e., a browser) opens an Internet connection to a server and requests some content. The server responds with the requested content and then closes the connection. The browser reads the content and displays it on the screen.

3. The main difference between Web services and conventional file retrieval services is that Web content can be written in HTML. An HTML program (page) contains instructions (tags) that tell the browser how to display the various text and graphics objects on the page.

11.5.2 Web content

The Web server provides content to the client in two different ways:

1. Fetch a disk file and return its contents to the client (serving static content).

2. Run an executable file and return its output to the client (serving dynamic content).

11.5.3 HTTP Transactions

1. HTTP requests

2. HTTP responses

11.5.4 Serving Dynamic Content

1. How does the client pass program arguments to the server?

2. How does the server pass arguments to the child process?

3. How does the server pass other information to the child process?

4. Where does the child process send its output?

Chapter 12: Concurrent Programming

So far, we have mostly treated concurrency as a mechanism that the operating system kernel uses to run multiple applications. But concurrency is not limited to the kernel; it can also play an important role in applications. For example, we have seen how Unix signal handlers allow an application to respond to asynchronous events, such as the user typing at the keyboard or the program touching an undefined area of virtual memory. Application-level concurrency is useful in other situations as well:

1. Accessing slow I/O devices. When an application is waiting for data to arrive from a slow I/O device such as a disk, it can use concurrency to overlap the I/O request with other useful work.

2. Interacting with humans. People who interact with computers expect them to do multiple things at the same time; for example, while printing a document they might want to resize a window. Modern windowing systems use concurrency to provide this capability: each time the user requests an action (say, with a mouse click), a separate concurrent logical flow is created to perform the operation.

3. Deferring work to reduce latency. Applications can sometimes use concurrency to reduce the latency of certain operations by deferring other operations and running them concurrently. For example, a dynamic memory allocator can reduce the latency of individual free operations by deferring coalescing, running it in a lower-priority concurrent "coalesce" flow that takes advantage of idle CPU cycles.

4. Serving multiple network clients. The iterative Web server we studied in Chapter 11 is unrealistic because it can serve only one client at a time, so a single slow client can deny service to every other client. A real server might be expected to serve hundreds or thousands of clients per second, and having a slow client deny service to the others is unacceptable. A better approach is to build a concurrent server that creates a separate logical flow for each client. This allows the server to serve multiple clients at once and prevents a slow client from monopolizing the server.

5. Computing in parallel on multi-core machines. Many modern machines have multi-core processors containing multiple CPUs. Applications that are partitioned into concurrent flows often run faster on such machines, because the flows execute in parallel rather than being interleaved as they are on a uniprocessor machine.

6. Modern operating systems provide three basic approaches for building concurrent programs: processes, I/O multiplexing, and threads.

12.1 Process-based concurrent programming

1. After accepting a connection request, the server forks a child process that gets a complete copy of the server's descriptor table. The child closes its copy of the listening descriptor (descriptor 3), and the parent closes its copy of the connected descriptor (descriptor 4), since these are no longer needed. This leaves the child busy servicing the client. Because the connected descriptors in the parent and child point to the same file table entry, it is crucial that the parent close its copy of the connected descriptor. Otherwise the file table entry for connected descriptor 4 will never be released, and the resulting memory leak will eventually consume the available memory and crash the system.

2. Now suppose that after the parent creates the child for client 1, it accepts a new connection request from client 2 and receives a new connected descriptor (say, descriptor 5). The parent then forks another child, which services its client using connected descriptor 5. At this point the parent is waiting for the next connection request, and the two children are servicing their respective clients concurrently.

12.1.1 A process-based concurrent server

The code for a process-based concurrent echo server carries some important notes:

First, servers typically run for long periods of time, so we must include a SIGCHLD handler that reaps the resources of dead child processes. Second, the parent and the child must each close their copy of connfd. Finally, because of the reference count on the socket's file table entry, the connection to the client is not terminated until both the parent's and the child's copies of connfd are closed.

12.1.2 Pros and cons of processes

Processes have a very clean model for sharing state information between parent and child: file tables are shared, but user address spaces are not. Having separate address spaces is both an advantage and a disadvantage. Because the address spaces are independent, one process cannot accidentally overwrite the virtual memory of another. On the other hand, it makes interprocess communication cumbersome and, at the very least, costly.

12.2 Concurrent programming based on I/O multiplexing

1. Consider, for example, a server that must respond to two I/O events: 1) a network client initiating a connection request, and 2) the user typing a command line at the keyboard. Which event should the server wait for first? Neither choice is ideal. If it blocks in accept waiting for a connection, it cannot respond to typed commands; if it blocks in read waiting for an input command, it cannot respond to any connection requests (remember, this is a single process).

2. One solution to this dilemma is I/O multiplexing. The basic idea is to use the select function, which asks the kernel to suspend the process and return control to the application only after one or more I/O events have occurred. A descriptor set can be viewed as a bit vector of n descriptors. For example, we can define bit 0 as "standard input" and bit 3 as the "listening descriptor."

12.2.1 A concurrent event-driven server based on I/O multiplexing

I/O multiplexing can be used as the basis for concurrent event-driven programs, where flows make progress as a result of events. The general idea is to model logical flows as state machines. Informally, a state machine is a collection of states, input events, and transitions, where a transition maps an (input state, input event) pair to an output state; a self-loop is a transition between the same input and output state. State machines are usually drawn as directed graphs, where nodes represent states, directed arcs represent transitions, and the labels on the arcs represent input events. A state machine begins in some initial state, and each input event triggers a transition from the current state to the next state. For each new client k, a concurrent server based on I/O multiplexing creates a new state machine s_k and associates it with connected descriptor d_k.

12.2.2 Pros and cons of I/O multiplexing

1. One advantage of event-driven designs is that they give programmers more control over program behavior than process-based designs do. For example, we could write an event-driven concurrent server that gives preferred service to certain clients, which would be difficult in a server that forks a new process for each client.

2. Another advantage is that an event-driven server based on I/O multiplexing runs in the context of a single process, so every logical flow has access to the entire address space of the process. This makes it easy to share data between flows. A related advantage of running as a single process is that you can debug your concurrent server with familiar tools such as GDB, just as you would a sequential program. Finally, event-driven designs are often significantly more efficient than process-based designs, because they do not require a process context switch to schedule a new flow.

3. An obvious drawback of event-driven designs is coding complexity, which grows as the granularity of the concurrency decreases; granularity here means the number of instructions each logical flow executes per time slice. Another major drawback of event-based designs is that they cannot fully take advantage of multi-core processors.

12.3 Thread-based concurrency programming

Each thread has its own thread context, including a thread ID, stack, stack pointer, program counter, general-purpose registers, and condition codes. Because all the threads of a process run in a single process, they share the entire contents of the process's virtual address space, including its code, data, heap, shared libraries, and open files.

12.3.1 Threading Execution Model

The execution model of threads is similar in some ways to that of processes. Each process begins life as a single thread, which we call the main thread. But be aware that threads are peers: the main thread differs from other threads only in that it is the first one to run.

12.3.2 POSIX threads

Posix threads (Pthreads) are a standard interface for manipulating threads from C programs. Pthreads first appeared in 1995 and is available on most Unix systems. It defines about 60 functions that allow programs to create, kill, and reap threads, to share data safely with peer threads, and to notify peers of changes in system state.

12.3.3 Creating Threads

Threads create other threads by calling the pthread_create function.

12.3.4 terminating a thread

A thread is terminated in one of the following ways.

The thread terminates implicitly when its top-level thread routine returns.

The thread terminates explicitly by calling the pthread_exit function. If the main thread calls pthread_exit, it waits for all other peer threads to terminate before terminating itself and the entire process.

Some peer thread calls the Unix exit function, which terminates the process and all threads associated with the process.

12.3.5 Reaping terminated threads

A thread waits for another thread to terminate, and reaps its resources, by calling the pthread_join function.

12.4 Shared variables in multi-threaded programs

Global variables and static variables are stored in the read/write data area, so they are shared by all threads.

Because each thread's stack is independent, automatic variables are private to each thread. Even when multiple threads run the same code, each thread has its own instances of that code's automatic variables, and their values differ from thread to thread.

For example, in C++, the data members of a heap-allocated object do not live on any thread's stack, so threads share them.

12.4.1 Thread Memory model

1. A set of concurrent threads runs in the context of a process. Each thread has its own separate thread context, including thread ID, stack, stack pointer, program counter, condition code, and general purpose register value. Each thread shares the remainder of the process context with other threads. This includes the entire user virtual address space, which consists of read-only text code, read/write data, heaps, and all shared library code and data regions. Threads also share the same set of open files.

2. From a practical standpoint, it is impossible for one thread to read or write the register values of another thread. On the other hand, any thread can access any location in the shared virtual memory. If some thread modifies a memory location, then every other thread will eventually see the change if it reads that location. Thus, registers are never shared, whereas virtual memory is always shared.

3. The memory model for the separate thread stacks is not as clean. The stacks are stored in the stack area of the virtual address space and are usually accessed independently by their respective threads. We say usually rather than always because different thread stacks are not protected from other threads: if a thread somehow obtains a pointer to another thread's stack, it can read and write any part of that stack.

12.4.2 Mapping variables to memory

In threaded C programs, variables are mapped to virtual storage according to their storage type:

1. Global variables. A global variable is any variable declared outside of a function. At run time, the read/write area of virtual memory contains exactly one instance of each global variable, which any thread can reference. For example, a global variable ptr has a single run-time instance in the read/write area, and we denote that instance simply by the variable name, ptr.

2. Local automatic variables. A local automatic variable is one declared inside a function without the static attribute. At run time, each thread's stack contains its own instances of all local automatic variables. This is true even when multiple threads execute the same thread routine. For example, if there is one instance of a local variable tid stored on the stack of the main thread, we denote that instance tid.m.

3. Local static variables. A local static variable is declared inside a function with the static attribute. As with global variables, the read/write area of virtual memory contains exactly one instance of each local static variable.

12.4.3 Shared variables

We say that a variable v is shared if and only if one of its instances is referenced by more than one thread. For example, the variable cnt in the example program is shared because it has only one run-time instance, and that instance is referenced by both peer threads. On the other hand, myid is not shared, because each of its two instances is referenced by exactly one thread. It is important to realize, however, that local automatic variables such as msgs can also be shared.

12.5 Synchronizing Threads with semaphores

Semaphore usage is often described in terms of P and V operations; the idea is to protect critical sections of code to achieve mutual exclusion. Inside the operating system, semaphores are also used to suspend and resume threads.

We can break the loop code of thread i into five parts: H_i (the head), L_i (load cnt), U_i (update cnt), S_i (store cnt), and T_i (the tail).

12.5.1 Progress graphs

A progress graph models the execution of n concurrent threads as a trajectory through an n-dimensional Cartesian space.

12.5.2 Semaphores

A semaphore s is a global variable with a nonnegative integer value that can be manipulated only by two special operations, called P and V:

P(s): If s is nonzero, then P decrements s and returns immediately. If s is zero, the thread is suspended until s becomes nonzero and a V operation restarts it. After restarting, the P operation decrements s and returns control to the caller.

V(s): The V operation increments s by 1. If any threads are blocked in a P operation waiting for s to become nonzero, the V operation restarts exactly one of them, which then decrements s and completes its P operation. The test and decrement in P are indivisible: once the test finds s nonzero, the decrement happens without interruption. The increment in V is likewise indivisible: the load, add, and store of the semaphore occur without interruption. Note that the definition of V does not specify the order in which waiting threads are restarted; the only requirement is that V restart exactly one waiting thread.

12.5.3 using semaphores to achieve mutual exclusion

Semaphores provide a convenient way to ensure mutually exclusive access to shared variables. The basic idea is to associate each shared variable (or related set of shared variables) with a semaphore. A semaphore that protects a shared variable in this way is called a binary semaphore, because its value is always 0 or 1. A binary semaphore whose purpose is to provide mutual exclusion is often called a mutex. Performing a P operation on a mutex is called locking the mutex; performing a V operation is called unlocking it. A thread that has locked a mutex but has not yet unlocked it is said to hold the mutex. A semaphore used as a counter for a set of available resources is called a counting semaphore.

The key idea is that the combination of P and V operations creates a set of states called a forbidden region. Because of the semaphore invariant, no feasible trajectory can include a state in the forbidden region. And since the forbidden region completely encloses the unsafe region, no feasible trajectory can touch any part of the unsafe region. Thus every feasible trajectory is safe, and the program correctly increments the counter regardless of the run-time ordering of instructions.

12.5.4 using semaphores to dispatch shared resources

Producer-consumer issues.

12.5.5 Synthesis: Pre-threading-based concurrent servers

In the concurrent servers seen so far, we create a new thread for each new client. A disadvantage of this approach is that each new client incurs the nontrivial cost of creating a thread. A prethreaded server tries to reduce this overhead by using a producer-consumer model. The server consists of a main thread and a set of worker threads. The main thread repeatedly accepts connection requests from clients and places the resulting connected descriptors in a bounded shared buffer. Each worker thread repeatedly removes a descriptor from the buffer, services the client, and then waits for the next descriptor.

12.6 using threading to improve parallelism

So far in our study of concurrency we have assumed that concurrent threads execute on a single processor. But many modern machines have multi-core processors, and concurrent programs usually run on such machines. Scheduling these concurrent threads in parallel across multiple cores, rather than sequentially on a single core, is critical to exploiting the parallelism in applications such as busy Web servers, database servers, and large scientific codes.

12.7.1 Thread safety

In our programming, we write thread-safe functions whenever possible, that is, functions that always produce correct results when called repeatedly from multiple concurrent threads. A function that fails this condition is called thread-unsafe. There are four classes of thread-unsafe functions:

1. Functions that do not protect shared variables. The fix is to protect them with P and V (locking) operations.

2. Functions that keep state across multiple invocations, such as functions that rely on static variables. The fix is to rewrite them so that they do not use static state, for example by having the caller pass the state in explicitly.

3. Functions that return a pointer to a static variable. The fix is lock-and-copy: lock a mutex, copy the static result to a caller-private buffer, then unlock.

4. Functions that call thread-unsafe functions.

12.7.4 Races

A race occurs when the correctness of a program depends on one thread reaching point x in its control flow before another thread reaches point y. Races usually happen because the programmer assumes the threads will follow some particular trajectory, forgetting the rule that a threaded program must work correctly for any feasible trajectory.

12.7.5 deadlock

1. Semaphores introduce a potentially nasty kind of run-time error called deadlock, in which a collection of threads is blocked, each waiting for a condition that will never be true. Progress graphs are an invaluable tool for understanding deadlock.

2. Important knowledge about deadlocks:

3. If a programmer uses P and V operations incorrectly so that the forbidden regions of two semaphores overlap, and an execution trajectory happens to reach the deadlock state d, then no further progress is possible, because the overlapping regions block progress in every legal direction. In other words, the program is deadlocked: each thread is waiting for a V operation that will never occur.

4. The overlapping forbidden regions induce a set of states called the deadlock region. Trajectories can enter the deadlock region, but they can never leave it.

5. Deadlocks are an especially difficult problem because they are not always predictable. Some lucky execution trajectories will skirt the deadlock region, while others will become trapped in it.

Resources

1. Teaching Materials: Chapters 11th and 12, detailed learning guidance: http://group.cnblogs.com/topic/73069.html

2. Course Materials: https://www.shiyanlou.com/courses/413 Experiment 10, Course invitation code: W7FQKW4Y

3. Run, think about, and read the code in the textbook; learning method: Http://www.cnblogs.com/rocedu/p/4837092.html.

4. Summary of Contents: Http://www.tuicool.com/articles/qAZnea

Problems encountered

1. Benefits of threads versus processes and I/O multiplexing

2. Some of the socket-function code is not fully understood, e.g., what MAKEWORD means.

