20145317 "Information Security System Design Fundamentals" Week 13 Study Summary

Textbook Learning Content Summary

Network Programming

The client-server programming model
- An application consists of a server process and one or more client processes
- A server process manages some resource, and provides a service for its clients by manipulating that resource
- Basic operations: Transactions
- A client-server transaction consists of four steps:
- When a client needs a service, it sends a request to the server to initiate a transaction.
- The server receives the request, interprets it, and manipulates its resource in the appropriate way.
- The server sends a response back to the client and waits for the next request.
- The client receives a response and processes it.
- Both the client and the server are processes
Internet
- For a host, a network is just another I/O device: data received from the network is copied from the adapter across the I/O and memory buses into memory, typically via DMA (direct memory access).
- Physically, a network is a hierarchical system organized by geographical proximity: at the lowest level is the LAN (local area network); the most popular LAN technology is Ethernet.
- Ethernet Segment
- Consists of some wires (cables) and a hub. Each wire has the same maximum bit bandwidth, and the hub slavishly copies every bit it receives on one port to every other port, so every host sees every bit.
- Each Ethernet adapter has a globally unique 48-bit address stored on the adapter's non-volatile memory.
- A host can send a chunk of bits called a frame to any other host on the segment. Each frame consists of a fixed number of header bits (identifying the frame's source address, destination address, and length) followed by a payload of data bits. Every host sees the frame, but only the destination host actually reads it.
- Multiple Ethernet segments can be connected by cables and bridges into larger LANs called bridged Ethernets. The wires connecting the bridges can have different bandwidths.
- Multiple incompatible LANs can be connected by specialized computers called routers to form an internet (interconnected network).
- An important property of an internet: it can consist of different, mutually incompatible LAN and WAN technologies, yet still let them all communicate. The solution is a layer of protocol software running on each host and router that smooths out the differences between the networks.
- Two basic capabilities provided by the Protocol
- Naming mechanism: Uniquely identifies a single host
- Delivery mechanism: defines a uniform way to bundle data bits into discrete chunks (packets)
Global IP Internet
- TCP/IP protocol family
- Internet clients and servers communicate using a mix of sockets interface functions and Unix I/O functions
- Consider the internet as a world-wide host collection that meets the following features:
- The host collection is mapped to a set of 32-bit IP addresses
- This set of IP addresses is mapped to a set of identifiers called Internet domain names
- A process on one Internet host can communicate with a process on any other Internet host over a connection
- Retrieve and print a DNS host entry:
```c
#include "csapp.h"

int main(int argc, char **argv)
{
    char **pp;
    struct in_addr addr;
    struct hostent *hostp;

    if (argc != 2) {
        fprintf(stderr, "usage: %s <domain name or dotted-decimal>\n", argv[0]);
        exit(0);
    }

    if (inet_aton(argv[1], &addr) != 0)
        hostp = Gethostbyaddr((const char *)&addr, sizeof(addr), AF_INET);
    else
        hostp = Gethostbyname(argv[1]);

    printf("official hostname: %s\n", hostp->h_name);

    for (pp = hostp->h_aliases; *pp != NULL; pp++)
        printf("alias: %s\n", *pp);

    for (pp = hostp->h_addr_list; *pp != NULL; pp++) {
        addr.s_addr = ((struct in_addr *)*pp)->s_addr;
        printf("address: %s\n", inet_ntoa(addr));
    }
    exit(0);
}
```
Socket interface
- Function:
- Socket function
- Connect function
- open_clientfd function
- Bind function
- Listen function
- open_listenfd function
- Accept function
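A minimal sketch of how the functions above fit together, in the style of CS:APP's getaddrinfo-based helpers but without the csapp.h error-handling wrappers (flag choices are simplified for illustration):

```c
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/* open_clientfd - resolve host:port with getaddrinfo, then try each
 * returned address until one connects. Returns a connected
 * descriptor, or -1 on error. */
int open_clientfd(const char *host, const char *port)
{
    struct addrinfo hints, *listp, *p;
    int clientfd = -1;

    memset(&hints, 0, sizeof(hints));
    hints.ai_socktype = SOCK_STREAM;     /* connection-based */
    hints.ai_flags = AI_NUMERICSERV;     /* port is a numeric string */
    if (getaddrinfo(host, port, &hints, &listp) != 0)
        return -1;

    for (p = listp; p; p = p->ai_next) {
        clientfd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
        if (clientfd < 0)
            continue;
        if (connect(clientfd, p->ai_addr, p->ai_addrlen) == 0)
            break;                       /* success */
        close(clientfd);
        clientfd = -1;
    }
    freeaddrinfo(listp);
    return clientfd;
}

/* open_listenfd - bind a listening descriptor to the given port and
 * mark it ready to accept connection requests. */
int open_listenfd(const char *port)
{
    struct addrinfo hints, *listp, *p;
    int listenfd = -1, optval = 1;

    memset(&hints, 0, sizeof(hints));
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_PASSIVE | AI_NUMERICSERV;  /* any local address */
    if (getaddrinfo(NULL, port, &hints, &listp) != 0)
        return -1;

    for (p = listp; p; p = p->ai_next) {
        listenfd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
        if (listenfd < 0)
            continue;
        /* avoid "address already in use" on server restarts */
        setsockopt(listenfd, SOL_SOCKET, SO_REUSEADDR,
                   &optval, sizeof(int));
        if (bind(listenfd, p->ai_addr, p->ai_addrlen) == 0)
            break;
        close(listenfd);
        listenfd = -1;
    }
    freeaddrinfo(listp);
    if (listenfd >= 0 && listen(listenfd, 1024) < 0) {
        close(listenfd);
        return -1;
    }
    return listenfd;
}
```

A server calls open_listenfd once and then accept in a loop; a client calls open_clientfd and then reads and writes the returned descriptor with ordinary Unix I/O.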
Web server
- Web clients and servers interact using a text-based application-level protocol called HTTP (Hypertext Transfer Protocol). HTTP is a simple protocol: a web client (i.e., a browser) opens an Internet connection to a server and requests some content; the server responds with the requested content and then closes the connection; the browser reads the content and displays it on the screen.
- Web content can be written in a language called HTML (Hypertext Markup Language). An HTML program (page) contains instructions (tags) that tell the browser how to display the various text and graphical objects on the page.
- The WEB server provides content to the client in two different ways:
- Fetch a disk file and return its contents to the client. The disk file is called static content, and the process of returning the file to the client is known as serving static content.
- Run an executable file and return its output to the client. The output the executable produces at run time is called dynamic content, and the process of running the program and returning its output to the client is known as serving dynamic content.
Concurrent programming
- Concurrency: Logical control flows overlap in time
- Concurrent programs: Applications that use application-level concurrency are called concurrent programs.
- Three basic ways to construct concurrent programs:
- Processes: scheduled and maintained by the kernel; each flow has a separate virtual address space, so flows that want to share data must use explicit interprocess communication (IPC) mechanisms.
- I/O multiplexing: the application explicitly schedules its own logical flows in the context of a single process; the logical flows are modeled as state machines.
- Threads: logical flows that run in the context of a single process, scheduled by the kernel and sharing the same virtual address space.
Process-based concurrency programming
- The natural way to construct a concurrent server is to accept the client connection request in the parent process and then create a new child process to serve each new client.
- Because the connected descriptor in the parent and the child points to the same file table entry, it is crucial that the parent close its copy of the connected descriptor; otherwise the file table entry is never released, and the resulting memory leak will eventually consume all available memory and crash the system.
- The focus of the process-based concurrent echo server:
- A SIGCHLD handler is required to reclaim the resources of the zombie subprocess.
- The parent and child must each close their copy of connfd. This is especially important for the parent, in order to avoid a memory leak.
- The reference count in the socket's file table entry does not reach zero, terminating the connection to the client, until both the parent's and the child's copies of connfd are closed.
- Process model: processes share the file table but do not share user address spaces.
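The file-table reference counting above can be demonstrated with a small experiment: a pipe stands in for the connection, and dup stands in for the second copy of the descriptor the child would inherit across fork. EOF is delivered to the other end only once every copy is closed (the function name and the use of a pipe are illustrative):

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Demonstrates why the parent in a process-based server must close
 * its copy of connfd: the peer (here the read end of a pipe) sees
 * EOF only when EVERY descriptor referencing the file table entry
 * has been closed. Returns 1 if the observed behavior matches. */
int demo_refcount(void)
{
    int fds[2];
    char buf[1];

    if (pipe(fds) < 0)
        return 0;
    fcntl(fds[0], F_SETFL, O_NONBLOCK);  /* so probe reads don't block */

    int copy = dup(fds[1]);  /* like the child's copy of connfd after fork */
    close(fds[1]);           /* "parent" closes its copy: refcount 2 -> 1 */

    /* Still no EOF: read reports "no data yet" (EAGAIN), not end-of-file */
    if (read(fds[0], buf, 1) != -1 || errno != EAGAIN)
        return 0;

    close(copy);             /* "child" closes too: refcount 1 -> 0 */
    int got_eof = (read(fds[0], buf, 1) == 0);  /* 0 bytes == EOF */
    close(fds[0]);
    return got_eof;
}
```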
- About the pros and cons of the process:
- Advantage: one process cannot accidentally overwrite the virtual memory of another process.
- Cons: Separate address spaces make it more difficult for processes to share state information. The overhead of process control and IPC is high.
- Unix IPC refers to all techniques that allow processes to communicate with other processes on the same host, including pipes, FIFOs, System V shared memory, and System V semaphores.
Concurrent programming based on I/O multiplexing
- The echo server must respond to two mutually independent I/O events:
- Network Client initiates connection request
- The user types the command line on the keyboard
- The basic idea of I/O multiplexing: use the select function to ask the kernel to suspend the process, returning control to the application only after one or more I/O events have occurred.
- A descriptor set is treated as an n-bit vector: b(n-1), ..., b1, b0. Each bit bk corresponds to descriptor k, and descriptor k is a member of the set if and only if bk = 1. Three things can be done with descriptor sets:
- allocate them;
- assign one variable of this type to another;
- modify and inspect them using the FD_ZERO, FD_SET, FD_CLR, and FD_ISSET macros.
echo function: echoes each text line received from the client until the client closes the connection.
- A state machine is a collection of states, input events, and transitions that map states and input events to states. A self-loop is a transition from a state back to itself.
- The design benefits of the event drive:
- It gives programmers more control over program behavior than process-based designs
- It runs in the context of a single process, so every logical flow can access the entire address space of the process, making it easy to share data between flows
- No process context switch is required to schedule a new flow
- Disadvantages:
- Complex coding
- Cannot take full advantage of multi-core processors
Granularity: the number of instructions each logical flow executes per time slice. Here the concurrency granularity is the number of instructions needed to read one complete text line.
Thread-based concurrent programming
- Thread: A logical flow that runs in the context of a process.
Threads have their own thread context, including a unique integer thread ID (TID), stack, stack pointer, program counter, general-purpose registers, and condition codes. All threads running in a process share the entire virtual address space of that process.
Threading Execution Model
- Main thread: Each process starts its life cycle as a single thread.
- Peer thread: A peer thread created by the main thread at some point.
- The thread differs from the process:
- The context switch of a thread is much faster than the context switch of a process;
- The threads associated with a process form a pool of peers, independent of which thread created which;
- The main thread differs from other threads only in that it is always the first one running in the process.
- An impact of the peer pool: a thread can kill any of its peers, or wait for any of its peers to terminate
Thread routines: the code and local data for a thread are encapsulated in a thread routine. Each thread routine takes a single generic pointer as input and returns a generic pointer.
Creating Threads
The pthread_create function creates a new thread and runs the thread routine f in the context of the new thread, with input argument arg. The new thread can determine its own thread ID by calling the pthread_self function.
Terminating a thread
- How a thread is terminated:
- The thread terminates implicitly when its top-level thread routine returns.
- The thread terminates explicitly by calling the pthread_exit function. If the main thread calls pthread_exit, it waits for all other peer threads to terminate, and then terminates the main thread and the entire process.
Reclaim Resources for terminated threads
The pthread_join function blocks until thread tid terminates, then reclaims any memory resources held by the terminated thread. pthread_join can only wait for a specific thread to terminate.
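The create/join lifecycle can be sketched as follows (the thread routine and its doubling computation are purely illustrative):

```c
#include <pthread.h>
#include <stdio.h>

/* Thread routine: takes a generic pointer, returns a generic pointer. */
static void *thread(void *vargp)
{
    long input = (long)vargp;
    printf("peer thread id: %lu\n", (unsigned long)pthread_self());
    return (void *)(input * 2);   /* value later reclaimed by pthread_join */
}

/* Create one peer thread, block until it terminates, and collect
 * its return value. */
long run_peer(long input)
{
    pthread_t tid;
    void *result;

    pthread_create(&tid, NULL, thread, (void *)input);
    pthread_join(tid, &result);   /* blocks until tid terminates */
    return (long)result;
}
```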
Detach thread
- At any point in time, a thread is either joinable or detached. A joinable thread can be reaped or killed by other threads; a detached thread cannot be reaped or killed by other threads, and its memory resources are released automatically when it terminates.
- By default, threads are created joinable. To avoid memory leaks, each joinable thread should either be explicitly reaped by another thread or be detached by a call to the pthread_detach function.
Initializing threads
- The pthread_once function allows you to initialize the state associated with a thread routine.
- The once_control variable is a global or static variable that is always initialized to PTHREAD_ONCE_INIT.
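A small sketch of pthread_once guaranteeing one-time initialization no matter how many threads call through it (the names init_count and get_init_count are illustrative):

```c
#include <pthread.h>

static pthread_once_t once_control = PTHREAD_ONCE_INIT;
static int init_count = 0;

/* Runs exactly once across all callers of pthread_once below. */
static void init_routine(void)
{
    init_count++;
}

/* Every thread calls this before touching the shared state; only the
 * first call actually executes init_routine. */
int get_init_count(void)
{
    pthread_once(&once_control, init_routine);
    return init_count;
}
```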
A thread-based concurrent server
- A race is introduced between the assignment statement in the peer thread (which dereferences the connected-descriptor pointer) and the next accept statement in the main thread.
Variable sharing thread memory model in multi-threaded programs
- Each thread shares the remainder of the process context with other threads. This includes the entire user virtual address space, which consists of read-only text, read/write data, heaps, and all shared library code and data regions. Threads also share the same set of open files.
Any thread can access any location in the shared virtual memory. Registers are never shared, whereas virtual memory is always shared.
mapping variables to storage
- Global variables: the read/write area of virtual memory contains exactly one instance of each global variable.
- Local automatic variables: variables defined inside a function without the static attribute; each thread's stack contains its own instances of them.
- Local static variables: variables defined inside a function with the static attribute; as with globals, the read/write area contains exactly one instance.
Shared variables
A variable v is shared if and only if one of its instances is referenced by more than one thread.
Synchronizing Threads with semaphores
- Shared variables introduce the possibility of synchronization errors.
- The loop code for thread i is decomposed into five parts:
- Hi: the block of instructions at the head of the loop
- Li: the instruction that loads the shared variable cnt into register %eax, where %eaxi denotes the value of %eax in thread i
- Ui: the instruction that updates (increments) %eaxi
- Si: the instruction that stores the updated value of %eaxi back into the shared variable cnt
- Ti: the block of instructions at the tail of the loop
Progress map
- A progress graph models the execution of an instruction as a transition from one state to another. A transition is represented as a directed edge from one point to an adjacent point. Legal transitions move to the right or up.
- Critical section: For thread I, instructions for manipulating the contents of a shared variable CNT form a critical section.
- Mutually exclusive access: ensures that each thread has exclusive access to shared variables when executing instructions in its critical section.
- Safe trajectory: a trajectory that skirts the unsafe region
- Unsafe trajectory: a trajectory that touches any part of the unsafe region
- Any safe trajectory correctly updates the shared counter.
Semaphores
- When there are multiple threads waiting for the same semaphore, you cannot predict which thread the V operation will restart.
- Semaphore invariance: A running program must never enter a state in which a correctly initialized semaphore has a negative value.
Using semaphores to achieve mutual exclusion
- Binary semaphore: associate a semaphore s (initially 1) with each shared variable, and surround the corresponding critical section with P(s) and V(s) operations to protect the shared variable.
- Mutex: a binary semaphore whose purpose is to provide mutual exclusion
- Locking: performing a P operation on a mutex is called locking the mutex; a V operation is called unlocking it. A thread that has locked a mutex but not yet unlocked it is said to hold the mutex.
- Counting semaphore: a semaphore used as a counter for a set of available resources
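Wrapping the critical section cnt++ (the Li/Ui/Si instructions above) in P and V makes the update atomic regardless of how the two threads interleave. A sketch using POSIX semaphores, with sem_wait as P and sem_post as V (the iteration count and function names are illustrative):

```c
#include <pthread.h>
#include <semaphore.h>

#define NITERS 100000

static volatile long cnt = 0;  /* shared counter */
static sem_t mutex;            /* binary semaphore protecting cnt */

/* Each thread increments cnt NITERS times, holding the mutex
 * across the load/update/store of the critical section. */
static void *count_thread(void *vargp)
{
    (void)vargp;
    for (long i = 0; i < NITERS; i++) {
        sem_wait(&mutex);      /* P: lock */
        cnt++;                 /* critical section */
        sem_post(&mutex);      /* V: unlock */
    }
    return NULL;
}

/* Run two counting threads; with the semaphore the final value is
 * always exactly 2 * NITERS (without it, it usually is not). */
long run_counters(void)
{
    pthread_t t1, t2;

    cnt = 0;
    sem_init(&mutex, 0, 1);    /* binary semaphore, initially 1 */
    pthread_create(&t1, NULL, count_thread, NULL);
    pthread_create(&t2, NULL, count_thread, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    sem_destroy(&mutex);
    return cnt;
}
```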
Using semaphores to dispatch shared resources
- Semaphores serve two purposes: (1) providing mutual exclusion; (2) scheduling access to shared resources
- Producer-consumer problem: producers generate items and insert them into a bounded buffer; consumers remove the items from the buffer and then consume them.
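A sketch of a bounded buffer in the style of CS:APP's sbuf package, using three semaphores: mutex protects the buffer itself, slots counts empty slots (producers wait on it), and items counts available items (consumers wait on it). The fixed capacity SBUF_N is an illustrative simplification of sbuf's dynamic size:

```c
#include <semaphore.h>

#define SBUF_N 16

typedef struct {
    int buf[SBUF_N];  /* item slots */
    int front, rear;  /* buf[(front+1)%n] is first, buf[rear%n] is last */
    sem_t mutex;      /* protects access to buf */
    sem_t slots;      /* counts empty slots */
    sem_t items;      /* counts available items */
} sbuf_t;

void sbuf_init(sbuf_t *sp)
{
    sp->front = sp->rear = 0;
    sem_init(&sp->mutex, 0, 1);
    sem_init(&sp->slots, 0, SBUF_N);
    sem_init(&sp->items, 0, 0);
}

/* Producer side: wait for an empty slot, insert at the rear,
 * announce a new item. */
void sbuf_insert(sbuf_t *sp, int item)
{
    sem_wait(&sp->slots);
    sem_wait(&sp->mutex);
    sp->buf[(++sp->rear) % SBUF_N] = item;
    sem_post(&sp->mutex);
    sem_post(&sp->items);
}

/* Consumer side: wait for an available item, remove from the front,
 * announce a newly freed slot. */
int sbuf_remove(sbuf_t *sp)
{
    int item;
    sem_wait(&sp->items);
    sem_wait(&sp->mutex);
    item = sp->buf[(++sp->front) % SBUF_N];
    sem_post(&sp->mutex);
    sem_post(&sp->slots);
    return item;
}
```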
- Readers-writers problems:
- Favoring readers: requires that no reader be kept waiting unless permission to use the object has already been granted to a writer.
- Favoring writers: requires that once a writer is ready to write, it performs its write as soon as possible.
- Starvation: a thread blocks indefinitely and fails to make progress.
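A sketch of the readers-favored solution with two semaphores: w serializes writers (and the first/last reader), while mutex protects the reader count. The writers_locked_out helper is added here only so the state can be inspected; it is not part of the classic solution:

```c
#include <semaphore.h>

static int readcnt;         /* number of readers currently inside */
static sem_t mutex, w;      /* mutex protects readcnt; w gates writers */

void rw_init(void)
{
    readcnt = 0;
    sem_init(&mutex, 0, 1);
    sem_init(&w, 0, 1);
}

/* First reader in locks out writers; later readers walk straight in,
 * which is why writers can starve under this scheme. */
void reader_lock(void)
{
    sem_wait(&mutex);
    if (++readcnt == 1)
        sem_wait(&w);
    sem_post(&mutex);
}

/* Last reader out lets writers back in. */
void reader_unlock(void)
{
    sem_wait(&mutex);
    if (--readcnt == 0)
        sem_post(&w);
    sem_post(&mutex);
}

/* Writers simply take exclusive access. */
void writer_lock(void)   { sem_wait(&w); }
void writer_unlock(void) { sem_post(&w); }

/* Inspection helper (illustrative): 1 if w is currently held. */
int writers_locked_out(void)
{
    int v;
    sem_getvalue(&w, &v);
    return v == 0;
}
```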
Using threading to improve parallelism
- A sequential program has exactly one logical flow; a concurrent program has multiple concurrent flows; a parallel program is a concurrent program running on multiple processors. The set of parallel programs is a proper subset of the set of concurrent programs.
Other Concurrency issues thread safety
- Thread-safe: a function is thread-safe if and only if it always produces correct results when called repeatedly from multiple concurrent threads.
- Thread-unsafe: a function that is not thread-safe.
- Thread unsafe classes:
- Functions that do not protect shared variables
- Functions that keep state across multiple invocations
- Functions that return a pointer to a static variable. Workarounds: rewrite the function so the caller passes in storage for the result, or use the lock-and-copy technique.
- Functions that call thread-unsafe functions
Reentrancy
- Reentrant functions: functions that reference no shared data when called by multiple threads. The set of reentrant functions is a proper subset of the set of thread-safe functions.
- The key idea is that we replace the static next variable with a pointer passed in by a caller.
- Explicitly reentrant: no pointer arguments, and no references to static or global variables
- Implicitly reentrant: pointer arguments are allowed, but the calling threads pass pointers to non-shared data
- Reentrancy is therefore sometimes a property of both the caller and the callee, not just the callee alone.
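The "replace the static variable with a caller-supplied pointer" idea is exactly how rand_r improves on rand. A sketch of such a reentrant generator (implicitly reentrant, since it takes a pointer to caller-private state; my_rand_r is an illustrative name, and the constants are the classic linear congruential ones):

```c
/* A thread-unsafe rand() would keep its seed in a hidden static
 * variable shared by all callers. This version keeps the state in
 * *nextp, which each caller (each thread) owns privately. */
unsigned int my_rand_r(unsigned int *nextp)
{
    *nextp = *nextp * 1103515245 + 12345;  /* update caller's state */
    return (*nextp / 65536) % 32768;       /* value in [0, 32767] */
}
```

Two threads with their own seed variables now get independent, reproducible streams with no synchronization at all.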
Using existing library functions in threaded programs
- Use the reentrant versions of thread-unsafe functions, whose names end with the suffix _r.
Races
- Race: a race occurs when the correctness of a program depends on one thread reaching point x in its control flow before another thread reaches point y.
- A threaded program must work correctly on any of the possible trajectory lines.
- Eliminating the race: dynamically allocate a separate block for each integer ID, and pass the thread routine a pointer to that block
Deadlock
- Deadlock: A set of threads is blocked, waiting for a condition that will never be true.
- Programmers using P and V incorrectly, so that the forbidden regions of two semaphores overlap, can cause deadlock.
- The overlapping forbidden regions induce a set of states called the deadlock region.
- Deadlocks are unpredictable.
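One standard way to rule out the overlapping forbidden regions is a lock-ordering rule: every thread acquires its locks in the same global order, so no cycle of waiting threads can form. A sketch with two pthreads mutexes (the names a, b, and update_both are illustrative; the same discipline applies to binary semaphores):

```c
#include <pthread.h>

static pthread_mutex_t a = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t b = PTHREAD_MUTEX_INITIALIZER;

/* Every thread that needs both locks takes a before b, never the
 * reverse, so the deadlock region cannot be entered. */
void update_both(int *x, int *y)
{
    pthread_mutex_lock(&a);   /* always first */
    pthread_mutex_lock(&b);   /* always second */
    (*x)++;
    (*y)++;
    pthread_mutex_unlock(&b);
    pthread_mutex_unlock(&a);
}
```

If a second routine instead locked b then a, two threads could each hold one lock and wait forever for the other: exactly the overlapping forbidden regions described above.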
This week's code hosting
Resources
- Study guide for Computer Systems: A Programmer's Perspective (2nd Edition)
- 2016-2017-1 teaching process for "Fundamentals of Information Security System Design"
- Textbook reading and weekly exam focus
- Code-driven programming learning