Random numbers have important applications in many fields, such as Monte Carlo simulation, cryptography, and network security. The quality of the random numbers directly affects the reliability and security of network security systems and the reliability of Monte Carlo simulation results. Ever since computers first appeared, using them to produce high-quality random number sequences has been a long-standing research topic. The Linux kernel has included a strong random number generator since version 1.3.30. Based on the source code of the Linux 2.6.10 kernel, this article analyzes the design and implementation of that random number generator in detail.

1. Basic Principles

The Linux kernel uses entropy to describe the randomness of data. Entropy is a physical quantity that describes the degree of disorder of a system. The larger the entropy of a system, the less ordered the system is, that is, the greater its uncertainty. In information theory, entropy is used to characterize the uncertainty of a symbol or system. The larger the entropy, the greater the uncertainty and the harder the data is to predict.

A computer is itself a predictable system, so it is impossible to generate truly random numbers with a computer algorithm alone. However, the machine's environment is full of noise of various kinds, such as the times at which hardware devices raise interrupts and the intervals between a user's mouse clicks. These are effectively random and cannot be predicted in advance. The random number generator implemented by the Linux kernel uses this random noise in the system to generate high-quality random number sequences.

The kernel maintains an entropy pool to collect environmental noise from device drivers and other sources. In theory, the data in the entropy pool is completely random and can be used to generate sequences of truly random numbers. To track the randomness of the data in the pool, the kernel estimates the randomness of each piece of data as it is added to the pool. This process is called entropy estimation. The entropy estimate describes the number of random bits held in the pool. A larger value indicates that the data in the pool is more random.
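
The relationship between the pool and its entropy estimate can be illustrated with a small user-space sketch. This is not the kernel's struct entropy_store; the names toy_pool, toy_mix and toy_credit, the pool size, and the rotate-and-XOR mixing step are all assumptions made for illustration. The point it shows is that mixing data into the pool and crediting the entropy estimate are two separate operations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define POOL_WORDS 32                 /* hypothetical pool size, in 32-bit words */
#define POOL_BITS  (POOL_WORDS * 32)  /* the estimate can never exceed this */

/* Toy stand-in for an entropy pool (field names are assumptions). */
struct toy_pool {
    uint32_t words[POOL_WORDS];  /* the pooled noise */
    size_t   pos;                /* next word to mix into */
    int      entropy_bits;       /* running entropy estimate, in bits */
};

static void toy_pool_init(struct toy_pool *p)
{
    assert(p != NULL);
    memset(p, 0, sizeof *p);
}

/* Mix one word of noise into the pool. The estimate is left untouched. */
static void toy_mix(struct toy_pool *p, uint32_t sample)
{
    assert(p != NULL);
    uint32_t w = p->words[p->pos] ^ sample;
    p->words[p->pos] = (w << 7) | (w >> 25);  /* rotate left by 7 bits */
    p->pos = (p->pos + 1) % POOL_WORDS;
}

/* Adjust the entropy estimate, clamped to the range [0, POOL_BITS]. */
static void toy_credit(struct toy_pool *p, int bits)
{
    assert(p != NULL);
    int total = p->entropy_bits + bits;
    if (total < 0)
        total = 0;
    if (total > POOL_BITS)
        total = POOL_BITS;
    p->entropy_bits = total;
}
```

In this sketch, a caller would invoke toy_mix() for every sample and then toy_credit() with however many bits of randomness it believes the sample carried.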

2. Design and Implementation

The random number generator in the Linux kernel is implemented as a character device in drivers/char/random.c. The module initialization function rand_initialize() calls create_entropy_store() to create three entropy pools: random_state (the default pool), sec_random_state, and urandom_state. Each entropy pool is represented by a struct entropy_store.

The kernel implements a series of interface functions to obtain the noise data of the system environment and add them to the entropy pool:

void add_interrupt_randomness(int irq);

void add_keyboard_randomness(unsigned char scancode);

void add_mouse_randomness(__u32 mouse_data);

void add_disk_randomness(struct gendisk *disk);

The add_interrupt_randomness() function uses the interval between two interrupts from a device as its noise source and adds the resulting random data to the entropy pool. For a device's interrupts to serve as a system noise source, the device's interrupt service routine must be registered with the SA_SAMPLE_RANDOM flag. The system then calls add_interrupt_randomness() automatically to add entropy to the pool whenever the device raises an interrupt.

add_keyboard_randomness() uses the scan code of the key and the time interval between two key presses as its noise source, while add_mouse_randomness() uses the mouse position and the time interval between two consecutive mouse interrupts to fill the entropy pool. The add_disk_randomness() function uses the interval between two consecutive disk operations as its noise source.

All of the functions above add entropy to the pool by calling add_timer_randomness(). add_timer_randomness() first estimates the entropy of the data being added and then calls batch_entropy_store() to add the data to the entropy pool. To keep long interrupt latency from hurting system performance, batch_entropy_store() does not add the entropy to the pool directly. Instead, it places it on a queue. When the queue reaches a certain length, the keventd kernel thread adds the queued entropy to the pool by calling batch_entropy_process().
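
To make the idea of estimating entropy from event timing more concrete, here is a minimal user-space sketch. It is only loosely modelled on a difference-based approach and is not a copy of add_timer_randomness(): the names toy_timer_state and toy_estimate_entropy, the use of first, second and third order time differences, and the cap of 11 bits are assumptions of this sketch. It credits roughly the base-2 logarithm of the smallest difference, so perfectly regular events earn no credit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Timing history kept per noise source (hypothetical names). */
struct toy_timer_state {
    int64_t last_time;
    int64_t last_delta;
    int64_t last_delta2;
};

/* Return an entropy estimate, in bits, for an event that happened at `now`. */
static int toy_estimate_entropy(struct toy_timer_state *s, int64_t now)
{
    assert(s != NULL);

    /* First, second and third order differences of the event times. */
    int64_t delta  = now - s->last_time;
    int64_t delta2 = delta - s->last_delta;
    int64_t delta3 = delta2 - s->last_delta2;

    s->last_time   = now;
    s->last_delta  = delta;
    s->last_delta2 = delta2;

    /* Use the smallest absolute difference: the most pessimistic view. */
    int64_t m = llabs(delta);
    if (llabs(delta2) < m)
        m = llabs(delta2);
    if (llabs(delta3) < m)
        m = llabs(delta3);

    /* floor(log2(m)), capped at 11 bits (the cap is an assumption). */
    int bits = 0;
    while (m > 1 && bits < 11) {
        m >>= 1;
        bits++;
    }
    return bits;
}
```

With a zero-initialized state, events arriving at a fixed interval quickly drive the estimate to zero, because the second and third order differences vanish. An event that breaks the pattern earns credit again.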

The batch_entropy_process() function walks through each entry in the queue and calls add_entropy_words() on it to add it to the entropy pool. add_entropy_words() does not itself update the pool's entropy estimate. Therefore, after calling add_entropy_words() for an entry, batch_entropy_process() immediately calls credit_entropy_store() to update the entropy estimate.
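
The split between a cheap store step and a deferred process step can be sketched in user space as follows. The names toy_batch, toy_batch_store and toy_batch_process, the queue capacity, and the one-word pool are hypothetical. In particular, this sketch leaves it to the caller to decide when to drain the queue, whereas the kernel hands that work to keventd.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define QUEUE_LEN 8   /* hypothetical queue capacity */

struct toy_sample {
    uint32_t data;    /* the noise sample */
    int      credit;  /* entropy estimated for this sample, in bits */
};

struct toy_batch {
    struct toy_sample queue[QUEUE_LEN];
    size_t   count;         /* samples waiting to be processed */
    uint32_t pool;          /* one-word stand-in for the entropy pool */
    int      entropy_bits;  /* entropy estimate of the pool */
};

static void toy_batch_init(struct toy_batch *b)
{
    assert(b != NULL);
    b->count = 0;
    b->pool = 0;
    b->entropy_bits = 0;
}

/* Fast path: only enqueue. Returns 1 if stored, 0 if the queue was full. */
static int toy_batch_store(struct toy_batch *b, uint32_t data, int credit)
{
    assert(b != NULL);
    if (b->count == QUEUE_LEN)
        return 0;                 /* drop the sample rather than wait */
    b->queue[b->count].data = data;
    b->queue[b->count].credit = credit;
    b->count++;
    return 1;
}

/* Deferred path: mix every queued sample, then credit its estimate. */
static void toy_batch_process(struct toy_batch *b)
{
    assert(b != NULL);
    for (size_t i = 0; i < b->count; i++) {
        /* Mixing step: rotate the pool word and XOR the sample in. */
        b->pool = ((b->pool << 5) | (b->pool >> 27)) ^ b->queue[i].data;
        /* Crediting step: a separate update of the estimate. */
        b->entropy_bits += b->queue[i].credit;
    }
    b->count = 0;
}
```

Until toy_batch_process() runs, stored samples affect neither the pool nor the estimate, which is the property that keeps the fast path short.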

Corresponding to the input interfaces, the Linux kernel also implements a series of output interfaces that deliver random number sequences to user space or to other modules in the kernel.

The function void get_random_bytes(void *buf, int nbytes) is used to supply random numbers to other kernel modules. It extracts a random number sequence of nbytes bytes from an entropy pool and stores it in buf. The function first tries the urandom_state pool. If the urandom_state pool does not exist, it takes the data from the sec_random_state pool. Only when neither of these exists does it fall back to the default pool, random_state. get_random_bytes() always returns a random number sequence of nbytes bytes, even if the pool's entropy estimate is 0.

In addition, the kernel provides two character devices: /dev/random and /dev/urandom. Their read functions, random_read() and urandom_read(), deliver random number sequences to user-mode programs. Upper-layer applications obtain random number sequences from these devices with the read system call. Compared with /dev/urandom, the random sequences output by /dev/random are of higher quality and are suitable for strong encryption algorithms. /dev/urandom always returns the requested random number sequence, whether or not the pool's entropy estimate is zero. /dev/random returns only the longest sequence that the entropy estimate allows. When the entropy estimate is zero, the request blocks until the estimate rises to a certain value.
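
From user space, reading one of these devices might look like the sketch below. The helper name read_random is hypothetical. It uses only the standard open, read and close calls, and it loops because a single read may return fewer bytes than requested, which is especially likely with /dev/random.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Fill buf with len bytes read from the device at path.
 * Returns 0 on success and -1 on any error. */
static int read_random(const char *path, unsigned char *buf, size_t len)
{
    assert(path != NULL && buf != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    size_t got = 0;
    while (got < len) {
        ssize_t n = read(fd, buf + got, len - got);
        if (n < 0 && errno == EINTR)
            continue;            /* interrupted by a signal: retry */
        if (n <= 0) {            /* real error or unexpected end of file */
            close(fd);
            return -1;
        }
        got += (size_t)n;
    }

    close(fd);
    return 0;
}
```

A caller that needs, say, a 16-byte key could call read_random("/dev/urandom", key, 16). Passing "/dev/random" instead may cause the call to block while the kernel waits for more entropy.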

All of the output interfaces above ultimately produce their random number sequences by calling extract_entropy(). The extract_entropy() function hashes the entropy pool with the SHA or MD5 algorithm and returns the hash result to the caller as the random number sequence, which avoids exposing the contents of the entropy pool directly. Because the probability of deriving the original data from the output of SHA or MD5 is almost zero, this design greatly improves security. Attackers can neither access the entropy pool directly nor predict future sequences from the random numbers output in the past.
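
The idea of handing out a hash of the pool rather than the pool itself can be sketched as follows. Two caveats apply. First, the sketch uses FNV-1a, which is a simple non-cryptographic hash chosen only to keep the example short. It is a stand-in for SHA or MD5 and provides none of their security. Second, the names toy_hash and toy_extract, the pool size, and the step that folds the digest back into the pool are assumptions of this sketch rather than a description of extract_entropy().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_WORDS 8   /* hypothetical pool size, in 32-bit words */

/* FNV-1a over the bytes of the pool words.
 * NOT cryptographic: a placeholder for the SHA or MD5 step. */
static uint32_t toy_hash(const uint32_t *words, size_t n)
{
    uint32_t h = 2166136261u;              /* FNV offset basis */
    for (size_t i = 0; i < n; i++) {
        for (int shift = 0; shift < 32; shift += 8) {
            h ^= (words[i] >> shift) & 0xffu;
            h *= 16777619u;                /* FNV prime */
        }
    }
    return h;
}

/* Return one output word derived from the pool. The caller receives
 * only the digest and never reads the pool contents. */
static uint32_t toy_extract(uint32_t *pool, size_t *pos)
{
    assert(pool != NULL && pos != NULL && *pos < POOL_WORDS);

    uint32_t digest = toy_hash(pool, POOL_WORDS);

    /* Fold the digest back in so the pool state changes between calls
     * (an assumption of this sketch). */
    pool[*pos] ^= digest;
    *pos = (*pos + 1) % POOL_WORDS;

    return digest;
}
```

With a real cryptographic hash in place of toy_hash(), an observer who sees the outputs would have no practical way to reconstruct the pool words that produced them.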

When the system boots, the startup process is deterministic and predictable. At that point the entropy of the pool is very low, which reduces the quality of the random number sequences and may give attackers an opportunity to break them. To overcome the predictability of the startup process, the Linux operating system saves the contents of the current entropy pool when the system shuts down. On the next boot, it restores the entropy pool data saved at the last shutdown, which effectively raises the pool's entropy estimate and prevents the quality of the random number sequences from degrading.