Linus recently argued that massively parallel technology is not feasible. After carefully analyzing the parallel technology of Chinese-Australian Sinox, the picture of the Chinese-Australian matrix computer has gradually become clear to me, and it may soon become a reality.
Parallel technology
Parallelism means that multiple instructions are executed at the same instant on multiple processors.
A matrix here generally means an array of 2 to the n-th power elements, a power of two because data processing is binary; 2 to the 10th power, for example, is 1024. A matrix computer can be understood as 2 to the n-th power compute units computing in parallel.
Parallel computing is the process of solving a computational problem by using multiple computing resources simultaneously. It is an effective way to improve the computing speed and processing capacity of a computer system. The basic idea is to have multiple processors solve the same problem together: the problem is broken into a number of parts, and each part is computed in parallel by a separate processor.
As for multiple processors, a single processor now has multiple cores, which is equivalent to putting many single-core processors together. There are processors with 2, 4, 8, and 64 cores.
Personal computers now use as many as 18 cores.
A matrix computer should have hundreds or even thousands of cores. Current operating systems already support hundreds of cores.
The core idea of parallel computing is to decompose a computation and then assign each piece to a separate core for execution.
In a modern operating system, the smallest unit that is scheduled onto a CPU is a thread. Our implementation of parallel computing breaks a task down into threads and lets the operating system execute them. Once the threads have finished, their results are merged to produce the final result.
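The decompose, execute, and merge steps described above can be sketched as follows. This is a minimal illustration, not the Sinox implementation: the function names and the sum-of-squares workload are assumptions chosen for the example. Note that in standard CPython builds the global interpreter lock keeps CPU-bound threads from using several cores at once, so the sketch shows the structure rather than a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, parts):
    """Cut data into `parts` nearly equal slices (hypothetical helper)."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def partial_sum(chunk):
    """Work done by one thread: sum the squares in its own slice."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, workers=4):
    chunks = split(data, workers)                       # 1. decompose the task
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_sum, chunks))  # 2. one task per slice
    return sum(partials)                                # 3. merge the results


if __name__ == "__main__":
    print(parallel_sum_of_squares(list(range(1000))))
```

Replacing ThreadPoolExecutor with ProcessPoolExecutor from the same module is the usual way to spread CPU-bound Python work across cores.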
Running threads involves synchronization, shared data, mutual exclusion, and so on, all of which can be handled in software. We do not need to consider instruction-level parallelism, which the CPU handles by itself through techniques such as multiple instruction streams and out-of-order execution.
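As a small illustration of handling sharing and mutual exclusion in software, the sketch below protects a shared counter with a lock. The SharedCounter class and the run function are hypothetical names introduced for this example.

```python
import threading


class SharedCounter:
    """A value shared between threads, guarded by a mutex."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Only one thread at a time may execute the read-modify-write.
        with self._lock:
            self.value += 1


def run(threads=8, times=10_000):
    counter = SharedCounter()

    def work():
        for _ in range(times):
            counter.increment()

    workers = [threading.Thread(target=work) for _ in range(threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()  # synchronization point: wait for every thread to finish
    return counter.value


if __name__ == "__main__":
    print(run())
```

Without the lock, two threads could read the same old value and one increment would be lost.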
Concurrency means that at any given instant only one instruction is executing, but the instructions of several processes are interleaved so quickly that, at the macro level, multiple processes appear to run at the same time.
Server software handles a large number of requests and uses a thread pool to process them concurrently. Concurrency is when threads take turns executing on the same core. On a multicore server, however, different threads may run on different cores, which is parallelism.
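The thread pool idea can be sketched as follows: a fixed number of worker threads serves a larger number of requests, so each worker is reused for many requests. The handler, the reply format, and the pool size are assumptions made for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def handle_request(request_id):
    """Hypothetical handler: returns a reply and the name of the serving thread."""
    return f"reply-{request_id}", threading.current_thread().name


def serve(request_ids, pool_size=4):
    with ThreadPoolExecutor(max_workers=pool_size,
                            thread_name_prefix="worker") as pool:
        # map() hands requests to idle workers and keeps replies in order.
        results = list(pool.map(handle_request, request_ids))
    replies = [reply for reply, _ in results]
    threads_used = {name for _, name in results}
    return replies, threads_used


if __name__ == "__main__":
    replies, threads_used = serve(range(20))
    print(len(replies), "requests served by", len(threads_used), "threads")
```

Whether those worker threads alternate on one core or run on several cores at once is decided by the operating system scheduler, not by the program.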
The usual way for an ordinary program to take advantage of multiple cores is to break its task down into multiple threads. The more cores there are, the more threads can run at once and the faster the calculation finishes.
Chinese-Australian matrix computer
A matrix computer is a computer with hundreds or even thousands of cores that performs high-speed parallel computing.
Linus said that a computer cannot fit that many cores. In fact, a single CPU can hold many cores, and CPUs with dozens of cores already exist!
Thousand-core CPUs are already under development. When thousand-core processors first appear they may be very expensive, but later they may become as cheap as today's CPUs.
With just four thousand-core CPUs, the Chinese-Australian matrix computer has 4,000 cores. Obviously the computer will not get any bigger, and a hundred-core mobile phone would not be much bigger than a current phone.
My original design for the Chinese-Australian matrix computer used eight-core CPUs and needed 100 of them, which really would have made the computer bigger. With thousand-core CPUs, the Chinese-Australian matrix computer will not become larger.
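The core counts in the two designs can be checked with a few lines of arithmetic. The function names are hypothetical; the figures (100 eight-core CPUs, four thousand-core CPUs) are the ones given in the text.

```python
def total_cores(cpus, cores_per_cpu):
    """Cores available in a machine built from identical CPUs."""
    return cpus * cores_per_cpu


def cpus_needed(target_cores, cores_per_cpu):
    """Smallest number of CPUs that reaches the target (ceiling division)."""
    return -(-target_cores // cores_per_cpu)


if __name__ == "__main__":
    print(total_cores(100, 8))      # original eight-core design
    print(total_cores(4, 1000))     # design with thousand-core CPUs
    print(cpus_needed(4000, 8))     # eight-core CPUs needed to match it
```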
Matrix operating system
The Chinese-Australian Sinox 64-bit operating system introduces ZFS, a high-performance, high-reliability file storage system. Combined with jail, its operating-system-level virtualization, it can run hundreds of virtual operating systems on a single machine and so achieve parallel computing at the operating-system level.
ZFS combines several hard disks into a RAID array in software, so that data is not lost when one disk is damaged.
Sinox's rock-solid, highly reliable network processing can meet the needs of a high-performance matrix computer and, in turn, the needs of artificial intelligence computing.
AI requires high-speed computing
Artificial intelligence is currently developing in two directions: unmanned aerial vehicles and driverless vehicles. At present, the computing speed of ordinary computers is not enough.
Take a smart car. If it has 100 video inputs, video recognition requires a great deal of computation. A typical computer today can handle the recognition data from one camera, and would probably not cope with more than 10.
One hundred cameras would need 10 ordinary computers, and installing 10 computers in a car is obviously impossible. A thousand-core computer, however, may be able to handle video recognition for thousands of camera inputs, so high-speed computers can meet the computing needs of intelligent vehicles.
A drone's camera covers a larger area, capturing images of several square kilometres. An eagle's eyes can spot animals on the ground from thousands of meters away, which takes very strong computing power. In addition to video processing, there is radar data processing, sensor data processing, and the control system.
Parallel Computing Method
1. Multithreading Parallel Computing
Multithreading uses threads, the smallest unit that the CPU runs. In theory an unlimited number of threads can be created, but when too many threads share a CPU, switching between them costs a great deal of overhead and the calculation slows down. For this reason the operating system imposes a maximum number of threads.
A process is a running program and can contain multiple threads, so there are never more processes than threads. If the system has 1000 cores and runs fewer than 1000 threads, the threads execute in parallel. If more than 1000 threads are running, two threads end up sharing the same CPU core, which puts them in the concurrent state and requires thread switching. In practice threads may also have to wait for I/O, so a thousand cores do not necessarily make computation 1000 times faster. Memory is a further limit: a single application can only request 2 GB of address space on a 32-bit system, and the maximum non-paged memory of 64-bit Windows 7 is stated as "75% of RAM up to a maximum of GB", so memory cannot support an unlimited number of threads either.
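The point that a thousand cores do not give a thousandfold speedup can be illustrated with Amdahl's law. The law is not mentioned in the original text and is used here only as a standard approximation; the parallel fractions below are assumed values, not measurements.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Upper bound on speedup when only part of the work can run in parallel.

    parallel_fraction: share of the run time that can be spread across cores
                       (an assumed figure, between 0 and 1).
    cores:             number of cores available.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    for fraction in (1.0, 0.99, 0.95):
        print(fraction, round(amdahl_speedup(fraction, 1000), 1))
```

Under these assumptions, a task that spends even 5% of its time waiting or running serially is limited to a speedup of less than 20 on 1000 cores.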
A thousand-core computer should be configured with 2000 GB of memory, which is 2 GB of memory per core.
Running processes in parallel costs more than running threads, so it is not recommended.
2. Server software implements thread pool high-speed concurrency
At present the Apache web server cannot sustain 1000 concurrent threads, and its thread pool is limited. Even so, computation can be sped up considerably by using a server-side scripting language for concurrent calculation.
With server technology, computing tasks are sent to multiple servers for high-speed computation, much like distributed computing. Using a scripting language for this can be cumbersome, but it is workable in certain situations.
3. Parallel computing with parallel computing servers
Consider an application such as image processing with 100 cameras connected. We want to know which camera shows someone coming in or going out, and then raise an alarm.
We could handle this with a single program that starts 1000 threads, although it is unclear whether that would be stable.
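A minimal sketch of the single-program approach, with one thread per camera, might look like this. The detect_person function and the string "frames" are hypothetical stand-ins for real video recognition and real video data.

```python
import queue
import threading


def detect_person(frame):
    """Hypothetical stand-in for a real recognition algorithm."""
    return "person" in frame


def watch_camera(camera_id, frames, alarms):
    """Thread body: scan one camera's frames and report the first detection."""
    for frame in frames:
        if detect_person(frame):
            alarms.put(camera_id)  # queue.Queue is safe to share between threads
            return


def monitor(feeds):
    """feeds maps a camera id to its list of frames; returns cameras that alarmed."""
    alarms = queue.Queue()
    threads = [
        threading.Thread(target=watch_camera, args=(camera_id, frames, alarms))
        for camera_id, frames in feeds.items()
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    triggered = []
    while not alarms.empty():  # reliable here because every thread has finished
        triggered.append(alarms.get())
    return sorted(triggered)


if __name__ == "__main__":
    feeds = {i: ["empty road", "person at door"] if i % 10 == 0 else ["empty road"]
             for i in range(100)}
    print(monitor(feeds))
```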
Alternatively, a parallel server can be set up, and the data and the algorithm to apply are sent to it for computation. For example, one video input is divided into 10 parts and the algorithm is portrait recognition. The parallel server software runs the job using techniques similar to ordinary server technology, except that what is sent is not a script but the input data and the algorithm code. That code may be binary, an intermediate language, or a scripting language. This is more efficient than computation on an ordinary server, and the returned data is easy to receive because it comes straight back into the calling function's space. Since all the data sits in the memory of one machine, the transfer is fast.
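The parallel server idea can be sketched in a single process as follows. This is an assumption-laden illustration rather than the Sinox design: ParallelServer, compute, and count_portraits are hypothetical names, the "algorithm code" is simply a Python function, and a thread pool stands in for the server's workers.

```python
from concurrent.futures import ThreadPoolExecutor


class ParallelServer:
    """Hypothetical in-process stand-in for a parallel computing server."""

    def __init__(self, workers=10):
        self._pool = ThreadPoolExecutor(max_workers=workers)

    def compute(self, algorithm, data, parts=10):
        """Split data into about `parts` slices and run `algorithm` on each.

        Results come straight back to the caller, in slice order.
        """
        step = max(1, -(-len(data) // parts))  # ceiling division, at least 1
        slices = [data[i:i + step] for i in range(0, len(data), step)]
        futures = [self._pool.submit(algorithm, s) for s in slices]
        return [f.result() for f in futures]

    def close(self):
        self._pool.shutdown()


def count_portraits(frames):
    """Hypothetical algorithm: count frames labelled as containing a face."""
    return sum(1 for frame in frames if frame == "face")


if __name__ == "__main__":
    server = ParallelServer()
    video = ["face" if i % 4 == 0 else "none" for i in range(100)]
    print(sum(server.compute(count_portraits, video)))
    server.close()
```

Because the caller and the workers share one machine's memory, the slices are passed by reference rather than copied over a network.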
Parallel servers can be built on virtual operating systems. With jail technology, hundreds of virtual operating systems can run at once, each one running a parallel computing server. Each parallel computing server maintains a thread pool and can also process radar data, sensor data, and so on.
Chinese-Australian Sinox Parallel matrix computer
The Chinese-Australian Sinox parallel matrix computer, with thousands of cores running high-speed parallel computation on the Sinox operating system, will be the basis of artificial intelligence.
Although mainframes compute quickly, they are not very practical because they are too big. The Chinese-Australian matrix computer is the same size as a PC, so it can enter people's homes.
With the Chinese-Australian Sinox parallel matrix computer, the software revolution has begun.