Computing tasks fall into three broad categories:
1. Large computing workload and small data volume
2. Large data volume, relatively simple computing
3. Large data volume and large computing workload
Common workloads include:
1. Log analysis: PB level
2. Offline analysis and business intelligence: heavy data volume, TB level
3. Investigative analysis: response speed matters, less than a GB of data
4. Financial computing: Monte Carlo algorithms, large computing workload
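As an illustration of the "large computing workload, small data volume" category, here is a minimal Monte Carlo sketch. It estimates pi rather than pricing a financial instrument, purely to keep the example short; the function name `estimate_pi` and its parameters are assumptions made for this illustration.

```python
import random


def estimate_pi(samples: int, seed: int = 0) -> float:
    """Estimate pi by sampling random points in the unit square."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    inside = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        # Count points that land inside the quarter circle of radius 1.
        if x * x + y * y <= 1.0:
            inside += 1
    # Quarter-circle area / square area = pi / 4.
    return 4.0 * inside / samples


if __name__ == "__main__":
    print(estimate_pi(100_000))
```

The input is a single integer, yet the work grows linearly with the sample count, which is why this kind of task is limited by computing power rather than by I/O.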
Common distributed computing frameworks:
1. Hadoop: a MapReduce framework built around a distributed file system. It handles large data volumes well, at the cost of high latency and high I/O overhead.
2. GridGain: a distributed computing framework built around an in-memory database. It suits compute-heavy work, with low latency and low I/O overhead.
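The MapReduce model mentioned above can be sketched in a single process. This is not the Hadoop API; `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are hypothetical names chosen for this illustration, and a real framework would run the phases on many machines with the data stored in a distributed file system.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(line: str) -> List[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse the values of one key into a single result."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["a b a", "b c"]))
```

In a distributed setting the shuffle step moves data between nodes, which is one source of the high I/O overhead noted above.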
There are three types of computing structures:
1. SMP
2. NUMA
3. Distributed Computing
Going from SMP to NUMA to distributed computing, latency increases in exchange for greater total computing capacity. I/O is the main constraint and the first problem to address in computing.
There are four steps to solve the problem:
1. Single thread
2. Parallelization
3. Distributed
4. Platform-based
The main purpose of parallelization is to break through the computing limit of a single core.
The main purpose of distribution is to break through the computing limit of a single machine.
The main purpose of platformization is to break through the limit of serving a single purpose.
Fundamental challenges of parallelism:
1. Task splitting
2. Task Scheduling
Both are mainly questions of algorithm logic.
Main problems of parallelism:
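The two challenges above, task splitting and task scheduling, can be sketched as follows. The helper names `split`, `sum_of_squares`, and `parallel_sum_of_squares` are assumptions made for this example. A thread pool is used to keep the sketch simple and portable; for CPU-bound work in CPython, a `ProcessPoolExecutor` would typically be needed to actually use more than one core.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def split(items: List[int], n_chunks: int) -> List[List[int]]:
    """Task splitting: cut the input into at most n_chunks contiguous slices."""
    size = max(1, -(-len(items) // n_chunks))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def sum_of_squares(chunk: List[int]) -> int:
    """The unit of work executed independently for each chunk."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(items: List[int], workers: int = 4) -> int:
    chunks = split(items, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Task scheduling: the pool assigns chunks to idle workers.
        partials = pool.map(sum_of_squares, chunks)
        # Combine the partial results into the final answer.
        return sum(partials)


if __name__ == "__main__":
    print(parallel_sum_of_squares(list(range(10))))
```

The splitting only works because the partial sums are independent and can be combined afterwards, which is an algorithmic property of the problem rather than of the hardware.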
1. Resource Competition
2. Data isolation
3. Data visibility
4. Starvation, deadlock, and livelock
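Resource competition, the first item in the list above, can be illustrated with a shared counter. This is a minimal sketch; the `Counter` class and the `run` helper are hypothetical names introduced for the example. The lock serializes the read-modify-write on `value` so that concurrent increments are not lost.

```python
import threading


class Counter:
    """A counter that is safe to increment from several threads."""

    def __init__(self) -> None:
        self.value = 0
        self._lock = threading.Lock()

    def increment(self) -> None:
        # Only one thread at a time may execute the read-modify-write.
        with self._lock:
            self.value += 1


def run(n_threads: int = 8, per_thread: int = 10_000) -> int:
    counter = Counter()

    def work() -> None:
        for _ in range(per_thread):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait until every worker has finished
    return counter.value


if __name__ == "__main__":
    print(run())
```

Holding a single lock for a very short critical section, as here, avoids deadlock; deadlock becomes a risk once threads acquire several locks in different orders.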
Fundamental challenges of distribution:
1. High latency between computing nodes
2. No central management role, such as an OS, once work is distributed
Once parallelization has solved the algorithmic problems, distribution mainly serves to overcome physical limits.
Main problems of distribution:
1. Deadlock and starvation are more likely to occur
2. Topology Management
3. Heterogeneous environments
4. Fault Tolerance Mechanism
5. Distributed Load Balancing
6. Storage Capability sharing
7. Computing capability sharing
8. Code deployment and preparation
9. Cluster Monitoring and Management
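Two of the items above, fault tolerance and load balancing, can be sketched together in a toy dispatcher. Everything here is hypothetical: nodes are simulated as plain Python callables, and `NodeDown`, `make_node`, and `dispatch` are names invented for this illustration, not part of any real framework.

```python
import itertools
from typing import Callable, List


class NodeDown(Exception):
    """Raised by a simulated node that cannot accept work."""


def make_node(name: str, healthy: bool = True) -> Callable[[str], str]:
    """Build a simulated node; an unhealthy node always fails."""
    def run(task: str) -> str:
        if not healthy:
            raise NodeDown(name)
        return f"{name}:{task}"
    return run


def dispatch(tasks: List[str], nodes: List[Callable[[str], str]]) -> List[str]:
    """Round-robin tasks across nodes, failing over to the next node on error."""
    results = []
    start = itertools.cycle(range(len(nodes)))
    for task in tasks:
        first = next(start)  # load balancing: rotate the preferred node
        for offset in range(len(nodes)):
            node = nodes[(first + offset) % len(nodes)]
            try:
                results.append(node(task))
                break
            except NodeDown:
                continue  # fault tolerance: try the next node
        else:
            raise RuntimeError(f"no healthy node for task {task!r}")
    return results


if __name__ == "__main__":
    cluster = [make_node("a"), make_node("b", healthy=False), make_node("c")]
    print(dispatch(["t1", "t2", "t3"], cluster))
```

A real cluster would also need failure detection, retry limits, and a way to handle tasks that already ran partially, none of which this sketch attempts.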
Fundamental challenges of platformization: business and organizational politics.
Main problems of platformization:
1. Unified Computing Abstraction
2. Unified Data abstraction
3. Heterogeneous Data Processing
4. Business priority assurance
From the implementation perspective, three layers of problems need to be considered:
1. Computing process
2. Multi-host computing
3. Single-host computing