General-purpose big data processing platforms
- Spark
- Flink
- Hadoop
Distributed storage
- HDFS
Resource scheduling
- YARN
- Mesos
Machine learning tools
- Mahout
- Spark MLlib
- TensorFlow (Google)
- Amazon Machine Learning
- DMTK (Microsoft Distributed Machine Learning Toolkit)
Data analysis / data warehouse (SQL-like)
- Pig
- Hive
- Kylin
- Spark SQL
- Spark DataFrame
- Impala
- Phoenix
- ELK
  - Elasticsearch
  - Logstash
  - Kibana
Message Queuing
- Kafka (log-oriented, high throughput)
- RocketMQ
- ZeroMQ
- ActiveMQ
- RabbitMQ
Stream computing
- Storm/JStorm
- Spark Streaming
- Flink
Log collection
- Scribe
- Flume
Programming languages
- Java
- Python
- R
- Ruby
- Scala
Data analysis and mining
- MATLAB
- SPSS
- SAS
Data visualization
- R
- D3.js
- ECharts
- Excel
- Python
Machine Learning
Machine Learning Basics
- Clustering
- Time series
- Recommendation system
- Regression analysis
- Text mining
- Decision Tree
- Support Vector Machine
- Bayesian classification
- Neural network
Machine learning tools
- Mahout
- Spark MLlib
- TensorFlow (Google)
- Amazon Machine Learning
- DMTK (Microsoft Distributed Machine Learning Toolkit)
Algorithms
Consistency
- Paxos
- Raft
- Gossip
Data structures
- Stacks, queues, linked lists
- Hash table
- Binary tree, red-black tree, B-tree
- Graph
Common algorithms
1. Sorting
   - Insertion sort
   - Bucket sort
   - Heap sort
2. Quick sort
3. Maximum subarray
4. Longest common subsequence
5. Minimum spanning tree and shortest path
6. Storage and operation of matrices
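As a rough illustration of two entries in the list above (maximum subarray and longest common subsequence), here is a minimal Python sketch. The function names `max_subarray` and `lcs_length` are chosen for this example only. The techniques used (Kadane's algorithm and row-by-row dynamic programming) are common textbook approaches, not something the original list prescribes.

```python
def max_subarray(nums):
    """Return the largest sum of any non-empty contiguous run (Kadane's algorithm)."""
    if not nums:
        raise ValueError("nums must be non-empty")
    best = current = nums[0]
    for x in nums[1:]:
        # Either extend the current run or start a new one at x.
        current = max(x, current + x)
        best = max(best, current)
    return best


def lcs_length(a, b):
    """Return the length of the longest common subsequence of sequences a and b."""
    # prev[j] holds the LCS length of the processed prefix of a and b[:j].
    prev = [0] * (len(b) + 1)
    for item_a in a:
        curr = [0]
        for j, item_b in enumerate(b, start=1):
            if item_a == item_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]
```

Both functions run in a single pass over their inputs: `max_subarray` is linear in the input length, and `lcs_length` takes time proportional to `len(a) * len(b)` while keeping only two rows of the table in memory.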
Cloud Computing
Cloud Services
- SaaS
- PaaS
- IaaS
- OpenStack
- Docker
Transferred from: http://www.36dsj.com/archives/4520
Source: http://www.ha97.com/5734.html
Big Data Architect Skills Atlas