1. Hadoop, Hive, Sqoop, Spark, Storm, ODPS, Dremel, HBase (Hadoop and Spark are the most important)
2. Oracle and MySQL back-end development, as well as massive-scale data processing and high-concurrency request handling
3. Familiarity with Linux and with shell, Python, or other languages
4. Data mining in the Internet industry
5. Distributed, multi-threaded, and high-performance design and coding, plus performance tuning (important)
6. Familiarity with basic Internet protocols (such as TCP/IP and HTTP) and their applications
7. Design patterns, transaction processing, caching frameworks, search engines, task debugging, Web Services, HTTP, image servers
8. Familiarity with the relevant machine learning algorithms, including Bayesian methods, random trees, and neural networks, plus a solid grasp of data structures, including tree and graph computations
9. Familiarity with full-text search technologies such as Elasticsearch and Lucene
What big data engineers need to know