Big Data Is So Capricious, Season 1: Data Structures and Algorithms (front-line experience, authoritative references, up-to-date knowledge, practical focus, full source code)

Last Update:2016-04-12 Source: Internet

Author: User

This is a foundational course for big data engineers and cloud computing engineers, and one that every computer professional should master.

Without a solid grasp of data structures and algorithms, you will find it hard to master efficient, professional processing tools, and harder still to handle complex big data processing scenarios.

Consider the following questions:

1. On social networking sites (such as Weibo and Facebook), the relationships between people form a huge volume of data. How would you model and process it?

2. What does a database index do? Why are indexes organized with data structures such as hashes, B+ trees, and heap tables?

3. Why does the Linux virtual memory management module use red-black trees to look up VMAs?

4. Why can search engines return results in milliseconds?

5. How would you design a city's roads so that the whole city is connected at minimum cost?
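
The road-design question is the minimum spanning tree (MST) problem, which the course outline covers with Kruskal's algorithm and union-find. The following is a minimal sketch of that idea, not the course's own code: the class name `RoadNetworkSketch`, the `Road` type, and the method `minimumCost` are hypothetical names chosen for illustration, and intersections are assumed to be numbered from 0 to n-1.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical illustration; none of these names come from the course material.
class RoadNetworkSketch {

    // One candidate road between two intersections, with a construction cost.
    static final class Road {
        final int from;
        final int to;
        final long cost;

        Road(int from, int to, long cost) {
            this.from = from;
            this.to = to;
            this.cost = cost;
        }
    }

    // Union-find lookup: parent[i] == i marks the root of a component.
    private static int find(int[] parent, int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]]; // path halving keeps the trees shallow
            x = parent[x];
        }
        return x;
    }

    // Kruskal's algorithm: take roads from cheapest to most expensive and keep
    // a road only if it joins two components that are not yet connected.
    // Returns the total cost, or -1 if the roads cannot connect every intersection.
    static long minimumCost(int intersections, Road[] roads) {
        Road[] sorted = roads.clone();
        Arrays.sort(sorted, Comparator.comparingLong((Road r) -> r.cost));

        int[] parent = new int[intersections];
        for (int i = 0; i < intersections; i++) {
            parent[i] = i;
        }

        long total = 0;
        int used = 0;
        for (Road r : sorted) {
            int a = find(parent, r.from);
            int b = find(parent, r.to);
            if (a != b) {          // the road links two separate components
                parent[a] = b;     // merge them
                total += r.cost;
                used++;
            }
        }
        return used == intersections - 1 ? total : -1;
    }
}
```

Sorting dominates the running time, so the sketch runs in roughly O(E log E) for E candidate roads. A spanning tree of n intersections always has exactly n-1 roads, which is why the final check can detect a disconnected city.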

If these questions still puzzle you, or if your answers only sound plausible, then this course is for you.

This course will not only answer the questions above; it will also explain:

1. Why HBase uses Bloom filters to test quickly whether a file or block can contain the requested data.

2. Why ZooKeeper uses the concepts of trees and nodes to describe dependencies and coordination in distributed systems.

3. Why LevelDB uses skip lists and the LSM tree structure to optimize performance.
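
To make the Bloom filter idea concrete, here is a minimal sketch of the data structure. It is not HBase's implementation: the class name `BloomFilterSketch`, the method names, the two seed constants, and the FNV-1a-style hash with double hashing are all assumptions made for illustration. The property it demonstrates is the general one: a Bloom filter may report a false positive, but it never reports a false negative.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

// Hypothetical illustration; this is not how HBase implements its filters.
class BloomFilterSketch {
    private static final int SEED_A = 0x811c9dc5; // assumed starting values
    private static final int SEED_B = 0x01000193;

    private final BitSet bits;
    private final int size;      // number of bits in the filter
    private final int hashCount; // number of bit positions probed per key

    BloomFilterSketch(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    // FNV-1a-style mixing, started from a caller-supplied value.
    private static int hash(byte[] data, int seed) {
        int h = seed;
        for (byte b : data) {
            h ^= (b & 0xff);
            h *= 16777619;
        }
        return h;
    }

    // Double hashing: the i-th probe position is (h1 + i * h2) mod size.
    private int position(int h1, int h2, int i) {
        return Math.floorMod(h1 + i * h2, size);
    }

    // Set every probe bit for the key.
    void add(String key) {
        byte[] data = key.getBytes(StandardCharsets.UTF_8);
        int h1 = hash(data, SEED_A);
        int h2 = hash(data, SEED_B);
        for (int i = 0; i < hashCount; i++) {
            bits.set(position(h1, h2, i));
        }
    }

    // false: the key was definitely never added.
    // true: the key was probably added (false positives are possible).
    boolean mightContain(String key) {
        byte[] data = key.getBytes(StandardCharsets.UTF_8);
        int h1 = hash(data, SEED_A);
        int h2 = hash(data, SEED_B);
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(position(h1, h2, i))) {
                return false;
            }
        }
        return true;
    }
}
```

A storage engine can consult such a filter before doing expensive I/O: when `mightContain` returns false, the read can be skipped entirely, and when it returns true, the real lookup still has to confirm the answer. More bits and a well-chosen number of probes lower the false-positive rate at the cost of memory.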

In addition, many of the classical ideas in data structures and algorithms are well worth understanding and useful for those who have a strong interest in the computer industry.

I. Course development environment

Operating system: Linux CentOS 7

IDE: IntelliJ IDEA 14

Main references: Algorithms, 4th Edition (Sedgewick and Wayne, Princeton), English edition; Introduction to Algorithms, 3rd Edition, English edition

Other references: Linux kernel source code, JDK source code, English Wikipedia, etc.

Implementation language: Java

II. Course overview

The importance of data structures and algorithms in computer science and IT is self-evident.

It is a foundational subject that computer professionals should master, and a skill in which practitioners working with databases and data processing should be proficient.

University data structures courses are often too theoretical, short on practice, and built on dated material and cases. This course is designed for big data engineers and cloud computing engineers and has the following features:

1. Emphasis on engineering application: mathematical notation is avoided where possible, and where it is needed, notation with clear meaning is chosen and explained in detail.

2. Each data structure is introduced from real engineering needs, using practical and successful use cases (operating systems, databases, big data processing frameworks, microblogs, and so on) to show how it is used and where its value lies, so that students can put the knowledge to work.

3. For hard-to-understand algorithms and especially important ideas such as recursion and divide and conquer, the course uses step-by-step slide illustrations, slide sketches, pseudocode, commented source code, and single-step debugging and tracing of the source, so that students can understand, master, and apply the algorithms.

4. To keep the cited knowledge authoritative and to reflect day-to-day research and development at real big data companies, the reference materials are mainly well-regarded international English-language books, papers, and blogs by senior engineers or original authors, accompanied by Chinese explanations, so that students learn from the most professional sources available.

5. Full source code is provided throughout. Because students' proficiency may vary, the popular language Java is used to describe and write all the code, so that every student can read it and learn from it.

III. Main content of the course:

1. Overview of data structures and algorithms

2. Arrays, linked lists, queues, stacks, and other linear lists

3. Binary trees, BSTs, AVL trees, and recursive and non-recursive binary tree traversal

4. B+ trees

5. Skip lists

6. Graphs, graph storage, graph traversal

7. Graphs, lazy and eager Prim's algorithm, Kruskal's algorithm and MST, the single-source shortest path problem and Dijkstra's algorithm

8. Union-find (disjoint sets), indexed priority queues, binary heaps

9. Introduction to genetic algorithms and the TSP

10. Internal sorting algorithms (straight insertion, selection, Shell, heap, quicksort, merge, etc.) and their optimization in practice

11. External sorting and its optimization (file encoding, data encoding, I/O modes and JVM features, multithreading, multiway merging, etc.)

12. Hash tables, tries, inverted indexes, introduction to distributed indexes (MapReduce)

Lecturer Hao:

He studied at Zhong Ke and CAS, and is familiar with the development, architecture, design, and optimization of server-side systems, distributed systems, and big data processing frameworks.

Senior development engineer and big data engineer.

I. Introduction

1st: What is a data structure?

2nd: What is an algorithm?

II. Linear lists

3rd: Linear lists (arrays, linked lists, queues, stacks)

4th: Linux workqueues and the JDK thread pool

III. Trees

5th: Nonlinear structures, trees, binary trees

6th: Balanced trees, AVL trees

7th: B+ trees and database indexes

IV. Graphs

8th: Graph concepts and storage

9th: Graph traversal

10th: Minimum spanning tree (MST), Prim's algorithm, Kruskal's algorithm

11th: Single-source shortest paths and Dijkstra's algorithm

12th: Approximate solution of the TSP with a genetic algorithm

V. Sorting

13th: Selection sort, insertion sort, Shell sort

14th: Heap sort, priority queues

15th: Quicksort and its optimization

16th: Merge sort and its optimization

17th: Merge sort and external sorting

18th: Optimization and extension of external sorting

VI. Searching

19th: Hash tables, binary search, tries, ternary search trees, search engines and inverted indexes, centralized and distributed indexes, introduction to MapReduce

1. Master the data structures and algorithms used in practical data processing

2. Develop a data processing mindset

3. Develop the ability to implement algorithms

4. Gain perspective on the role and value of data structures and algorithms in operating systems, the Internet, databases, and massive-data processing scenarios

5. Put knowledge into practice: learn to use data structures, algorithms, and related knowledge to analyze and solve practical problems

6. Lay the foundation for a deep, comprehensive, and solid command of big data processing technology


This article is an English translation of an article originally published in Chinese on aliyun.com and is provided for information purposes only.
