Hadoop Mahout Data Mining in Practice (Algorithm Analysis, Hands-on Project, Chinese Word Segmentation)
Audience: advanced
Course length: 17 hours
Technologies used: MapReduce, parallel word segmentation, Mahout
Projects involved: Hadoop comprehensive hands-on text mining project; Mahout data mining tools
Course Introduction
This course covers the following topics:
1. Mahout Data Mining Tools
2. A comprehensive recommendation system implemented on Hadoop, combining MapReduce, Pig, and Mahout in a hands-on project
Intended Audience
1. This course suits technical staff who have basic Java knowledge, some understanding of databases and SQL, and proficiency with Linux, especially those who want to change jobs or are seeking a higher-paying career.
2. A foundation in big data technologies such as Greenplum Hadoop, Hadoop 2.0, YARN, Sqoop, Flume, Avro, and Mahout is preferred. Ideally, students have already taken the North Wind courses "Greenplum Distributed Database Development: From Introduction to Mastery", "Comprehensive and In-depth Greenplum Hadoop Big Data Analysis Platform", "Hadoop 2.0 and YARN Explained Simply", and "MapReduce and HBase Advanced".
Course Outline
Mahout Data Mining Tools (10 hours)
Data mining concepts and system components
Common methods and algorithms for data mining (regression analysis, classification, clustering, etc.)
Data Mining analysis tools
Algorithms supported by Mahout
Origin and characteristics of Mahout
Mahout installation, configuration and testing
Hands-on: K-means cluster analysis with Mahout
Mahout implementation of the Canopy algorithm
Mahout implementation of classification algorithms
Hands-on: logistic regression classification and prediction with Mahout
Hands-on: naive Bayes classification with Mahout
Concept and classification of recommendation systems
Concept, classification and application of collaborative filtering recommendation algorithm
Hands-on: implementing a Mahout-based movie recommendation system
Hadoop Comprehensive Hands-on: Text Mining Project (7 hours)
The concept of text mining and its application scenarios
Project background
Project Flow
Chinese word segmentation technology
Using the Paoding (庖丁解牛) word segmenter
Design and implementation of a parallel word segmentation program with MapReduce
Partitioning the data set with Pig
Building a naive Bayes text classifier with Mahout
Model application: calculating user preference categories
Hadoop Mahout data mining video tutorial