After tinkering with web crawlers and some other interesting things, I have recently been using R to pick up some basic machine learning. My main reference is the book "Machine Learning: Practical Case Analysis".
This is one of the rare books that teaches machine learning purely in R. It covers 11 cases across 12 chapters, and both the author's commentary and the code go fairly deep. Perhaps because the book is a few years old, its data processing relies mostly on the plyr package. In my own use, the dplyr package works better, so much of the data-processing code can be rewritten more concisely. The ideas, though, are the real substance.
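To illustrate the plyr versus dplyr point, here is a minimal sketch of the same grouped summary written both ways. The data frame and its column names (state, duration_min) are made up for illustration and are not from the book; the sketch assumes both packages are installed.

```r
library(dplyr)  # attached for the pipe and verbs; plyr is called via plyr:: to avoid masking

# Toy data, invented for this example
sightings <- data.frame(
  state = c("CA", "CA", "WA", "WA", "WA"),
  duration_min = c(5, 15, 2, 4, 6),
  stringsAsFactors = FALSE
)

# plyr style: split by state, apply summarise to each piece, combine
by_state_plyr <- plyr::ddply(
  sightings, "state", plyr::summarise,
  n = length(duration_min),
  mean_duration = mean(duration_min)
)

# dplyr style: the same summary as a readable pipeline
by_state_dplyr <- sightings %>%
  group_by(state) %>%
  summarise(n = n(), mean_duration = mean(duration_min))

print(by_state_plyr)
print(by_state_dplyr)
```

Both versions produce one row per state. The dplyr pipeline reads top to bottom, which is the main reason I find it easier to work with.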
I had worked through the first three chapters and the first two cases earlier, on a long-distance train. But the further I read, the more complex and obscure the material became, and the more time it took to digest. So I am pausing here to sort out the first two cases and digest their structure and key points.
The case data and code from the book can be downloaded from the official GitHub repository: https://github.com/johnmyleswhite/ML_for_Hackers
Case 1: UFO sightings in the United States
This case uses a data set of more than 60,000 UFO eyewitness records and reports. The question to answer is whether UFO sightings follow periodic patterns over time or regional patterns across locations. The work is mostly data cleaning.
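The kind of cleaning involved can be sketched in base R as follows. This is a simplified illustration under my own assumptions, not the book's code: the records are invented, and the column names (DateOccurred, Location) and the "YYYYMMDD" and "City, ST" formats are assumed for the example.

```r
# Toy records standing in for the real data; all values are made up
ufo <- data.frame(
  DateOccurred = c("19950704", "19950715", "0000", "19960101", "19950820", "19950801"),
  Location = c("Seattle, WA", "Austin, TX", "Boise, ID", "Seattle, WA", "Toronto, ON", "Somewhere"),
  stringsAsFactors = FALSE
)

# 1. Drop rows whose date string is not exactly 8 digits (YYYYMMDD)
ufo <- ufo[grepl("^[0-9]{8}$", ufo$DateOccurred), ]

# 2. Parse the remaining strings into Date objects
ufo$DateOccurred <- as.Date(ufo$DateOccurred, format = "%Y%m%d")

# 3. Split "City, ST" and keep the state part; malformed locations become NA
parts <- strsplit(ufo$Location, ",\\s*")
ufo$State <- vapply(
  parts,
  function(p) if (length(p) == 2) toupper(p[2]) else NA_character_,
  character(1)
)

# 4. Keep only rows whose state is a US state abbreviation (state.abb is built into R)
ufo <- ufo[ufo$State %in% state.abb, ]

# 5. Count sightings per year-month and state, the basis for spotting
#    periodic (by month) and regional (by state) patterns
ufo$YearMonth <- format(ufo$DateOccurred, "%Y-%m")
counts <- as.data.frame(
  table(YearMonth = ufo$YearMonth, State = ufo$State),
  responseName = "Sightings"
)
print(counts)
```

The resulting table has one row per month and state combination, including zero counts, so it can be plotted directly as a time series per state.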
After working through it, I drew the following flowchart:
Case 2: Binary classification to identify spam
This case uses email from SpamAssassin, divided into three types: spam, easy ham (normal mail that is easy to identify), and hard ham (normal mail that is hard to identify). The goal is to build a classifier that can quickly identify the type of a message from word-frequency features, such as how often a term like "html" appears.
A naive Bayes classifier is used.
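The idea can be sketched in a few lines of base R. This is a minimal multinomial naive Bayes with add-one (Laplace) smoothing on invented messages, meant only to show the mechanics; it is not the book's implementation, and the helper names (tokenize, word_log_probs, classify) are my own.

```r
# Toy training messages, invented for this example
train <- data.frame(
  text = c("cheap pills buy now", "buy cheap html offer",
           "meeting notes attached", "lunch meeting tomorrow"),
  label = c("spam", "spam", "ham", "ham"),
  stringsAsFactors = FALSE
)

# Lowercase a message and split it into words
tokenize <- function(text) {
  words <- strsplit(tolower(text), "[^a-z]+")[[1]]
  words[nzchar(words)]
}

vocab <- unique(unlist(lapply(train$text, tokenize)))

# Log P(word | class) with add-one smoothing over the vocabulary
word_log_probs <- function(texts) {
  words <- unlist(lapply(texts, tokenize))
  counts <- as.numeric(table(factor(words, levels = vocab)))
  setNames(log((counts + 1) / (sum(counts) + length(vocab))), vocab)
}

classes <- unique(train$label)
log_prior <- sapply(classes, function(k) log(mean(train$label == k)))
log_lik <- lapply(setNames(classes, classes),
                  function(k) word_log_probs(train$text[train$label == k]))

# Score each class as log prior + sum of word log-likelihoods; pick the best
classify <- function(text) {
  words <- tokenize(text)
  words <- words[words %in% vocab]  # ignore words never seen in training
  scores <- sapply(classes, function(k) log_prior[[k]] + sum(log_lik[[k]][words]))
  names(which.max(scores))
}

print(classify("buy cheap pills"))
print(classify("meeting tomorrow"))
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied together, which matters on real emails with hundreds of terms.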
The flowchart and notes are below:
The flowcharts were drawn with Visio 2013. I like its hand-drawn style of flowchart. I had wanted to try other flowchart software, but after comparing them, Visio is still the easiest to use...
Next month's goals
1) Financial time series
2) Machine learning, chapters 4-7
R language learning notes: machine learning, chapters 1-3