The previous article introduced the ARFF format, which is WEKA's own format. Generally, we need to extract or obtain data from other data sources. WEKA supports conversion from CSV or from databases. The interface is shown in figure
The WEKA installation directo
The previous few articles described several basic concepts and two basic algorithms for association rules. In commercial applications, however, writing algorithms matters less than understanding the data, getting a grip on it, and using tools well. The preceding articles were about understanding the algorithms; this article will introduce using the open source
house has been inserted.
Listing 3. Housing prices using the regression model
sellingPrice = (-26.6882 * 3198) + (7.0551 * 9669) + (43166.0767 * 5) + (42292.0901 * 1) - 21661.1208
sellingPrice = 219,328
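The arithmetic in the listing can be checked with a short Python sketch. The coefficients, input values, and intercept are copied from the equation; the variable names and the `predict` helper are illustrative assumptions, since this excerpt does not label the attributes.

```python
# Values copied from the regression equation in Listing 3.
# Names are illustrative only; the excerpt does not name the attributes.
coefficients = [-26.6882, 7.0551, 43166.0767, 42292.0901]
inputs = [3198, 9669, 5, 1]
intercept = -21661.1208


def predict(coefficients, inputs, intercept):
    """Linear model: sum of coefficient * input over all attributes, plus the intercept."""
    return sum(c * x for c, x in zip(coefficients, inputs)) + intercept


selling_price = predict(coefficients, inputs, intercept)
print(f"sellingPrice = {selling_price:,.0f}")  # prints "sellingPrice = 219,328"
```

The printed value is the same 219,328 given in the listing, rounded to the nearest whole number.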
However, looking back at the beginning of this article, we know that data mining is not just about outputting a value: it is about recognition patt
Brief introduction
In the two previous articles in the "Data mining with WEKA" series, I introduced the concept of data mining. If you haven't read Data mining with
Brief introduction
What is data mining? You will ask yourself this question from time to time, because this topic is getting more and more attention from technical circles. You may have heard that companies like Google and Yahoo! are generating billions of data points about all their users, and you wonder, "What do they want all this information for?" "Y
This is what Weka looks like when it opens.
4. Allow Weka to recognize Chinese characters. Locate the Weka installation directory (installed to "D:\Weka-3-6" in the step above) and find the file RunWeka.ini. Open this file, comment out "fileEncoding=Cp1252" with #, and enter "fileEncoding=utf-8" on the following line. Now create
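The edit to the configuration file described in step 4 can also be sketched in Python. This is a minimal illustration that works on the file's text only: the helper name `switch_encoding` is hypothetical, and the exact key spelling (`fileEncoding`) is an assumption, so the match is case-insensitive.

```python
def switch_encoding(ini_text: str, new_encoding: str = "utf-8") -> str:
    """Comment out the existing fileEncoding line and add a new one below it.

    Assumes the setting appears as a line of the form 'fileEncoding=<value>';
    the key is matched case-insensitively because its exact casing is an assumption.
    """
    result = []
    for line in ini_text.splitlines():
        if line.strip().lower().startswith("fileencoding="):
            result.append("#" + line)                          # keep the old value as a comment
            result.append(f"fileEncoding={new_encoding}")      # new setting on the next line
        else:
            result.append(line)
    return "\n".join(result)


sample = "maxheap=256m\nfileEncoding=Cp1252\nmainclass=weka.gui.GUIChooser"
print(switch_encoding(sample))
```

To apply it to a real installation, read the ini file into a string, pass it through the function, and write the result back after making a backup copy.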
I personally think that diving straight into data mining algorithms and WEKA is too hasty. I learned data mining methods directly from the beginning, and some of them were difficult and boring. What I often thought about was not the method itself, but "What is this?".
After WEKA
Brief introduction
In "Data mining with WEKA, Part 1: Introduction and regression," I introduced the concept of data mining and the free, open source software Waikato Environment for Knowledge Analysis (WEKA), which can be used to min
Today I would like to introduce a book, "Data Mining R language combat." Data mining is one of the most critical technologies in the era of big data, and its application fields and prospects are vast. R is a very good statistical analysis
process statistics, analyze and visualize data. Through various examples, the reader can learn the core algorithms of machine learning and apply them to strategic tasks such as classification, prediction, and recommendation. In addition, they can be used to implement more advanced features, such as summarization and simplification. I've read part of this book before, but the internship in
1. The data analysis (Douban) book is quite simple. The basic content is covered, and it is clear. At the end it also discusses R, which is a plus. Difficulty level: very easy.
2. Beer and diapers (Douban) are the most typical cases. Difficulty level: very easy.
3. The beauty of data (Douban) An introductory book, each chapter solve
(Foreword) Yesterday I said I would write about a machine learning book, so here is one today. This book is mainly for beginners: it is very basic and suitable for sophomores and juniors. Of course, it also applies if you are a senior or beyond and have not yet studied machine learning. Whether you are studying intelligence or doing other things, machine learning is a must. You see, even the GFW uses machine learning,
Trust me, you'll like it.
This is a book for learning basic data mining knowledge. Most books on data mining focus on theoretical knowledge, which is difficult to understand and daunting. Don't get me wrong, this theoretical knowledge is still very important. But
model parameters are adjusted to the optimal values. Generally, several techniques can solve the same kind of data mining problem. Some techniques have special requirements for the form of the data, so it is often necessary to jump back to the data preparation phase.
Evaluation
At this stage, you have developed a hi
The idea of teaching myself machine learning really came from my interest in data mining, because deep down I have always believed that there is a certain pattern behind everything, and that different situations only correspond to certain conditions. Finding such a pattern is therefore the most convenient and quickest way to solve a class of problems. As a lazy person, of course, I would lik
spatial attributes, such as location or region. There are many examples of spatial data, such as meteorological data collected from different places. An important feature of spatial data is spatial autocorrelation; that is, objects that are physically close tend to be similar in other respects as well.
Processing non-record data
Most d
other words, a good anomaly detector must have a high detection rate and a low false positive rate. Applications include network attack detection and fraud detection.
Answers to exercises in Data Mining
Introduction
This book comprehensively introduces the theories and methods of data
very important. It allows you to develop and extend new mining algorithms. In this regard, WEKA (idmer: practically the representative of open-source data mining software) provides comprehensive documentation of its Java functions and class libraries, which makes it well suited to extension. Of course, you must first fully understa
H. Witten/Eibe Frank's practical machine learning technology for data mining.
"Programming Collective Intelligence" is suitable for programmers who want to learn about data mining technology. This book describes many practical algorithms in
Read "Data Mining Technology (third edition)"-Thoughts on marketing, sales and customer relationship management
This book is not purely a data mining theory book, as you can probably guess from the subtitle of this