Use Excel for data mining (4) ---- highlight abnormal values
After configuring the environment, you can use Excel for
As market competition intensifies, China Telecom faces more and more pressure, and customer churn keeps increasing. According to the statistics, the number of fixed-line PHS this year has exceeded the number of accounts. In such a grim market, the urgent task is to make every effort to reduce customer loss. It is therefore necessary to establish a set of models that can predict the customer churn rate in time by using data
Business Intelligence products: data mining focuses on four types of problems: classification, clustering, association, and prediction (each of the four is explained in detail later), while conventional data analysis focuses on other data analysis problems, such as descriptive statistics
Spatial Data
Multimedia Data
For example, image data
Description-based retrieval system: keywords, titles, dimensions, etc.
Content-based retrieval system: color composition, texture, shape, object and wavelet transformation.
Time series data and sequence data
Trend Analysis
1. Define the mining target
Understand the users' real needs and determine the target of data mining by studying the relevant industry and becoming familiar with its background knowledge, so that the model achieves the desired results once it is built.
2. Data acquisition and processing of clear
The previous article introduced association rule mining with Weka, the open source data mining software. Weka is convenient and practical, but it cannot handle large data sets: once the data no longer fits in memory, giving it more time does not help. Distributed computing is needed, and Mahout is a Hadoop-based
This is my first contact with data mining. Having read the experts' articles with admiration, I hope to add my own understanding
What are clustering, classification, and regression?
Article 1: Common data mining methods (classification, regression, clustering, association rules, etc.), with a brief conceptual interpretatio
Among the various data mining algorithms, association rule mining is an important one. Driven especially by market basket analysis, association rules are applied in many real businesses. This article gives a short summary of association rule mining. First, like clustering algorithms, association rule
1 What is data mining?
The most commonly accepted definition of "data mining" is the discovery of "models" for data.
1.1 Statistical modeling
Statisticians were the first to use the term "data min
I. Data mining
Data mining is an advanced process that uses computer and information technology to extract the useful knowledge implicit in large, incomplete data sets. Web data mining
Transactions are divided by user login, identified by shell + IP + hostname (a login where all three match is treated as the same user). On this basis, frequent pattern mining of the command sequences that users enter is implemented. 2. Basic principle of the mining algorithm
The FP-growth algorithm mainly finds frequent itemsets, that is, sets of items whose number of occurrences across multiple sets reaches a certain threshold. An FP tree is a compressed representation of input
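To make the idea of a compressed representation concrete, here is a minimal sketch of FP tree construction. The class name `FPNode`, the function names, the `min_count` parameter, and the sample transactions are all illustrative assumptions, not taken from the original article.

```python
from collections import Counter


class FPNode:
    """One node of the FP tree: an item, its count, and its children."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_count):
    """Build an FP tree and return (root, counts of the frequent items)."""
    # Pass 1: count each item once per transaction, keep the frequent ones.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in counts.items() if c >= min_count}

    # Pass 2: insert each transaction with its frequent items ordered by
    # descending count (ties broken by name) so shared prefixes merge.
    root = FPNode(None)
    for t in transactions:
        items = sorted((i for i in set(t) if i in frequent),
                       key=lambda i: (-frequent[i], i))
        node = root
        for item in items:
            if item not in node.children:
                node.children[item] = FPNode(item, node)
            node = node.children[item]
            node.count += 1
    return root, frequent


def count_nodes(node):
    """Number of item nodes below (and excluding) the given node."""
    return sum(1 + count_nodes(child) for child in node.children.values())


# Hypothetical sample data, for illustration only.
transactions = [
    ["bread", "milk"],
    ["bread", "milk", "eggs"],
    ["bread", "eggs"],
    ["milk", "tea"],
]
root, frequent = build_fp_tree(transactions, min_count=2)
print(frequent, count_nodes(root))
```

In this sample, the eight frequent item occurrences are stored in five tree nodes, because transactions that start with the same items share a path.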
Purpose of collecting web logs
Web log mining refers to applying data mining techniques to analyze and process the log data generated while site users access a web server, in order to discover web users' access patterns and interests. Such information is potentially useful for site construction and unde
only 1. So the count of a conditional pattern base is determined by the minimum count of the nodes on the path. From the conditional pattern base we can build the conditional FP tree for that item, for example i5: From the conditional FP tree we can enumerate all combinations to obtain the mined frequent patterns (the item itself, such as i5, is also counted in here; each frequent pattern mined for an item mus
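The step from a conditional pattern base to frequent patterns can be sketched as follows. This is a brute-force illustration, not the recursive FP-growth procedure: it enumerates combinations directly instead of building a conditional FP tree. The function name `mine_suffix`, the prefix paths, their counts, and the items `i1`, `i2`, `i3` are assumptions made for the example; only `i5` appears in the text above.

```python
from collections import Counter
from itertools import combinations


def mine_suffix(suffix, pattern_base, min_count):
    """Return {frozenset(pattern): count} for patterns that end in `suffix`.

    `pattern_base` is a list of (prefix_path, count) pairs, where the count
    is the count of the suffix node at the end of that path.
    """
    # Weighted item counts: every prefix path contributes its own count.
    item_counts = Counter()
    for path, count in pattern_base:
        for item in path:
            item_counts[item] += count
    # Items below the threshold would not appear in the conditional FP tree.
    kept = sorted(i for i, c in item_counts.items() if c >= min_count)

    patterns = {}
    for size in range(1, len(kept) + 1):
        for combo in combinations(kept, size):
            # Support = total count of the prefix paths containing the combo.
            support = sum(count for path, count in pattern_base
                          if set(combo) <= set(path))
            if support >= min_count:
                # Every mined pattern includes the suffix item itself.
                patterns[frozenset(combo) | {suffix}] = support
    return patterns


# Hypothetical conditional pattern base for i5: two prefix paths, count 1 each.
pattern_base_i5 = [(["i2", "i1"], 1), (["i2", "i1", "i3"], 1)]
patterns = mine_suffix("i5", pattern_base_i5, min_count=2)
for pattern, count in patterns.items():
    print(sorted(pattern), count)
```

With these assumed values, `i3` falls below the threshold and is dropped, so every surviving pattern combines `i5` with `i1`, `i2`, or both.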
I plan to write up the basic concepts and algorithms of data mining, including the common algorithms for association rule mining, classification, and clustering, so stay tuned. Today we cover the most basic knowledge of association rule mining.
Association rules minin
First:
Data type
Different attributes of an object are described by different data types, such as age --> int; birthday --> date. Data of different types must be treated differently in data mining.
Second:
rule algorithm---Apriori
First, a few technical terms.
Mining dataset: the collection of data to be mined. This one is easy to understand.
Frequent patterns: patterns that occur frequently in the mining dataset, such as itemsets, substructures, and subsequences. How should this be understood? In short, mining data
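As a rough illustration of how Apriori finds frequent itemsets in a mining dataset, here is a minimal level-wise sketch. The function name `apriori`, the `min_count` threshold, and the sample transactions are assumptions for the example, not taken from the original article.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Return {frozenset(itemset): count} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]
    items = {item for t in transactions for item in t}
    candidates = {frozenset([item]) for item in items}
    frequent = {}
    k = 1
    while candidates:
        # Count how many transactions contain each candidate itemset.
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all of its subsets with one
        # item fewer are frequent (the Apriori property).
        candidates = {c for c in joined
                      if all(frozenset(s) in level
                             for s in combinations(c, k - 1))}
    return frequent


# Hypothetical mining dataset, for illustration only.
transactions = [
    ["bread", "milk"],
    ["bread", "milk", "eggs"],
    ["bread", "eggs"],
    ["milk", "tea"],
]
result = apriori(transactions, min_count=2)
for itemset, count in sorted(result.items(),
                             key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

The prune step is what separates Apriori from plain enumeration: the three-item candidate here is discarded without being counted, because one of its two-item subsets is not frequent.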
I. Concepts
Association rule mining: discovering interesting, frequent patterns, associations, and correlations among the itemsets in large amounts of data, such as the food database and relational database.
Measures of how interesting an association rule is: support, confidence
k-itemset: a set of k items
Frequency of the item set: number of transactions that contain the item set
Frequent Item Se
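The measures listed above can be computed directly from a list of transactions. The sketch below assumes the common definitions, which the excerpt does not spell out: support is the fraction of transactions that contain the itemset, and the confidence of a rule A => B is the frequency of A and B together divided by the frequency of A. The function names and sample data are illustrative.

```python
def frequency(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    wanted = set(itemset)
    return sum(1 for t in transactions if wanted <= set(t))


def support(itemset, transactions):
    """Fraction of transactions that contain the itemset (assumed definition)."""
    return frequency(itemset, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Confidence of the rule antecedent => consequent (assumed definition)."""
    both = frequency(set(antecedent) | set(consequent), transactions)
    return both / frequency(antecedent, transactions)


# Hypothetical transactions, for illustration only.
transactions = [
    ["bread", "milk"],
    ["bread", "milk", "eggs"],
    ["bread", "eggs"],
    ["milk", "tea"],
]
print(frequency({"bread", "milk"}, transactions))
print(support({"bread", "milk"}, transactions))
print(confidence({"bread"}, {"milk"}, transactions))
```

`confidence` divides by the frequency of the antecedent, so it raises `ZeroDivisionError` for an antecedent that never occurs; a real implementation would need to decide how to handle that case.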
The previous few articles described several basic concepts and two basic algorithms for association rules. In commercial applications, however, writing algorithms matters less than understanding the data, getting a grip on the data, and using tools. The preceding articles built an understanding of the algorithms; this article will introduce how to use the open source
With the advent of the big data age, the importance of data mining has become apparent. Several simple data mining algorithms, as the lowest tier, are briefly summarized here using the Microsoft data case library. Appl