Application of cluster analysis in user classification
Source: Internet
Author: User
Keywords: clustering, variables
What is cluster analysis?
Cluster analysis is an exploratory data analysis method. We generally use it to group and classify seemingly unordered objects so that we can better understand the objects under study. A good clustering result has high similarity among objects within a group and low similarity between groups. In user research, many problems can be addressed with cluster analysis, such as classifying the information on a website, analyzing click behavior on web pages, and classifying users. Of these, user classification is the most common.
What is the basic process of clustering analysis?
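The process has four steps: selecting the clustering variables, running the cluster analysis, identifying the important characteristics of each type of user, and interpreting and naming the clusters.
|| Selecting clustering variables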
When designing the questionnaire, we start from certain assumptions and try to select variables that influence how the product is used. These variables generally cover user attitudes, opinions, and behaviors that are closely related to the product. However, cluster analysis places some requirements on the variables used for clustering:
These variables differ significantly in value across the research objects;
These variables are not highly correlated with one another.
The reasons are as follows. First, more clustering variables is not necessarily better: a variable that shows no significant difference across objects contributes nothing meaningful to the clustering and may bias the result. Second, including highly correlated variables effectively gives them extra weight, which amplifies the role of a single factor in the user classification.
Methods for identifying appropriate clustering variables:
Run a cluster analysis on the variables themselves and select one representative variable from each resulting group;
Run principal component analysis or factor analysis and use the new variables it produces as the clustering variables.
|| Cluster analysis
Compared with the preparation beforehand, actually running the clustering is remarkably simple. Once the data is ready, load it into statistical software (usually SPSS), run the analysis, and the results come out.
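To give a feel for what the software does in this step, here is a small stand-in sketch of hierarchical (agglomerative) clustering in plain Python. It is not what SPSS runs internally: the average-linkage rule, the Euclidean distance, the function name `agglomerate`, and the sample points are all assumptions chosen to keep the example short.

```python
from itertools import combinations
from math import dist


def agglomerate(points, n_clusters):
    """Naive average-linkage hierarchical clustering.

    Returns the final clusters (lists of row indices) and the distance at
    which each merge happened, in merge order.
    """
    clusters = [[i] for i in range(len(points))]
    merge_distances = []

    def linkage(a, b):
        # Average pairwise Euclidean distance between two clusters.
        return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the two closest clusters and merge them.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        merge_distances.append(linkage(clusters[a], clusters[b]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe because a < b
    return clusters, merge_distances


# Six hypothetical users described by two standardized variables.
users = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
clusters, merge_distances = agglomerate(users, n_clusters=2)
print(clusters)
print(merge_distances)
```

Each pass merges the two closest groups, so the recorded merge distances grow as increasingly different groups are forced together. That growing sequence is what the choice of the number of categories is based on.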
One problem that comes up here is how many categories the users should be divided into. In general, you can combine several criteria to decide:
Look at the inflection point (hierarchical clustering outputs a plot of the agglomeration coefficients, as in the figure on the right; generally choose a number of categories near the inflection point);
Judge by experience or by product characteristics (user differences vary from product to product);
Check whether the resulting categories can be explained clearly and logically.
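The inflection-point criterion can be approximated numerically by looking for the largest jump between consecutive agglomeration coefficients. The sketch below is only a rough heuristic for that single criterion; the function name `suggest_cluster_count` and the coefficient values are made up for illustration, and the list is assumed to contain at least two merges.

```python
def suggest_cluster_count(coefficients, n_objects):
    """Suggest a cluster count from the largest jump in agglomeration coefficients.

    coefficients[k] is the coefficient of the k-th merge (0-based), so
    n_objects - k clusters exist just before that merge.
    """
    jumps = [later - earlier for earlier, later in zip(coefficients, coefficients[1:])]
    k = jumps.index(max(jumps)) + 1  # index of the merge that caused the largest jump
    return n_objects - k


# Hypothetical coefficients for 8 objects (7 merges), in merge order.
coefficients = [0.5, 0.7, 1.0, 1.2, 1.6, 5.8, 9.1]
suggested = suggest_cluster_count(coefficients, n_objects=8)
print(suggested)
```

With these numbers the coefficient jumps sharply at the sixth merge, which suggests stopping just before it, at three clusters. The result is a starting point only: it still needs to be weighed against experience with the product and against whether the categories can be explained.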
|| Identify the important features of various types of users
Once a classification scheme is chosen, we go back and look at how each type of user performs on each variable. Based on the results of a difference test, we use color coding to mark each user type's level on a given indicator. As shown below, red represents "far above average", yellow represents "average", and blue represents "far below average". The other variables are treated the same way. In the end, we find the important characteristics that distinguish each category of users from the others.
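A simplified version of this profiling step can be sketched as follows. Instead of a formal difference test, the sketch compares each cluster's mean with the overall mean in units of the overall standard deviation; that substitution, the function name `profile_clusters`, the `threshold` value, and the sample data are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def profile_clusters(rows, labels, threshold=0.5):
    """Label each cluster's mean on each variable relative to the overall mean.

    rows is a list of dicts (variable name -> value); labels gives each row's cluster.
    threshold is an illustrative cut-off in standard deviations (an assumption).
    """
    variables = list(rows[0])
    profile = {}
    for cluster in sorted(set(labels)):
        members = [row for row, label in zip(rows, labels) if label == cluster]
        profile[cluster] = {}
        for var in variables:
            overall = [row[var] for row in rows]
            spread = pstdev(overall)
            # Standardized gap between the cluster mean and the overall mean.
            gap = mean(r[var] for r in members) - mean(overall)
            z = gap / spread if spread else 0.0
            if z > threshold:
                profile[cluster][var] = "far above average"  # red in the article's chart
            elif z < -threshold:
                profile[cluster][var] = "far below average"  # blue
            else:
                profile[cluster][var] = "average"  # yellow
    return profile


# Four hypothetical users already assigned to clusters A and B.
rows = [
    {"price_sensitivity": 5, "visits_per_week": 3},
    {"price_sensitivity": 4, "visits_per_week": 4},
    {"price_sensitivity": 1, "visits_per_week": 3},
    {"price_sensitivity": 2, "visits_per_week": 4},
]
labels = ["A", "A", "B", "B"]
profile = profile_clusters(rows, labels)
for cluster, levels in profile.items():
    print(cluster, levels)
```

In this toy data the two clusters differ on `price_sensitivity` but not on `visits_per_week`, so price sensitivity is the characteristic that separates them.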
|| Cluster interpretation & naming
When understanding and interpreting the user classification, it is best to bring in more data, such as demographic data and functional preference data (pictured below). Finally, name each category after its most distinctive characteristics, and you are done!