* The content of this article was provided jointly by Ferry, Sword Rainbow, and the author, and was summarized by the author.
Taobao (TB) has recently added user feedback portals to key products and shopping processes (Figure 1) and has set up official forums to collect user comments and suggestions, so a large amount of plain-text data is being collected. How should this data be analyzed to guide product improvement? This article describes the theoretical basis and practical application of content analysis for network text data such as user feedback. It covers a brief introduction to the method, followed by the steps and techniques for applying it in an enterprise.
Figure 1: Taobao user feedback portal
PART1 Theoretical basis: A brief introduction to content analysis
Content analysis is a method that uses explicit coding rules to convert large amounts of text into quantitative data and assign it to categories, in order to analyze the characteristics of the information. It has three features:
1) Objectivity. Content analysis follows a standardized research process and is not swayed by subjective prejudice; the researcher stays open to whatever results emerge;
2) Systematic procedure. Every stage of content analysis (sampling, analysis, coding, etc.) follows unified, standard rules and procedures;
3) Combination of qualitative and quantitative analysis. Content analysis uses qualitative research to identify the essential characteristics of the content, and also converts the text into quantitative data, so results can be expressed as frequencies, percentages, correlation coefficients, and so on.
This kind of research does not require direct contact with the research subjects. It is widely used in social sciences such as communication studies, information science, and pedagogy, though each field uses it for a different purpose. Given the type of application described in this article, the purpose of content analysis is best defined from the viewpoint of information science: to understand essential facts and trends by analyzing content, and to reveal hidden problems. In other words, in user feedback analysis, content analysis is used to classify the feedback systematically, organize the problems, and evaluate their severity, so that improvements can be carried out.
The basic process of content analysis is shown in Figure 2. Academic research also includes the steps of training coders (steps 5, 6) and measuring inter-coder reliability (steps 6, 7). In an enterprise, constraints on time and other research costs mean these two steps can be carried out selectively. Each step is detailed in PART2.
Figure 2: Content analysis flowchart
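The article does not say which measure is used for reliability between coders. As a minimal sketch, assuming two coders have coded the same analysis units and that Cohen's kappa (a common chance-corrected agreement measure, chosen here only for illustration) is acceptable, the check could look like this. The category labels and data are made up.

```python
from collections import Counter


def cohens_kappa(codes_a, codes_b):
    """Chance-corrected agreement between two coders on the same units."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must code the same non-empty set of units")
    n = len(codes_a)
    # Share of units on which the two coders chose the same category.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Agreement expected by chance, from each coder's category frequencies.
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both coders used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codes assigned by two coders to the same six analysis units.
coder_a = ["system", "feature", "feature", "interaction", "system", "other"]
coder_b = ["system", "feature", "interaction", "interaction", "system", "other"]
kappa = cohens_kappa(coder_a, coder_b)
print(f"kappa = {kappa:.2f}")
```

A low value would suggest that the code definitions need clarifying before the full data set is coded.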
PART2 Practical application: Steps and Techniques
As mentioned above, in enterprise user research, content analysis is used to process textual user feedback, and the ultimate aim is to drive improvement by summarizing problems scientifically and effectively. Applying content analysis in this setting therefore has the following characteristics:
1) The coder (the user researcher) needs a good understanding of the features of the product/project;
2) The sampling time points depend on the characteristics of the product/project, for example whether a significant improvement has been released;
3) The classification and the dimensions for follow-up data analysis are established jointly by the researcher and key members of the project team, such as the product manager.
Following the process described above, the user feedback analysis for the week after the promotion of Wang Pu is used as an example to introduce the practical steps of content analysis in enterprise user research.
"Step 1: Accurately define the objectives and scope of the analysis"
Analysis objective: collect and summarize user experience problems and user opinions
Analysis scope:
Subject area: user feedback on a product or a process
Time range: for example, the week after a revision, the week after a launch, and so on. Note that during product events (such as a release or revision), more problems are collected and user feedback is clearly polarized. During major system events (such as downtime), the feedback is more concentrated and consists mostly of complaints.
In the user research report, this is presented as a short statement, such as: a summary of user feedback in the week following the promotion of Wang Pu
"Step 2: Determine the sample"
This involves decisions in the following three areas:
1) Content source: where to sample from (community "gangs"? questionnaires? ..., etc.)
2) Time: which days within the analysis period to sample (the whole period? every other day? every other week?)
3) Content: random or equal-interval (systematic) sampling of the content
This requires the researcher to understand the product cycle well. In this case, the sample is all the data submitted by users through the "suggestions" portal (Figure 3) within one week (7 days) after the promotion of Wang Pu.
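When the feedback volume is too large to code in full, the two content sampling options named above can be sketched as follows. This is only an illustration: the function names, the fixed seed, and the placeholder feedback list are assumptions, not part of the original study, which used all submissions from the week.

```python
import random


def random_sample(units, k, seed=0):
    """Simple random sample of k units without replacement."""
    rng = random.Random(seed)  # fixed seed so the sample can be reproduced
    return rng.sample(units, min(k, len(units)))


def systematic_sample(units, step, start=0):
    """Equal-interval sample: every `step`-th unit, beginning at index `start`."""
    if step < 1:
        raise ValueError("step must be at least 1")
    return units[start::step]


# Placeholder feedback items standing in for real submissions.
feedback = [f"feedback {i}" for i in range(1, 21)]
picked_random = random_sample(feedback, 5)
picked_systematic = systematic_sample(feedback, step=4)
print(picked_random)
print(picked_systematic)
```

Equal-interval sampling keeps the sample spread evenly over the submission order, which matters if the feedback is sorted by time.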
Figure 3: Wang Pu user feedback collection entrance
"Step 3: Determine the unit of analysis"
The unit of analysis is the smallest element in content analysis and needs a clear operational definition. In this case, the unit of analysis is one opinion stated by a user. Note that this is not the same as one submission: a user may submit several opinions at once, and these need to be separated, each becoming its own unit of analysis. As shown in Figure 4, "a module cannot be added back after it is deleted" and "after publishing, it goes back to the decoration page" are each one unit of analysis.
Figure 4: Example of an analysis unit
"Step 4: Establish the classification"
Classification is the key step of content analysis and determines the validity of the subsequent quantitative analysis. Categories are the basic units of content analysis, and each analysis unit must be assigned to one and only one category. In academic research, categories are often derived from theory or from past research results. In an enterprise, the researcher should understand the product well and develop the classification framework together with the product manager, because this makes it easier to assign an owner to each problem.
Exhaustiveness and mutual exclusivity are the two major principles of classification. The categories should cover the full range of possible issues, but because the content cannot be fully anticipated beforehand, it is advisable to set up an "other" category to achieve exhaustiveness. If ≥10% of the analysis units are classified as "other", the classification is improper. Categories must not overlap: if a unit of analysis can be placed in two or more categories, either the classification is improper or the unit of analysis is not accurately defined.
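The two checks just described can be run mechanically on a test batch. The sketch below is an illustration under stated assumptions: the data structure (unit ID mapped to the list of categories a coder felt it could belong to), the function name, and the sample data are hypothetical; only the 10% threshold comes from the text.

```python
OTHER_LIMIT = 0.10  # threshold from the text: 10% or more in "other" is improper


def check_classification(assignments, other_label="other"):
    """Check a test batch against the exhaustiveness and exclusivity rules.

    assignments maps a unit ID to the list of categories the coder
    considered applicable to that unit.
    """
    total = len(assignments)
    other_count = sum(
        1 for cats in assignments.values() if list(cats) == [other_label]
    )
    # Units that fit more than one category violate mutual exclusivity.
    overlapping = sorted(
        unit for unit, cats in assignments.items() if len(set(cats)) > 1
    )
    other_share = other_count / total if total else 0.0
    return {
        "other_share": other_share,
        "other_too_large": other_share >= OTHER_LIMIT,
        "overlapping_units": overlapping,
    }


# Hypothetical test batch of ten analysis units.
assignments = {
    1: ["system"], 2: ["system"], 3: ["feature"], 4: ["feature"],
    5: ["interaction"], 6: ["system"], 7: ["feature"], 8: ["other"],
    9: ["system", "interaction"], 10: ["feature"],
}
report = check_classification(assignments)
print(report)
```

Any unit listed under `overlapping_units` is a prompt to revisit either the category definitions or how that unit was split.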
How many categories are appropriate? With too many categories, some will contain too few analysis units to be statistically meaningful; with too few, different kinds of analysis units are grouped together, which may obscure significant differences. Since merging categories is easier than splitting them, it is better to start with more rather than fewer. When in doubt, test the scheme on 50-100 analysis units.
In this case the classification is shown in Figure 5.
Figure 5: Classification and description of user feedback on Wang Pu
The classification here is organized by who solves the problem (system problems by developers, functional requirements by the product manager, interaction problems by designers) to make improvement easier. For products/processes with a limited number of pages or functions, classification by page or function is also possible. This example also classifies by function and page, mainly in the second-level codes; see below for details.
"Step 5: Make the coding table and test the codes"
The coding table refines the top-level classification so that problems can be pinpointed more precisely. This step should be as meticulous and comprehensive as possible to ensure effective data analysis. The meaning of each code should be explained, especially when there is more than one coder. In this case, part of the coding for the top-level category "functional requirements" is shown in Figure 6; its principle is to cover all functions as far as possible. For system problems and interaction experience problems, however, the sub-codes follow the characteristics of the content and are set along other dimensions.
Figure 6: Example of the coding table
Once the coding table is made, 50-100 analysis units (for example, 100 pieces of feedback) are extracted to test whether the codes are sufficient. Of course, during the actual coding (Step 6) the codes may still turn out to be insufficient. The only option then is to add the missing code and re-code all analysis units under that category. Fortunately, and this is one of the advantages of content analysis, an error found at this stage can be corrected without affecting the accuracy and integrity of the data itself. This is unlike questionnaires or field experiments, where an error discovered after the study has run cannot be fixed and the data can only be discarded.
"Step 6: Collect the data and carry out the coding"
Data collection means extracting data according to the previously defined scope and sampling criteria. The data is typically imported into Excel for processing. In this example, each user opinion (that is, each unit of analysis) corresponds to one row (Figure 7).
Figure 7: Raw data sample
Coding of each analysis unit can then begin; this is a purely manual process. Some text analysis software can assist, but such software, which is based on word segmentation, is not intelligent enough, and the cluster analysis it performs requires a large sample (at least tens of thousands of items), so it is better suited to classifying massive volumes of text.
The coding process can be aided by Excel's text filtering. For example, the author mostly used the "contains keyword XX" filter (Figure 8), which effectively improves coding efficiency. During this process, meaningless analysis units should be cleaned out (in this case, some users posted ads for their own shops in the feedback).
Figure 8: Example of a text filtering feature
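The same "contains keyword" idea can be sketched outside Excel. The snippet below is not the author's workflow, only a rough equivalent: the keyword-to-code mapping, the code labels, and the sample rows are all hypothetical, and the suggestions are meant to be confirmed by a human coder rather than accepted automatically.

```python
# Hypothetical mapping from a keyword to a suggested code.
KEYWORD_CODES = {
    "publish": "system: publishing",
    "slow": "system: speed",
    "module": "feature: modules",
}


def suggest_codes(text, keyword_codes=KEYWORD_CODES):
    """Return the codes whose keyword appears in the text (case-insensitive)."""
    lowered = text.lower()
    return [code for keyword, code in keyword_codes.items() if keyword in lowered]


# Made-up analysis units, one per row as in the Excel sheet.
rows = [
    "Publishing failed three times today",
    "The decoration page is very slow",
    "A deleted module cannot be added back",
    "Welcome to visit my shop for discounts",
]
suggestions = {row: suggest_codes(row) for row in rows}
for row, codes in suggestions.items():
    # Rows with no suggestion still need manual review; some may be ads to clean out.
    print(row, "->", codes or "no match, review manually")
```

Plain substring matching is crude (it cannot tell "slow" from "slower than expected but fine"), which is why the final code for each unit stays a manual decision.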
The preliminary coding results are shown in the following figure; the work then moves into the analysis and reporting phase.
Figure 9: Sample coding results
"Step 7: Analyze and report the results"
The analysis consists of two parts. The quantitative part uses descriptive statistics to calculate the frequency and percentage of entries in each category and each code. If the data can be combined with other variables, such as the user's star level, more complex analyses become possible: cross-tabulation, chi-square tests, t-tests, and analysis of variance. The qualitative part goes through all the entries under a given sub-code and sums up the problem points.
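As a minimal sketch of the quantitative part, the snippet below computes a frequency/percentage table and the Pearson chi-square statistic for a cross-tabulation. All numbers are invented for illustration, the function names are hypothetical, and the statistic is computed by hand rather than with a statistics package; turning it into a p-value would still require a chi-square table or library.

```python
from collections import Counter


def frequency_table(codes):
    """Return {code: (count, percentage)} ordered by descending count."""
    counts = Counter(codes)
    total = sum(counts.values())
    return {code: (n, 100.0 * n / total) for code, n in counts.most_common()}


def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table given as rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Invented coding results for ten analysis units.
codes = ["system"] * 6 + ["feature"] * 3 + ["interaction"] * 1
freq = frequency_table(codes)
for code, (count, pct) in freq.items():
    print(f"{code}: {count} ({pct:.0f}%)")

# Invented cross-tabulation: rows are two user groups (for example, low and
# high star level), columns are two categories of feedback.
crosstab = [[10, 20], [20, 10]]
chi2 = chi_square_statistic(crosstab)
print(f"chi-square = {chi2:.2f}")
```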
Note that frequency and percentage are not the only indicators of severity. For example, a large number of user opinions fell under the "system problems" category, but they essentially raised only two points: publishing failed and the system was slow. The total number of opinions under "functional requirements" was relatively small, but they covered more than 10 kinds of problems. Combining qualitative and quantitative analysis therefore reflects both the contribution of each category to the overall experience (satisfaction or dissatisfaction) and how concentrated or dispersed its problems are. Examples are as follows:
Figure 10: Feedback Summary Example
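One way to see volume and dispersion side by side is to count, for each category, both its share of all opinions and the number of distinct problem points under it. The sketch below assumes each coded unit is a (category, problem point) pair; the structure, names, and data are hypothetical.

```python
from collections import defaultdict


def summarize(coded_units):
    """Per category: share of all units and number of distinct problem points."""
    counts = defaultdict(int)
    points = defaultdict(set)
    for category, problem_point in coded_units:
        counts[category] += 1
        points[category].add(problem_point)
    total = len(coded_units)
    return {
        category: {
            "share": counts[category] / total,
            "distinct_problems": len(points[category]),
        }
        for category in counts
    }


# Invented data: many system complaints about two points, fewer feature
# requests spread over four different points.
coded_units = (
    [("system", "publishing failed")] * 4
    + [("system", "slow")] * 2
    + [("feature", "batch edit"), ("feature", "more templates"),
       ("feature", "undo delete"), ("feature", "custom colors")]
)
summary = summarize(coded_units)
for category, stats in summary.items():
    print(category, stats)
```

A category with a high share and few distinct problems is concentrated; one with a lower share but many distinct problems is dispersed.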
Further analysis also includes inferring the problems hidden behind surface phenomena. For example, the author found the problem shown in Figure 11.
Figure 11: Deep analysis sample
In an enterprise, the output format of a content analysis report is flexible; the key is to communicate the problems effectively and push them toward a solution. In particular, the user feedback itself has more impact than the user researcher's summary and is more likely to evoke empathy.
In this case, the author's output includes:
1) An Excel table of the original data, with the problems placed in separate sheets by category (Figure 12). In each sheet, the filter function can be used to look at quotes from user feedback on specific aspects. Note that the codes are converted to text descriptions for easier understanding.
Figure 12: Excel output sample
2) A user feedback summary PPT that lists the problems by category, used to present to the project team and explore solutions. When stating a problem, quote a typical user. For example, if 10 people report the same problem, choose the exact words of the clearest or most emotive one.
Summary
This article introduced content analysis and its application in enterprise user research. The advantages of content analysis are: 1) compared with experiments, questionnaires, and similar methods, it saves manpower and material resources; 2) the content can be analyzed repeatedly, so reliability is higher; 3) it requires no direct contact with the research subjects; 4) if problems arise during the process, errors are easy to correct. Its disadvantages are: 1) human factors (such as how to classify and code) greatly affect the results; 2) coding takes a long time and is monotonous.
SOURCE Address: Http://piglili.blogbus.co ... 60772605.html