Data analysis and mining
Baidu MTC is an industry-leading mobile application testing service platform that addresses the cost, technology, and efficiency problems developers face in mobile application testing. We also share industry-leading Baidu technology in articles written by Baidu employees and industry leaders.
1. Overview
1.1 The key to the success of a mobile app lies in marketing and product design. The core of a data analysis and mining solution is customer positioning in the marketing process and user experience improvement in the product design process. Providing the desired products and services to target users is the secret to the success of any mobile app. Finding the target customers and understanding their product requirements depends on data analysis and mining. In both customer positioning and user experience, what makes a mobile app succeed is no different from what makes any other type of product succeed.
User research can be carried out along two dimensions. Qualitative analysis discovers new things from small data samples and is mainly used in user experience surveys. Quantitative analysis tests and proves hypotheses using large data samples and is mainly used for user behavior data analysis.
1.2 Data Analysis and Mining Process Specification
Building a data analysis and mining system differs from building a traditional business operation system, and it has its own characteristics and rules. Data analysis and mining is an important part of Knowledge Discovery in Databases (KDD). KDD is the nontrivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in a dataset.
The Cross-Industry Standard Process for Data Mining (CRISP-DM) is a leading KDD process model, with an adoption rate of nearly 60%. It is a data analysis and mining process model jointly drafted by EU organizations. CRISP-DM consists of the following six phases:
1. Business Understanding:
The initial phase focuses on understanding the project objectives and requirements from a business perspective. This knowledge is then converted into a data mining problem definition and a preliminary plan for achieving the objectives.
2. Data Understanding:
The data understanding phase starts with initial data collection. It proceeds with activities that familiarize you with the data, identify data quality problems, yield first insights into the data, or detect interesting subsets that suggest hypotheses about hidden information.
3. Data Preparation:
The data preparation phase covers all activities needed to construct the final dataset from the raw data. This dataset is the input to the modeling tools. Tasks in this phase are likely to be performed multiple times and in no prescribed order. They include selecting tables, records, and attributes, as well as transforming and cleaning data for the modeling tools.
4. Data Modeling:
In this phase, various modeling techniques are selected and applied, and their parameters are tuned to optimal values. Typically, several techniques can solve the same type of data mining problem. Some techniques have specific requirements on the form of the data, so it is often necessary to step back to the data preparation phase.
5. Evaluation:
By this phase, you have built a model that appears to be of high quality from a data analysis perspective. Before the final deployment of the model, it is important to evaluate it thoroughly, review the steps used to construct it, and make sure that it achieves the business objectives. A key purpose of this phase is to determine whether any important business issue has not been sufficiently considered. At the end of this phase, a decision on the use of the data mining results should be reached.
6. Deployment:
Creating a model is generally not the end of the project. The role of a model is to extract knowledge from the data, and that knowledge needs to be organized and presented in a way the user can readily use. Depending on the requirements, this phase can be as simple as generating a report or as complex as implementing a repeatable data mining process. In many cases, the customer rather than the data analyst carries out the deployment.
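The data preparation, modeling, and evaluation phases can be pictured with a toy example. The following is only a minimal sketch under stated assumptions: the records, the field meanings (sessions per week, retained flag), and the threshold "model" are all hypothetical and are not part of CRISP-DM or of any Baidu tool.

```python
# Hypothetical raw records: (user_id, sessions_per_week, retained)
RAW = [
    ("u1", 1, False), ("u2", 7, True), ("u3", None, True),
    ("u4", 2, False), ("u5", 9, True), ("u6", 3, False),
    ("u7", 6, True), ("u8", 8, True),
]

def prepare(raw):
    """Data preparation: drop records with missing values, keep modeling attributes."""
    return [(s, r) for _, s, r in raw if s is not None]

def accuracy(threshold, data):
    """Share of records where 'sessions >= threshold' matches the retained flag."""
    hits = sum((s >= threshold) == r for s, r in data)
    return hits / len(data)

def fit_threshold(train):
    """Modeling: try each observed session count as a threshold, keep the most accurate."""
    candidates = sorted({s for s, _ in train})
    return max(candidates, key=lambda t: accuracy(t, train))

data = prepare(RAW)
train, test = data[:5], data[5:]          # simple hold-out split
threshold = fit_threshold(train)          # tuned on training data only
test_accuracy = accuracy(threshold, test) # evaluation on unseen records
print(threshold, test_accuracy)
```

In this toy run the threshold fits the training records perfectly but is wrong on half of the held-out records, which is the kind of finding the evaluation phase is meant to surface before deployment.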
2. User Behavior Data Analysis
2.1 Target
User behavior data is the information about interactions between users and the mobile app, and it belongs to the quantitative analysis dimension of user research. By analyzing user login and operation logs, you can learn how the mobile app is used and which devices and network environments users have.
2.2 Methods
User behavior data is usually obtained through data tracking. By recording detailed user operation logs, you can understand how users interact with the product, as well as the device and network environment in use when they access the mobile app. Traditional data tracking requires an enterprise to develop its own information collection and log processing programs, which incurs a certain cost and development workload. If platform differences must also be accommodated, the cost is even greater, so this approach is not suitable for emerging mobile apps. Instead, a mature statistical analysis platform can be used to analyze user behavior data.
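To make the idea of tracked operation logs concrete, here is a minimal sketch that parses a small event log and counts users per device and network type. The JSON-lines format and the field names (`user`, `event`, `device`, `network`) are assumptions made for illustration; they do not describe the log format of any particular statistics platform.

```python
import json
from collections import Counter

# Hypothetical operation log, one JSON event per line (field names are illustrative).
LOG_LINES = [
    '{"user": "u1", "event": "login", "device": "iPhone", "network": "wifi"}',
    '{"user": "u1", "event": "search", "device": "iPhone", "network": "wifi"}',
    '{"user": "u2", "event": "login", "device": "Android", "network": "4g"}',
    '{"user": "u3", "event": "login", "device": "Android", "network": "wifi"}',
    'not a json line',
]

def parse_events(lines):
    """Parse log lines, skipping any that are not valid JSON objects."""
    events = []
    for line in lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(event, dict):
            events.append(event)
    return events

def distribution(events, field):
    """Count distinct users per value of `field` (for example device or network)."""
    seen = {(e["user"], e[field]) for e in events if "user" in e and field in e}
    return Counter(value for _, value in seen)

events = parse_events(LOG_LINES)
device_users = distribution(events, "device")
network_users = distribution(events, "network")
event_counts = Counter(e["event"] for e in events if "event" in e)
print(device_users, network_users, event_counts)
```

Counting distinct users rather than raw events keeps one very active user from dominating the device and network distributions.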
2.3 Tool
The Baidu mobile statistics platform is a professional mobile app statistics and analysis tool released by Baidu. It supports the iOS and Android platforms. Developers can easily embed the statistics SDK to monitor mobile apps comprehensively, learn about product performance in real time, and gain accurate insight into user behavior.
The Baidu mobile statistics platform provides powerful statistics and analysis functions for mobile apps, including:
1. Traffic sources: channel traffic comparison and channel segmentation analysis, with accurate monitoring of data from different promotion placements and real-time channel contribution;
2. Audience insights: multidimensional analysis and presentation of user profile information, based on Baidu's massive data accumulation;
3. Terminal analysis: the device distribution at a glance (device model, brand, operating system, resolution, network type, and carrier).
Figure: the Baidu mobile statistics interface.
2.4 Output
The result of user behavior data analysis is the user profile, which is built as a user tag model. Obtaining user tag data depends mainly on data mining algorithms, and the composition of the tag system varies with the industry, the business, and the users. Because this calls for a more specialized, industry-specific user profile model, it is not discussed in detail here.
Figure: an example of user profile output.
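A user tag model can be pictured as a set of tags attached to each user. The sketch below is a rule-based stand-in, not a data mining algorithm: the summary fields, tag names, and thresholds are all hypothetical, and a real tag system would be derived from mined data and tailored to the industry and business.

```python
from dataclasses import dataclass, field

@dataclass
class BehaviorSummary:
    """Hypothetical per-user aggregates derived from operation logs."""
    user_id: str
    sessions_per_week: int
    night_session_share: float  # assumed share of sessions between 22:00 and 06:00
    device_brand: str

@dataclass
class UserProfile:
    user_id: str
    tags: set = field(default_factory=set)

def build_profile(summary: BehaviorSummary) -> UserProfile:
    """Attach tags using fixed, illustrative thresholds (not mined from data)."""
    profile = UserProfile(summary.user_id)
    if summary.sessions_per_week >= 5:
        profile.tags.add("frequent_user")
    elif summary.sessions_per_week <= 1:
        profile.tags.add("at_risk")
    if summary.night_session_share >= 0.5:
        profile.tags.add("night_owl")
    profile.tags.add(f"device:{summary.device_brand.lower()}")
    return profile

profile = build_profile(BehaviorSummary("u1", 6, 0.7, "Apple"))
print(sorted(profile.tags))
```

Keeping tags as a flat set makes it easy to add or retire tags as the tag system changes for a given business.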
3. User Experience Data Analysis
3.1 Target
For a mobile app to succeed, it must provide a good user experience in addition to meeting users' functional needs. User experience refers to how a product interacts with the outside world and functions in it, that is, how people "access" and "use" the product. User experience shapes a user's overall impression of the enterprise or product, defines how the enterprise or product differs from its competitors, and determines whether the user will come back. A high-quality user experience is an important asset of an enterprise or product. It can improve both ROI and the conversion rate.
3.2 Methods
The premise for improving the user experience is obtaining user experience data. This data can be gathered through traditional direct contact with users or through Internet-based remote online research, and the two approaches complement each other. Direct contact takes the form of user interviews and on-site surveys, which allow thorough communication and produce strong results. However, the selection of survey subjects, the communication cost, and the sample size are all limited by the time and money invested. Remote online research moves offline questions online. Through online Q&A, it saves costs and expands the sample size, which makes it a useful supplement to direct contact. The characteristics of the two approaches are as follows:
3.3 Tool
The Baidu public testing platform is an extension and typical application of the crowdsourcing model in software and product testing, developed by Baidu. It hands the testing of enterprise products over to the public in the online community. It is a crowdsourcing platform that serves Baidu's own products and also provides services to the public. The platform aims to use the public's testing capability and resources to complete high-workload product experience tasks in a short time with assured quality, and to feed the results back to the platform immediately. Platform administrators then collect the information and hand it to developers, so that they can improve product quality and user experience from the user's perspective.
The platform provides the following types of test tasks:
- Quick judgment task: usually a simple single-choice question that users can answer quickly.
- Questionnaire survey task: users complete an online questionnaire to receive the corresponding gift certificate rewards.
- Product fault finding task: users try out a new product, then submit bugs or propose improvements for it.
- Special task: enterprises can set special tasks for specific purposes, such as the ongoing idea-recruitment task from suntech education institutions.
- On-site survey task: survey subjects are recruited to take part in on-site communication.
Figure: the homepage of the Baidu public testing platform.
For more information, follow "Baidu MTC College": http://mtc.baidu.com/academy/article