Making the data stand up: reliability and validity in user research


In user research, how to make our data and conclusions more persuasive is an important question. I have recently collated the notes on reliability and validity that I accumulated in my own research work, and I list them here in the hope that they help.

The quality of the investigation depends on the reliability and validity of the investigation.

Reliability refers to the consistency and stability of measurement results, that is, whether the data and conclusions reflect the users' true and stable views. When users answer questions they are easily influenced by the environment, the time and their mood at that moment, so their answers may deviate from what they really think. This deviation is random error.

Reliability measures how strongly this random error affects the views that users report.

Validity is the extent to which you measure what you want to measure.

For a product, the methods we use most are user interviews, questionnaire surveys and usability tests, and each of them raises questions of reliability and validity.

Validity and reliability in user interviews

1. Do not limit interviews to users

Every product project is shaped by the market environment, company strategy, technical capability, platform specifications and current trends. Requirements for a product may come from users, product managers, engineers, interaction designers and visual designers. People in different roles look at the product from different angles and focus on different things. Interviewing several roles helps us find the full set of requirements without omissions, so we should understand their needs in advance. This makes the research more targeted, comprehensive and useful. How useful and how comprehensive the research is are important components of validity.

2. Choose interviewees carefully

The number of users in the initial in-depth interviews is usually small, so the selection criteria must be well chosen if the feedback is to be comprehensive, reasonable and useful.

Take a piece of software for the Android platform as an example.

First, both novice and experienced Android users are needed. Experienced users can tell us about the habitual operations of Android users, the platform's characteristics, and the opinions and suggestions they have built up over long-term use. Novice users better reveal where the platform is hard to learn, so that our design can help reduce learning costs.

Second, non-Android users are also needed, to learn indirectly why they do not use Android. This gives the product a direction for reaching more potential users.

Third, demographic attributes (education, occupation, gender, age) should be covered as fully as possible. Users with different attributes differ in what they value, so their needs differ too.

Finally, include users of competing products. From their evaluations we can extract the strengths and weaknesses of those products, which points to ways of making our own product more competitive.

3. Include experts

Experts are important carriers of information. Professor Li Le describes three types of expert: user experts, manufacturing experts and marketing experts. He gives the following criteria for judging whether a person is an expert:

(1) is proficient in using the product;
(2) can compare it with comparable products;
(3) easily integrates relevant new knowledge into their own knowledge structure;
(4) has 10 years of professional experience;
(5) has accumulated a great deal of experience and has a knack for applying it;
(6) understands the relevant history (product design history, technology development history, and so on);
(7) pays attention to product development trends;
(8) has a relatively long chain of knowledge or thought, so that on any topic they can bring up a lot of relevant information;
(9) can make recommendations for improvement or innovation, and the quality of their solutions shows in solving complex problems by simple methods.

For Internet products, the relevant experts are user experts, development experts, design experts and product experts. With their extensive experience and high level of professionalism, they have a comprehensive grasp of similar products in the industry, of development and design patterns, and of history and trends. They can give us a great deal of advice we would not have thought of. This helps safeguard the user research process and is especially valuable for the construct validity of the later questionnaire.

Reliability and validity in questionnaire survey and analysis

To work more efficiently, questionnaire surveys are often run online, which makes reliability and validity problems more likely.

I have recently seen a number of satisfaction surveys that combine rating scales with a structural equation model (SEM). Let's look at where reliability and validity problems may arise.

1. Theoretical model support

SEM is a confirmatory technique: it tests a model rather than exploring for a new one. The whole set of causal hypotheses must therefore rest on strong theoretical support and a strict logical framework, covering the hypothesised relationships among variables, the choice of indicators, and even the wording of the measurement items. If the fitted model does not match the structure of the theoretical model, it is not persuasive. For example, when the ACSI model is used as the theoretical model of satisfaction, is the questionnaire really designed around perceived quality, perceived value and customer expectations?

2. Ensure an adequate sample size

In principle, the larger the sample in a sampling survey the better. When target users are scarce, however, it is enough to meet certain conditions: the minimum sample size can be calculated from the confidence level, the acceptable sampling error and the actual situation. A common formula is:

$n = \frac{Z^2 \sigma^2}{d^2}$ (n is the sample size, Z is the z-score for the chosen confidence level, d is the acceptable sampling error, and $\sigma$ is the standard deviation, often taken as 0.5)
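As a worked illustration of the formula above, here is a minimal Python sketch. It assumes that Z is the two-sided z-score for the confidence level (about 1.96 for 95%); the function name and default values are illustrative and not from the original article.

```python
import math
from statistics import NormalDist


def min_sample_size(confidence: float = 0.95, sigma: float = 0.5, margin: float = 0.05) -> int:
    """Minimum sample size from n = Z^2 * sigma^2 / d^2, rounded up."""
    # Two-sided z-score for the chosen confidence level (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z ** 2) * (sigma ** 2) / (margin ** 2))


print(min_sample_size(0.95, 0.5, 0.05))  # 385
print(min_sample_size(0.95, 0.5, 0.03))  # 1068
```

With a 95% confidence level and a standard deviation of 0.5, tightening the acceptable error from 5% to 3% nearly triples the required sample.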

Structural equation models, however, need large samples. SEM involves many variables whose relationships are complex and intertwined; a small sample makes the model unstable and can cause convergence failures, which in turn distort the parameter estimates. Zhu et al. [1] point out that with fewer than 100 samples almost every structural equation analysis is unstable, and that a sample only counts as medium-sized once it exceeds 200. For a stable model, a sample below 200 is discouraged. Some scholars tie the minimum sample size to the number of variables in the model and suggest at least 10 times as many samples as variables, a frequently cited rule. The more variables in the model, the larger the sample required.
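The thresholds in this paragraph can be combined into a quick pre-check. The sketch below is only one possible reading of those rules of thumb: the function name and the returned messages are hypothetical, and "variables" is taken to mean whatever count the 10-times rule is applied to in your model.

```python
def sem_sample_check(n_samples: int, n_variables: int) -> str:
    """Apply the rules of thumb from the text: below 100 is unstable,
    below 200 is discouraged, and n should be at least 10x the variables."""
    required = max(200, 10 * n_variables)
    if n_samples < 100:
        return "unstable: below 100"
    if n_samples < required:
        return f"too small: need at least {required}"
    return "ok"


print(sem_sample_check(150, 12))  # too small: need at least 200
print(sem_sample_check(220, 25))  # too small: need at least 250
print(sem_sample_check(300, 25))  # ok
```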

3. Principles the variables must follow

A. The functional relationships among the variables in an SEM model must be linear; otherwise the path coefficients cannot be estimated by regression.

B. When maximum likelihood estimation is used, the variables must follow a multivariate normal distribution, which requires each indicator to be normally distributed; otherwise the indicators must first be transformed toward normality.

C. Multicollinearity among the variables should be low; otherwise the path coefficients will carry large errors.

D. While building an SEM, the model is revised repeatedly to improve it. For example, during factor analysis an item whose factor loading is too small is often simply deleted. But if, in the finished model, some variables have 4 to 5 measurement items and others only 1 or 2, we need to ask whether a variable with only two items is fully explained, and whether those two items really capture it. If not, validity is hard to guarantee even when KMO, Bartlett's test and the factor loadings all pass. The questionnaire should therefore be piloted repeatedly before the formal survey and its questions revised, rather than having items deleted arbitrarily afterwards. As a student I made exactly this mistake in a Taobao satisfaction survey. In the "interactivity" part of the model, interactivity was measured by four variables, one of which, "two-way communication", was designed with 5 measurement items. To get the factor analysis to pass, I simply removed the three items with relatively small loadings (customer service, forum and Amoy Lake). The data then passed the reliability and validity tests, but the two remaining items, Ali and message board, cannot possibly explain "two-way communication" on their own.
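Points B and C can be screened for before any model is fitted. The following is a rough sketch under stated assumptions: the cut-offs (absolute skewness above 2, absolute pairwise correlation above 0.85) are illustrative rules of thumb rather than values from the article, the item names and scores are made up, and pairwise correlation is only a simple proxy for multicollinearity.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def skewness(x):
    """Population skewness: third central moment over variance^1.5."""
    m = mean(x)
    m2 = sum((a - m) ** 2 for a in x) / len(x)
    m3 = sum((a - m) ** 3 for a in x) / len(x)
    return m3 / m2 ** 1.5


def screen(indicators, max_abs_skew=2.0, max_abs_r=0.85):
    """Return (strongly skewed items, highly correlated item pairs)."""
    skewed = [name for name, v in indicators.items()
              if abs(skewness(v)) > max_abs_skew]
    collinear = [(a, b) for a, b in combinations(indicators, 2)
                 if abs(pearson_r(indicators[a], indicators[b])) > max_abs_r]
    return skewed, collinear


# Hypothetical 7-point scores from eight respondents.
data = {
    "q1": [1, 2, 3, 4, 4, 5, 6, 7],
    "q2": [2, 2, 3, 4, 5, 5, 6, 7],  # almost a copy of q1
    "q3": [4, 1, 6, 2, 7, 3, 5, 4],
    "q4": [1, 1, 1, 1, 1, 1, 1, 7],  # heavily skewed
}
print(screen(data))  # (['q4'], [('q1', 'q2')])
```

Items flagged here are candidates for transformation or for rewording in the next pilot round, not for automatic deletion.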

4. Data quality is the foundation

For the model structure to be stable and valid, we must first ensure data quality and test the reliability of the questionnaire.

A. Consistency over time

When designing the questionnaire, the same question can be put to the same person twice. If the two sets of answers are inconsistent, with a correlation coefficient (Pearson's r) below 0.7, the reliability of the questionnaire is questionable.

If the sample is large enough, it can be split in two (each half must still be large enough) and a model built on each half. Comparing the parameters of the two models tests the model's stability and applicability; if they differ too much, the model itself is problematic.
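The split itself is easy to sketch. The example below only covers the random split and the comparison of two sets of already estimated parameters; fitting the SEM is left to whatever tool you use. The path names and coefficient values are hypothetical.

```python
import random


def split_sample(responses, seed=0):
    """Randomly split the responses into two halves of (almost) equal size."""
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    shuffled = list(responses)
    rng.shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


def max_param_gap(params_a, params_b):
    """Largest absolute difference between matching parameters of two models."""
    return max(abs(params_a[k] - params_b[k]) for k in params_a)


half_a, half_b = split_sample(range(400))

# Hypothetical path coefficients estimated separately on each half.
model_a = {"quality->satisfaction": 0.62, "value->satisfaction": 0.31}
model_b = {"quality->satisfaction": 0.58, "value->satisfaction": 0.35}
print(len(half_a), len(half_b), round(max_param_gap(model_a, model_b), 2))
```

How large a gap counts as "too much" is a judgment call that the article does not quantify.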

B. Consistency across different forms

Equivalent-form reliability is checked by using two questionnaires whose content is equivalent but whose wording differs, and measuring their agreement, for example with the gamma coefficient.

C. Internal consistency

Related questions in the questionnaire serve the same measurement target, so they should be logically consistent, that is, homogeneous. First measure the correlation between each item and the total (item-total correlation), then measure the homogeneity of the questions under the same variable, choosing the method to suit the question format. For Likert scales, use Cronbach's alpha. In basic research the reliability should be at least 0.80 to be acceptable; in exploratory research 0.70 is acceptable; 0.70 to 0.98 counts as high reliability and below 0.35 as low reliability. For true/false questions, use the Kuder-Richardson coefficient. When testing internal consistency, also check whether any answer scale is reversed. If two questions both ask "Are you satisfied with the product?", but on one 7 means satisfied and 1 dissatisfied while on the other 1 means satisfied and 7 dissatisfied, reliability will suffer. Such items should be recoded to a common direction beforehand.
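To make the effect of a reversed scale concrete, here is a small sketch of Cronbach's alpha using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The three items and the scores are invented for illustration; `q3_raw` is assumed to be worded in the opposite direction on a 7-point scale.

```python
from statistics import variance


def reverse_score(values, low=1, high=7):
    """Flip a reversed item so that a high score means the same as on other items."""
    return [low + high - v for v in values]


def cronbach_alpha(items):
    """items: list of item columns, each holding one score per respondent."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    item_var = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var / variance(totals))


# Hypothetical answers from six respondents.
q1 = [6, 5, 7, 3, 4, 6]
q2 = [6, 4, 7, 2, 4, 5]
q3_raw = [2, 2, 1, 4, 4, 3]  # 1 = satisfied, 7 = dissatisfied

alpha_raw = cronbach_alpha([q1, q2, q3_raw])
alpha_fixed = cronbach_alpha([q1, q2, reverse_score(q3_raw)])
print(f"before recoding: {alpha_raw:.2f}, after recoding: {alpha_fixed:.2f}")
```

On this toy data alpha is negative before recoding and above 0.9 afterwards, which is why reversed items should be recoded before the reliability test is run.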

5. Look further ahead

The conclusions of a questionnaire should not only address current problems and needs; they should also have some predictive value. The market keeps changing, and today's target users are not necessarily the target users of the future (or of the next version): their income may be rising, or the use of some platform may be growing rapidly. The current satisfaction model may no longer apply a month from now, for example once a new feature appears.

Suppose we survey satisfaction with QQ audio and video and build a satisfaction model now. If next month QQ audio and video gains an important feature that greatly raises overall satisfaction, will the path coefficients of the model stay the same? The model may no longer apply, so that this month's satisfaction cannot be compared with next month's and much of the work is wasted. A study of this kind therefore needs repeated surveys and long-term monitoring and correction of the satisfaction model. Only with a stable model can we predict and compare reliably.

6. Attention to detail

A. Questions must not be ambiguous; avoid overly technical vocabulary and leading wording.

B. Options must be clearly distinct from one another (mutually exclusive).

C. Avoid gaps in the options: an "Other" option is required, preferably with an input box. In my experience, every questionnaire yields a lot of information through the "Other" option.

D. The questionnaire should not contain too many questions. Where possible, display the options in random order, especially when there are many of them.

E. During data processing, remove duplicate and self-contradictory responses. It is best to record how long each user takes to fill in the questionnaire; if the total time is very short, the user can be judged not to have answered seriously.

F. Extreme or outlying responses may be considered for deletion.
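The cleaning step in point E might look like the sketch below. The record fields (`user_id`, `seconds`, `answers`) and the 60-second cut-off are assumptions made for the example; a sensible cut-off depends on the length of the questionnaire.

```python
def clean_responses(responses, min_seconds=60):
    """Drop repeat submissions from the same user and responses filled in too fast."""
    seen_users = set()
    kept = []
    for resp in responses:
        if resp["user_id"] in seen_users:   # duplicate: keep only the first submission
            continue
        seen_users.add(resp["user_id"])
        if resp["seconds"] < min_seconds:   # too fast to have been answered seriously
            continue
        kept.append(resp)
    return kept


# Hypothetical raw data.
raw = [
    {"user_id": "u1", "seconds": 240, "answers": [5, 6, 4]},
    {"user_id": "u2", "seconds": 18, "answers": [7, 7, 7]},
    {"user_id": "u1", "seconds": 200, "answers": [5, 6, 4]},
    {"user_id": "u3", "seconds": 95, "answers": [3, 4, 4]},
]
print([r["user_id"] for r in clean_responses(raw)])  # ['u1', 'u3']
```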

Reliability and validity in usability testing

First of all, the moderator should be friendly, chat casually with the participant before the test to build familiarity, and work from a clear and comprehensive test outline. In addition, the following points are important for the reliability and validity of the test.

1. Do not dismiss whimsical ideas

Just as ideas must not be criticised during brainstorming, the moderator in an interview or test must not comment on the user's actions; otherwise the user is likely to hide their true feelings. Pay attention to user errors and record them, but stay neutral when the user makes a mistake.

When trying out a working prototype, users often voice many seemingly whimsical requests. Some cannot be realised at present, but they provide many ideas and directions for future development. We should therefore actively encourage users to think divergently.

2. Cross-check before and after, and compare with competing products

After the test, you can add a short summary questionnaire in which users review and compare the features they tried. It also verifies whether the user's behaviour during the test and their final attitude are consistent. If they are not, ask further about the reason to establish what the user really thinks.

Letting users try competing products during the test and compare them is another way to obtain useful information.

3. Observe keenly

During the test, besides asking the questions in the fixed outline, watch closely for subtle expressions, pauses and moments of thought. We need to know not only how the user rates a feature but also how they think, plan and act while carrying out a task. The user's first reaction, habitual operations and line of thought are worth far more than a simple rating. After the task is completed, ask the user why they did it that way.

4. Record the exact words and confirm habitually

Test conclusions must be supported by the user's exact words; do not casually rephrase what the user said. While talking with the user, make a habit of asking "What do you mean by ...?" or "My understanding is that you mean ..., is that right?" This safeguards the validity of the test conclusions.

5. Conduct in-home visits when necessary

First, an in-home visit greatly reduces the influence of an unfamiliar environment: in their own space, users reflect everyday problems more realistically. Second, in-home visits are usually carried out after user personas have been extracted. Guided by the attributes in the persona, we deliberately select participants with typical attributes for an in-depth, comprehensive and systematic investigation (a typical-case survey). Because such participants are the product's target users, the problems they report are highly representative, so one of them is often worth ten, and interference from non-target users is avoided.

6. Participant criteria and numbers

Test participants are selected according to the characteristics of the target users.

The usual test of whether more sessions are needed is whether new problems are still being found: if so, continue; if not, testing can end.

Nielsen's research shows that testing with 5 users finds about 85% of usability problems. In our own usability tests the number of users has generally been set at 6, which found essentially all the problems. Any such number is only a reference, of course; the right number depends on the specific test (time, resources, cost-benefit ratio). In short, the key is whether new problems are still appearing.
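The 85% figure comes from a simple model published by Nielsen and Landauer: if each participant independently uncovers a given problem with probability p, then n participants uncover a share 1 - (1 - p)^n of the problems. The sketch below uses p = 0.31, the average Nielsen reports across his projects; that value is an assumption here and will differ from product to product.

```python
def share_found(n_users: int, p: float = 0.31) -> float:
    """Expected share of usability problems found by n_users participants."""
    return 1 - (1 - p) ** n_users


for n in (1, 3, 5, 6, 10):
    print(n, f"{share_found(n):.0%}")
```

With p = 0.31, five users give roughly 84% and six users roughly 89%, after which each extra participant adds little. That is consistent with the advice above: stop when sessions no longer reveal new problems.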

Reliability and validity run through the entire user research process. There are certainly many points I have not covered, and I welcome your comments and criticism.

References

1. Zhu Yuancheng, Ma Dong, "On the Application Strategy of Structural Equation Models", [Enterprise Management], 2010

2. Professor Li Le, speech at Tencent, 2010

3. http://www.useit.com/

4. Liu Jinlan, "American Customer Satisfaction Index", [Management Report], 2005

Copyright ©2008–2010


This article is from the Tencent CDC blog, http://cdc.tencent.com; the original link is http://cdc.tencent.com/?p=2897. Please credit the source when reprinting.
