Experimental Design Methods

Source: Internet
Author: User

In some scientific fields, such as medical and agricultural research, the first step is to design an experiment, then collect data and draw conclusions. The same holds in other areas of data analysis, although what counts as an experiment differs.

Data mining explores data without a definite analytical purpose fixed in advance, looking for unknown, potential relationships in the data that may suggest how to approach a problem. Data analysis is different: the first step is to clarify the purpose of the analysis, then collect and organize data for that purpose, choose a suitable analysis method, and draw conclusions. Throughout this process the only constant is the purpose of the analysis; the remaining steps may be repeated and corrected many times. Experimental design is what ties the whole process together in concrete form.

Experimental design combines theory and practice. Each design calls for its own analysis method, and the design directly determines whether the analysis can be carried out correctly, so it must be treated with care. The data collected under an experimental design are usually summarized in tables, and different designs lead to differently structured tables.

First, here are the common terms used in experimental design.

1. Indicators

An indicator is the result produced under the various experimental factors and levels, that is, the value actually observed in the experiment. It is also called the dependent variable of the experiment. An indicator can be quantitative or qualitative, with continuous or discrete values respectively.

2. Factors

Factors are the causes that influence the experimental indicator; they are also called the independent variables of the experiment and are what we actually want to analyze. When selecting factors for analysis, focus on the main factors and control the minor ones.

3. Level

A level is one of the different states a factor takes during the experiment; it can be described by a number or by a label.

Since the main goal is to analyze how factors affect the experimental indicator, experimental designs can be classified by the number of factors into single-factor designs and designs with two or more factors:

One, single-factor experimental design methods

A single-factor experimental design introduces only one factor, at two or more levels. Note what "introduces" means here: usually many factors affect the experimental indicator, but we choose to analyze only one of them, typically the one that experience suggests has the strongest influence.

1. Completely randomized design method:

Samples are randomly divided into groups, and each group is treated at a different level. Alternatively, samples are drawn at random from different populations and each receives a different level of treatment. Each group or sample must receive only one level of treatment.

This design is characterized by randomness: random sampling, random grouping, and random assignment, so that every group or sample has an equal chance of receiving any level of treatment.

The advantage is that the method is simple.

The disadvantage is that only one factor is analyzed and differences between samples are ignored, which increases the error. The design therefore requires fairly homogeneous samples and is better suited to large samples.

The groups may be of equal or unequal size, but the design is most efficient when the sizes are equal, so at design time the group sizes should be kept as equal as possible. With more than two treatment groups, pairwise comparisons between the groups also have to be addressed.
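The random grouping described above can be sketched in a few lines of Python. This is only an illustration: the function name `completely_randomized`, the subject IDs 1 to 12, and the level labels A, B, C are assumptions made for the example, not part of the original text.

```python
import random

def completely_randomized(subjects, levels, seed=None):
    """Randomly split subjects into one group per level, sizes as equal as possible."""
    rng = random.Random(seed)          # seeded only so the example is repeatable
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    groups = {level: [] for level in levels}
    # Deal the shuffled subjects round-robin so group sizes differ by at most one.
    for i, subject in enumerate(shuffled):
        groups[levels[i % len(levels)]].append(subject)
    return groups

# Hypothetical example: 12 subjects, one factor at three levels.
groups = completely_randomized(range(1, 13), ["A", "B", "C"], seed=0)
for level, members in groups.items():
    print(level, sorted(members))
```

Each subject lands in exactly one group, which matches the requirement that every sample receives only one level of treatment.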

2. Pairing design

We have met paired samples before: two identical or similar individuals receive the same or different experimental treatments. Two key phrases deserve attention:

(1) Two individuals: a design is called pairing only when exactly two individuals are matched. When more than two individuals are matched, it is a compatibility (block) design, described below; pairing is a common special case of compatibility.

(2) Different experimental treatments: the treatments can be different levels of the same factor or the same level of different factors. A pairing design can therefore be single-factor or multi-factor; in a multi-factor design the interaction must be considered.

Note: the factor used for matching cannot be one of the experimental factors under analysis, and it must not interact with the experimental factors.

Pairing designs fall roughly into four kinds:

(1) Before-and-after pairing: the indicator of the same individual is compared before and after the experiment, or after the same individual has received each of two levels of treatment. The emphasis is on the same individual.

(2) Self-pairing: two parts of the same individual are compared before and after the experiment, or after each part has received one of two levels of treatment. The emphasis is on two parts of the same individual. This design appears mainly in medical research (for example, the left and right eyes or the left and right kidneys) and is rarely used in other kinds of analysis.

(3) Different-individual pairing: two individuals with similar or identical conditions are matched. One, as the experimental group, receives the experimental treatment; the other, as the control group, does not; the results are then compared.

(4) Crossover pairing: a special pairing design that introduces stages, so that the effect of time can be built into the design. The procedure is as follows. Following a predetermined sequence, part of the samples receive treatment A in the first stage and treatment B in the second stage (order AB); the remaining samples receive treatment B in the first stage and treatment A in the second stage (order BA). The two treatments thus cross over during the experiment, and more stages and treatments can be added.
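The AB/BA allocation in item (4) can be illustrated with a short Python sketch. The function name `crossover_sequences` and the subject labels are hypothetical; the sketch assumes two treatments, two stages, and an even split between the two orders.

```python
import random

def crossover_sequences(subjects, seed=None):
    """Randomly assign half of the subjects to order AB and the rest to order BA."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    plan = {}
    for subject in shuffled[:half]:
        plan[subject] = ("A", "B")   # stage 1: treatment A, stage 2: treatment B
    for subject in shuffled[half:]:
        plan[subject] = ("B", "A")   # stage 1: treatment B, stage 2: treatment A
    return plan

# Hypothetical example with six subjects.
plan = crossover_sequences(["s1", "s2", "s3", "s4", "s5", "s6"], seed=1)
for subject in sorted(plan):
    print(subject, "->", plan[subject])
```

Every subject receives both treatments, and each treatment appears equally often in each stage, which is what lets the stage (time) effect be separated from the treatment effect.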

The advantage of pairing design is that the differences between the individuals within a pair are close to zero, so the sampling error is very small.

The disadvantage is that it is sometimes difficult to keep the non-treatment factors and other conditions exactly the same; pairing different individuals places especially high demands on the matching.
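To see why pairing reduces error, it helps to look at the within-pair differences directly. The sketch below computes the paired t statistic, which is one common way to analyze a two-treatment paired design; the original text does not name an analysis method, so this choice, the function name `paired_t`, and the numbers are all assumptions for illustration.

```python
import math
import statistics

def paired_t(before, after):
    """Return (mean difference, t statistic, degrees of freedom) for paired data."""
    if len(before) != len(after):
        raise ValueError("paired data must have the same length")
    # Working on within-pair differences removes the between-individual variation.
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)            # sample standard deviation (n - 1)
    t_stat = mean_d / (sd_d / math.sqrt(n))
    return mean_d, t_stat, n - 1

# Hypothetical measurements on five individuals before and after treatment.
before = [10, 12, 9, 11, 13]
after = [12, 15, 10, 14, 14]
mean_d, t_stat, df = paired_t(before, after)
print(mean_d, round(t_stat, 3), df)
```

The individuals differ noticeably from one another (9 to 13 before treatment), but the statistic depends only on the differences within each pair.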

3. Compatibility design

The compatibility design is also called the randomized block design. Identical or similar samples are first matched into groups, called compatibility groups or blocks, and the members of each block are then randomly assigned to the different experimental treatments.

Put another way, the whole experiment is divided into a number of relatively independent units, each of which runs a complete set of treatments. Such a unit is a compatibility group, or block. Conditions within a block should be essentially the same, while there may be obvious differences between blocks. When each block contains only two individuals, the design reduces to a pairing design. Like the pairing design, the compatibility design can be single-factor or multi-factor; in a multi-factor design the interaction must be considered.

Note: the factor used for matching cannot be one of the experimental factors under analysis, and it must not interact with the experimental factors.

Because of the blocking, the variation not explained by the treatments splits into two parts. One part is the within-block error: since conditions within a block are consistent, it can be regarded as random error. The other part is the between-block variation, which can be separated out of the total variation, leaving a purer estimate of the random error.

The advantages are that interference from non-experimental factors is excluded, sampling error is reduced, and efficiency is higher.

The disadvantage is that, because of the matching conditions, it is sometimes difficult to form blocks successfully.
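The way blocking pulls the between-block variation out of the total can be shown with the usual sum-of-squares decomposition for a block design with one observation per block and treatment. The sketch below is an illustration under that assumption; the function name `rbd_sums_of_squares` and the data are hypothetical.

```python
def rbd_sums_of_squares(table):
    """table[i][j] is the response of block i under treatment j (one value per cell)."""
    b = len(table)            # number of blocks
    k = len(table[0])         # number of treatments
    grand = sum(sum(row) for row in table) / (b * k)
    block_means = [sum(row) / k for row in table]
    treat_means = [sum(table[i][j] for i in range(b)) / b for j in range(k)]

    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    ss_block = k * sum((m - grand) ** 2 for m in block_means)
    ss_treat = b * sum((m - grand) ** 2 for m in treat_means)
    # What remains after removing block and treatment variation is random error.
    ss_error = ss_total - ss_block - ss_treat
    return {"total": ss_total, "block": ss_block,
            "treatment": ss_treat, "error": ss_error}

# Hypothetical data: 3 blocks (rows) x 3 treatments (columns).
table = [
    [10, 12, 14],
    [20, 23, 25],
    [30, 31, 35],
]
ss = rbd_sums_of_squares(table)
for name, value in ss.items():
    print(f"{name:10s} {value:8.2f}")
```

In this made-up data set most of the total variation comes from the blocks. Without blocking, that variation would be counted as error and would swamp the treatment differences.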

Two, experimental design methods for two or more factors

1. Latin square design

The Latin square design is mainly used to analyze the effects of three or more factors on the experimental results, although it is most commonly used with three factors. The design works as follows:

The factor to be analyzed is taken as the treatment factor and represented by Latin letters, while the other two factors are represented by the rows and columns. Together they form a square array of data, hence the name Latin square design; with four treatment levels, for example, the square is 4×4. The assignment of treatment levels to the letters must be randomized.

The Latin square design has some prerequisites:

(1) There is no interaction between the factors, or the interaction is negligible
(2) All factors must have the same number of levels
(3) The data cannot have missing values
(4) No level of a factor may be repeated: each treatment letter appears exactly once in every row and every column

Advantages of the Latin square design:

In the compatibility design, blocking removes systematic error. If the systematic error comes from two directions, blocks have to be set up in both directions to eliminate it. The rows and columns of a Latin square are blocks in two directions, so the experiment is blocked and balanced further, the differences between treatments show more fully, and efficiency is higher.

The disadvantage is that the prerequisites listed above are highly restrictive.
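A randomized square of this kind can be sketched as follows. The construction used here (start from a cyclic square, then shuffle rows, columns, and letter labels) is one simple approach chosen for illustration; it is not taken from the original text, and the function names are hypothetical.

```python
import random

def random_latin_square(treatments, seed=None):
    """Build an n x n square in which each treatment occurs once per row and column."""
    rng = random.Random(seed)
    n = len(treatments)
    # Cyclic base square: row r is the sequence 0..n-1 shifted by r.
    square = [[(r + c) % n for c in range(n)] for r in range(n)]
    rng.shuffle(square)                                   # random row order
    cols = list(range(n))
    rng.shuffle(cols)                                     # random column order
    square = [[row[c] for c in cols] for row in square]
    labels = list(treatments)
    rng.shuffle(labels)                                   # random letter assignment
    return [[labels[v] for v in row] for row in square]

def is_latin_square(square):
    """Check that every row and every column contains each symbol exactly once."""
    n = len(square)
    symbols = set(square[0])
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(n)} == symbols for c in range(n))
    return len(symbols) == n and rows_ok and cols_ok

# Hypothetical example: four treatment levels labelled A-D.
square = random_latin_square(["A", "B", "C", "D"], seed=42)
for row in square:
    print(" ".join(row))
print(is_latin_square(square))
```

Rows and columns stand for the two blocking factors, and the letter in each cell is the treatment level applied to that row and column combination.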

2. Factorial design

A factorial design combines all levels of two or more factors exhaustively, so that every combination of levels is tested. It allows us to analyze:

(1) The difference between the levels of one factor at a fixed level of the other factors (simple effect)
(2) The average difference between the levels of one factor, taken over all levels of the other factors (main effect)
(3) How the simple effects of one factor change across the levels of another factor, from which the best combination can be identified (interaction)
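The three kinds of effect can be made concrete with a small 2×2 example. The cell means below are invented for illustration, and the function names are hypothetical. The interaction is computed here as the plain difference between the two simple effects; some texts divide this contrast by 2, so treat the scaling as a convention.

```python
# Hypothetical cell means for a 2x2 factorial: factor A (a1, a2) x factor B (b1, b2).
cell_means = {
    ("a1", "b1"): 10.0, ("a2", "b1"): 14.0,
    ("a1", "b2"): 12.0, ("a2", "b2"): 22.0,
}

def simple_effect_of_a(means, b_level):
    """Difference between a2 and a1 at one fixed level of B."""
    return means[("a2", b_level)] - means[("a1", b_level)]

def main_effect_of_a(means, b_levels=("b1", "b2")):
    """Average of the simple effects of A over all levels of B."""
    effects = [simple_effect_of_a(means, b) for b in b_levels]
    return sum(effects) / len(effects)

def interaction_ab(means):
    """Change in the simple effect of A when B moves from b1 to b2."""
    return simple_effect_of_a(means, "b2") - simple_effect_of_a(means, "b1")

print(simple_effect_of_a(cell_means, "b1"))  # 4.0
print(simple_effect_of_a(cell_means, "b2"))  # 10.0
print(main_effect_of_a(cell_means))          # 7.0
print(interaction_ab(cell_means))            # 6.0
```

Because the simple effect of A is larger at b2 than at b1, the two factors interact, and the combination (a2, b2) gives the highest mean in this made-up table.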

When choosing a factorial design, pay attention to several points:

(1) Each combination of levels needs at least two independent replicates
(2) In the experiment itself, all factors are applied at the same time; in other words, the factors are not applied in stages or batches
(3) In the statistical analysis, all factors are treated as equally important in their effect on the observed indicator

Factorial design is a comprehensive and efficient method, but because it tests every combination, the number of factors and levels should not be too large, otherwise the computation becomes very cumbersome. In general, no more than 4 factors is best.
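The growth in the number of runs is easy to see by enumerating the combinations. The sketch below lists every run of a full factorial design with replicates; the factor names, levels, and the function name `full_factorial` are hypothetical.

```python
from itertools import product

def full_factorial(factors, replicates=2):
    """List every level combination, repeated `replicates` times."""
    names = list(factors)
    runs = []
    for combo in product(*(factors[name] for name in names)):
        for rep in range(1, replicates + 1):
            run = dict(zip(names, combo))
            run["replicate"] = rep
            runs.append(run)
    return runs

# Hypothetical factors: 2 x 3 x 2 levels, two replicates per combination.
factors = {
    "temperature": ["low", "high"],
    "pressure": ["low", "mid", "high"],
    "catalyst": ["X", "Y"],
}
runs = full_factorial(factors, replicates=2)
print(len(runs))   # 2 * 3 * 2 combinations * 2 replicates = 24
print(runs[0])
```

Adding a fourth factor with three levels would triple the count to 72 runs, which is why the number of factors and levels has to stay small.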

3. Orthogonal design

The drawback of factorial design is that it requires too many runs, some of which carry little information and need not be performed. An orthogonal design uses an orthogonal table to select, from the full set of combinations, a scientifically representative subset of level combinations to run; it is a fractional implementation of the factorial design.

Orthogonal design is especially suitable when there are many factors and each factor has relatively few levels.

Orthogonal tables used in orthogonal design can be divided into:

(1) Equal-level orthogonal tables

Orthogonal tables in which every factor has the same number of levels, written L_n(r^m):

L: symbol for an orthogonal table
n: number of rows in the table (number of trials)
r: number of levels of each factor
m: number of columns in the table (maximum number of factors that can be arranged)

Equal-level orthogonal tables have the following characteristics:

<1> In any column of the table, each number appears the same number of times
<2> In any two columns of the table, each ordered pair of numbers in the same row appears the same number of times
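Both characteristics can be checked mechanically. The sketch below builds a table with 9 rows, 4 columns, and 3 levels (the shape usually written L9(3^4)) using arithmetic modulo 3, then tests the two properties. The modular construction is an assumption made for illustration and may list its rows in a different order from published tables; the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

def build_l9():
    """A 9-run, 4-column, 3-level table built with arithmetic modulo 3 (levels 1..3)."""
    return [[a + 1, b + 1, (a + b) % 3 + 1, (a + 2 * b) % 3 + 1]
            for a in range(3) for b in range(3)]

def is_orthogonal(table):
    """Check characteristics <1> and <2> of an equal-level orthogonal table."""
    columns = list(zip(*table))
    # <1> within each column, every level appears equally often
    for col in columns:
        if len(set(Counter(col).values())) != 1:
            return False
    # <2> for each pair of columns, every ordered pair of levels appears equally often
    for c1, c2 in combinations(columns, 2):
        pair_counts = Counter(zip(c1, c2))
        expected_pairs = len(set(c1)) * len(set(c2))
        if len(pair_counts) != expected_pairs or len(set(pair_counts.values())) != 1:
            return False
    return True

table = build_l9()
for row in table:
    print(row)
print(is_orthogonal(table))
```

Four factors at three levels would need 3^4 = 81 runs as a full factorial; the table above covers them in 9 runs while keeping every pair of columns balanced.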

(2) Mixed-level orthogonal tables

Orthogonal tables in which the factors do not all have the same number of levels

Basic principles for selecting an orthogonal table

<1> Determine the experimental factors, levels, and interactions; main factors can be given more levels and secondary factors fewer
<2> Check the levels: if all factors have the same number of levels, choose an equal-level orthogonal table; otherwise choose a mixed-level orthogonal table
<3> Each interaction should occupy one or two columns of its own in the orthogonal table
<4> If the experiment requires high precision, choose an orthogonal table with more runs
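A rough way to apply these principles is to count degrees of freedom: each factor with r levels uses r - 1, each two-factor interaction uses the product of the two factors' degrees of freedom, and the table needs at least one more row than the total. This rule of thumb is a standard result added here for illustration rather than something stated in the original text, and the function name `minimum_runs` is hypothetical.

```python
def minimum_runs(levels, interactions=()):
    """Lower bound on the number of rows an orthogonal table must have.

    levels: dict mapping factor name -> number of levels
    interactions: iterable of (factor, factor) pairs to be estimated
    """
    df = sum(r - 1 for r in levels.values())
    df += sum((levels[a] - 1) * (levels[b] - 1) for a, b in interactions)
    return df + 1

# Hypothetical study: four factors, three levels each.
levels = {"A": 3, "B": 3, "C": 3, "D": 3}
print(minimum_runs(levels))                 # 9  -> a 9-run table such as L9(3^4) is enough
print(minimum_runs(levels, [("A", "B")]))   # 13 -> a larger table such as L27(3^13) is needed
```

The bound is only a necessary condition: the chosen table must also have enough free columns to hold each factor and each interaction.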

4. Uniform design

Orthogonal design is particularly suitable when there are many factors with relatively few levels each. It selects test points according to orthogonality, which gives them two properties: the test points are evenly dispersed, and they are neatly and regularly arranged. However, when the number of factors or levels is large, an orthogonal design still requires many runs, and because it has to keep the arrangement neat, it does not achieve fully uniform dispersion.

Uniform design considers only uniform dispersion and ignores neat arrangement. Its test points are therefore spread more uniformly than those of an orthogonal design and are more representative, and dropping the neatness requirement greatly reduces the number of runs.
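One commonly used way to build a uniform design table is the good lattice point method, sketched below. The original text does not say how the tables are constructed, so this method, the function name `good_lattice_point_design`, and the generator values are assumptions for illustration.

```python
from math import gcd

def good_lattice_point_design(n, generators):
    """Return an n-run table; column j holds (i * h_j) mod n, with 0 written as n."""
    for h in generators:
        if gcd(h, n) != 1:
            raise ValueError(f"generator {h} must be coprime with n={n}")
    design = []
    for i in range(1, n + 1):
        # Multiplying by a generator coprime with n permutes the levels 1..n.
        design.append([(i * h) % n or n for h in generators])
    return design

# Hypothetical example: 7 runs, 3 factors, 7 levels per factor.
design = good_lattice_point_design(7, [1, 2, 3])
for row in design:
    print(row)
```

Each factor takes every one of its 7 levels exactly once, so three factors at 7 levels are covered in 7 runs instead of the 343 a full factorial would need. In practice the generators are chosen to minimize a discrepancy measure, which this sketch does not attempt.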

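A uniform design table is denoted U_n(q^s), by analogy with the orthogonal table notation: U stands for uniform design, n is the number of runs, q is the number of levels of each factor, and s is the number of columns.
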
Uniform design suits experiments with many factors, so it is used mostly in the preliminary stage of a study, when many factors are examined over wide ranges.

The biggest disadvantage of uniform design is that the analysis is computationally complex: it requires regression analysis, typically fitting a quadratic response surface or another nonlinear regression model.
