(i) Conventional methods of estimating test workload
As a manager, have you been asked how much time and how many person-days of testing a project will take? As an ordinary tester, have you been asked how long it will take to complete a task or a regression run? Most people in the software industry are confronted with the question of workload estimation at some point. So what did you answer? Were you confident in your answer? And did the actual time spent turn out to be quite different from the original estimate?
Different people use many different methods to estimate and schedule their testing effort, and different organizations choose methods depending on the type of project, its inherent risk, the technology involved, and so on. But most of the time the test effort is lumped together with the development effort, without being given as a separate number.
First, let's take a look at some of the usual ways to estimate the amount of testing:
1. Ad-hoc method
With this method, the test effort is not based on any definite criteria. Work simply continues until a timetable set by management or marketing staff is reached, or until the budget runs out.
This situation is common in very immature organizations, and estimation errors of 100% are frequent.
2. Percentage of development time method
The basic premise of this approach is that the test workload depends on the development time/development effort. First, the development effort is estimated, for example with the LOC or FP method, and then some heuristic is used to derive the test workload from it. The ratio varies greatly and is usually based on previous experience.
Typically, about 35% of the total project time is reserved for testing (a small worked example follows the list):
- 5-7% for component and integration testing
- 18-20% for system testing
- 10% for acceptance testing (or regression testing, etc.)
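As an illustration of how these bands translate into a plan, here is a minimal sketch that applies them to a hypothetical 100 person-day project; the total effort and the exact percentages chosen are assumptions for illustration, not rules from the text.

```python
# Minimal sketch: split a hypothetical 100 person-day project budget using
# the rough percentage bands above; the percentages are indicative, not rules.
total_effort_days = 100
test_share = {
    "component and integration testing": 0.06,  # 5-7%
    "system testing": 0.19,                     # 18-20%
    "acceptance / regression testing": 0.10,    # ~10%
}
for phase, share in test_share.items():
    print(f"{phase}: {share * total_effort_days:.0f} person-days")
print(f"total testing: {sum(test_share.values()) * total_effort_days:.0f} person-days")
```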
3. Analogy (empirical value method or historical data method)
Estimate the workload based on experience or historical data accumulated from previous, similar projects (similar mainly in nature, domain, and scale). The accuracy of an analogy estimate depends on the completeness and accuracy of the historical project data; one precondition for using the analogy method is therefore a good post-project review and analysis mechanism, so that reliable historical data is available. The following historical data needs to be collected (a small sketch of how such data can be applied follows the list):
- Time spent in the design and implementation phases
- The size of the test work, such as the number of user requirements, number of pages, or function points
- Data model size, such as the number of entities and fields
- Number of screens or fields
- The size of the test object, such as KLOC
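As a minimal sketch of the analogy idea, the snippet below derives a productivity figure from invented historical projects and applies it to a new project of comparable nature; all numbers are hypothetical.

```python
# Minimal sketch of analogy-based estimation: derive a productivity figure
# from historical projects and apply it to the new project's size.
# All numbers are hypothetical.
historical = [
    # (size in function points, test effort in person-days)
    (220, 55),
    (340, 90),
    (180, 40),
]
effort_per_fp = sum(e for _, e in historical) / sum(s for s, _ in historical)
new_project_fp = 300
print(f"estimated test effort: {effort_per_fp * new_project_fp:.0f} person-days")
```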
4. WBS (work breakdown structure) estimation method
The project or product is broken down into specific tasks, the time for each task is estimated individually, and the sum of these estimates is the test effort/time for the project or product, as in the sketch below.
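A minimal sketch of the WBS idea, with an invented task breakdown and per-task estimates in person-days:

```python
# Minimal sketch of WBS-based estimation: per-task estimates (person-days)
# are simply summed. The breakdown and numbers are hypothetical.
wbs_estimates = {
    "test planning": 3,
    "test case design": 10,
    "test environment setup": 2,
    "test execution (2 cycles)": 12,
    "defect retest and regression": 6,
    "test reporting": 2,
}
print(f"total test effort: {sum(wbs_estimates.values())} person-days")
```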
5. Delphi method
Delphi is the most popular expert-estimation technique; in the absence of historical data, it reduces the variance of the estimates. The Delphi method encourages the participants to discuss the issues with one another, and it requires people with a variety of relevant experience to take part and persuade each other.
The steps of the Delphi method are:
1. The coordinator gives each expert the project specification and an estimation form;
2. The coordinator convenes a group meeting where the experts discuss the factors related to size;
3. Each expert fills in the iteration form anonymously;
4. The coordinator summarizes the estimates and returns the summary to the experts on the iteration form;
5. The coordinator convenes a group meeting to discuss the larger differences between the estimates;
6. The experts review the estimate summary and submit another anonymous estimate on the iteration form;
7. Steps 4-6 are repeated until the lowest and highest estimates converge.
6. PERT estimation method
PERT requires three different estimates for each project activity (or for the product as a whole): the most likely value, a minimum (optimistic) value, and a maximum (pessimistic) value. From these three figures, PERT derives a statistical estimate of the expected size and its standard deviation; applied to lines of code, for example, it yields the expected value E and the standard deviation SD.
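A minimal sketch of the usual three-point PERT formulas, E = (a + 4m + b) / 6 and SD = (b - a) / 6, where a is the optimistic (minimum) estimate, m the most likely, and b the pessimistic (maximum); the sample numbers below are invented.

```python
# Minimal sketch of a PERT (three-point) estimate.
def pert_estimate(optimistic, most_likely, pessimistic):
    expected = (optimistic + 4 * most_likely + pessimistic) / 6
    std_dev = (pessimistic - optimistic) / 6
    return expected, std_dev

# Hypothetical example: estimating size in KLOC (works the same for person-days).
e, sd = pert_estimate(8, 12, 20)
print(f"E = {e:.1f} KLOC, SD = {sd:.1f} KLOC")
```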
(ii) Code line analysis method
Test workload estimates are often closely tied to the estimated size of the software to be developed. Many software companies estimate that size first and derive the project's overall workload estimate from it. This approach suits companies or projects with accumulated experience and a stable development and testing model, where it can yield a fairly accurate, reference-quality number. At the same time, because it relies entirely on its premise, the development workload estimate, it is fragile: if the development estimate deviates badly, the derived test workload becomes useless. The line-of-code measure itself also has many problems and limitations, so whichever estimation method is chosen, it is necessary to understand its principles and make selective use of its strengths and weaknesses.
"Line of code" (LOC) simply refers to a line of source code, so the code line analysis method measures the number of lines in a software product's source code. But if you think about it, the following questions arise:
- Is the number of physical lines counted, or the number of program statements?
- Are blank lines counted?
- Are comments counted?
- Are predefined files counted?
- How are different versions counted?
- Is there a defined set of rules covering these issues?
- Are the configuration scripts and build scripts used in the development process counted?
- How are shared files (such as the header files of a shared development library) counted?
The general rule nowadays is to count physical lines, excluding blank lines and comments; for the remaining questions, all files under the source root directory are counted. A line of code therefore refers to any executable source line, including deliverable job control language (JCL: Job Control Language) statements, data definitions, data type declarations, equivalence declarations, input/output format declarations, and so on. Commonly used units are SLOC (source lines of code), KLOC (thousand lines of code), LLOC (logical lines of code), PLOC (physical lines of code), NCLOC (non-commented lines of code), and DSI (delivered source instructions). Of these, SLOC and KLOC are the most commonly used.
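As a concrete (and deliberately simplified) illustration of a physical-line count that excludes blank lines and comments, here is a minimal sketch; it assumes Python-style "#" line comments and a single file extension, whereas real counting tools handle block comments and many languages.

```python
import os

# Minimal sketch: count physical, non-blank, non-comment lines (NCLOC)
# under a source root. Assumes Python-style "#" line comments only.
def count_ncloc(root, extensions=(".py",)):
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="ignore") as f:
                for line in f:
                    stripped = line.strip()
                    if stripped and not stripped.startswith("#"):
                        total += 1  # a physical line with real code on it
    return total

# Hypothetical usage: print(count_ncloc("./src"))
```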
The code line analysis method is meaningful to technical staff because it does reflect the size of the software in some way and is physically measurable. But the approach has many problems.
- In the requirements, planning, and design phases there is no code yet, so the line count itself has to be estimated. Overall accuracy is not high unless there are years of experience with similar projects; it depends on data from comparable projects and on the estimator's experience. Only in the coding, testing, and implementation phases can lines be counted directly.
- Lines of code say little about whether the customer's requirements are met or how the work is progressing, so the measure means little to managers, and it is difficult for a project to take action based on tracking the overall line count.
- With visual programming tools, template libraries, and class libraries now widely adopted, programs contain large amounts of automatically generated code, complex generated configuration scripts, and resource files. For projects using such tools, the value obtained from code line analysis is greatly diminished.
- There is also no trustworthy way to convert line counts between different programming languages.
Although the line-of-code approach has many drawbacks, it is recommended to use code lines as a reference and a complement to software project management due to its ease of use and low operating costs (if appropriate support tools are used).
(iii) COCOMO model
As a size-based estimation method, line-of-code analysis was developed extensively during the 1980s and 1990s, and the industry produced many parametric models for estimating workload and schedule from it. The most famous of these is the COCOMO model, whose latest version is COCOMO II.
COCOMO stands for the COnstructive COst MOdel. It is a reasonably accurate, easy-to-use, model-based cost estimation method first proposed by Barry Boehm in 1981. In essence it is a parametric project estimation method: certain characteristics of the estimation target are used as parameters, and a numerical model built on them predicts the project's cost (much as floor area can serve as the parameter for estimating the overall cost of a house).
In the COCOMO model, the effort adjustment factor (EAF) represents the combined effect of multiple parameters that characterize the project and normalize it against the projects in the COCOMO database. Each parameter can be rated very low, low, nominal, high, or very high. Each parameter is a multiplier, usually with a value between 0.5 and 1.5, and the product of these multipliers enters the cost equation as a factor.
COCOMO provides three model levels that reflect increasing degrees of detail:
- Basic model: a static, single-variable model that calculates software development effort as a function of the estimated number of source lines of code (LOC).
- Intermediate model: on top of the LOC-based effort calculation, the estimate is adjusted using influence factors covering the product, hardware, personnel, and project.
- Detailed model: includes all the characteristics of the intermediate model, but when adjusting the estimate with the factors above it also considers their influence on each step (analysis, design, and so on) of the software engineering process.
According to the application domain of the software, the COCOMO model distinguishes three development modes:
- Organic mode: the project is developed in a familiar, stable environment, has much in common with recently developed projects, is relatively small, and does not require much innovation.
- Embedded mode: the project is tightly constrained by interface requirements; the interfaces are critical to the whole application and considerable innovation is required, for example developing a brand-new game.
- Semidetached mode: an intermediate type between the organic and embedded modes.
The COCOMO model is reasonably accurate and easy to use. The basic quantities it uses are: (1) DSI (delivered source instructions), defined as the number of lines of code excluding comment lines; if one line contains two statements, it counts as one instruction. (2) MM, the development effort, measured in person-months. (3) TDEV, the development schedule, measured in months and determined by the effort. (4) The model also considers 15 factors that affect software effort; by defining these multiplicative cost drivers it can estimate the effort accurately and reasonably.
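As an illustration, here is a minimal sketch of the basic COCOMO equations, MM = a * KDSI^b and TDEV = c * MM^d, using the commonly published basic-model coefficients for the three development modes; the 32 KDSI example is purely illustrative, and in practice the coefficients should be calibrated against local historical data.

```python
# Minimal sketch of basic COCOMO: MM = a * KDSI**b, TDEV = c * MM**d.
# Coefficients are the commonly published basic-model values; the example
# size below (32 KDSI) is hypothetical.
COEFFICIENTS = {
    #                a     b     c     d
    "organic":      (2.4, 1.05, 2.5, 0.38),
    "semidetached": (3.0, 1.12, 2.5, 0.35),
    "embedded":     (3.6, 1.20, 2.5, 0.32),
}

def basic_cocomo(kdsi, mode="organic"):
    a, b, c, d = COEFFICIENTS[mode]
    mm = a * kdsi ** b    # effort in person-months
    tdev = c * mm ** d    # schedule in months
    return mm, tdev

effort, schedule = basic_cocomo(32, "semidetached")
print(f"effort = {effort:.1f} person-months, schedule = {schedule:.1f} months")
```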
But COCOMO also has some serious drawbacks: its inputs rely on judgements made early in analysis, it cannot handle unexpected changes in the environment, its results cannot be used directly and must be calibrated, and it only summarizes the past, so it cannot be calibrated for future situations.
(iv) Function point analysis method (part 1): principles
The function point analysis method (FPA: Function Point Analysis) is a relatively abstract, deliberately designed measurement method; it mainly addresses how to measure the size of software in an objective, impartial, and repeatable way.
FPA was introduced in the 1970s by IBM engineer Allan Albrecht and was subsequently carried forward by the International Function Point Users Group (IFPUG). It measures the size of a system from the two angles of system complexity and system characteristics. Its distinguishing features are that it "measures the size of the system once the external model is determined" and that it "measures the size of the system from the user's perspective". Function points can be counted from requirements documents, design documents, source code, or test cases, and they can be converted to lines of code, with the conversion depending on the method and the programming language. Several function point estimation methods have become international standards through ISO, for example: (1) the full function point method (Full Function Points) by Alain Abran and others in Canada; (2) the IFPUG function point method (IFPUG Function Points) proposed by the International Function Point Users Group; (3) the Mark II function point method (Mark II Function Points) proposed by the United Kingdom Software Metrics Association (UKSMA); (4) the NESMA function point method proposed by the Netherlands Function Point Users Group (NEFPUG); and (5) the COSMIC-FFP method proposed by the Common Software Measurement International Consortium (COSMIC). These methods are all developments and refinements of Albrecht's function point method.
The function point analysis method consists of two parts. One part is the concrete steps and techniques of measurement, usually called functional size measurement (FSM); the other part is the application of that measurement. Unless otherwise stated, the two are not discussed separately but are collectively referred to as function point analysis (FPA), which covers both the size measurement activities for the application software and the subsequent use of the measurement results in project management.
Function point analysis has a relatively complete and self-contained set of concepts, with 15 key terms including base functional component (BFC), BFC type, boundary, user, localization, functional domain, functional size, functional size measurement scope, functional size measurement process, functional size measurement method, functional requirements, quality requirements, technical requirements, value adjustment, and adjustment factor.
The basic counts in function point analysis are the numbers of each of the following elements contained in a system (or module), counted according to the standard (a small counting sketch follows the list):
① External inputs (EI: External Input): count each user input that provides application-oriented data to the software. Inputs should be distinguished from inquiries and counted separately.
② External outputs (EO: External Output): count each user output that provides application-oriented information to the user, such as reports, screens, and error messages. Individual data items within a report are not counted separately.
③ External inquiries (EQ: External Inquiry): an inquiry is an online input that causes the software to generate an immediate response in the form of an online output. Each distinct inquiry is counted.
④ Internal logical files (ILF: Internal Logical File): count each logical master file, that is, a logical grouping of data that may be part of a larger database or a standalone file.
⑤ External interface files (EIF: External Interface File): count all machine-readable interfaces, such as data files on tape or disk, used to transfer information from one system to another.
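As a minimal sketch of how these counts turn into a size figure, the snippet below computes unadjusted function points with the IFPUG average-complexity weights and then applies the standard value adjustment factor; the element counts and the general-system-characteristics score are invented for illustration, and a real count rates each element's complexity (low/average/high) individually.

```python
# Minimal sketch of an unadjusted function point (UFP) count using the
# IFPUG average-complexity weights; the element counts are hypothetical.
AVERAGE_WEIGHTS = {"EI": 4, "EO": 5, "EQ": 4, "ILF": 10, "EIF": 7}

counts = {"EI": 12, "EO": 8, "EQ": 5, "ILF": 6, "EIF": 2}  # hypothetical system

ufp = sum(AVERAGE_WEIGHTS[k] * counts[k] for k in AVERAGE_WEIGHTS)

# Adjusted function points: FP = UFP * (0.65 + 0.01 * total of the 14 general
# system characteristics, each rated 0-5). Assume a total GSC score of 30.
vaf = 0.65 + 0.01 * 30
print(f"UFP = {ufp}, adjusted FP = {ufp * vaf:.1f}")
```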
Reprinted from: "How to estimate the test workload", http://www.uml.org.cn/xmgl/201007295.asp