Big data has become synonymous with our era: today's internet belongs to the age of big data. Its arrival has overturned our old, habitual ways of thinking about data. Guaranteeing data correctness, software quality, test quality, data usage scenarios, and so on all need to be re-examined from a new, more comprehensive perspective on the software.
Before big data, such systems were rarely tested; developers would say, "The test environment doesn't have that much data, so how can you test it?" Setting aside big data's defining characteristic of sheer volume, it fundamentally exists to serve the business. There is a saying I very much agree with: all technology serves the business, and technology divorced from the business is worthless. This still applies in today's big data era, and will continue to apply. The job of testing is to ensure that the data is correct and that the business logic is correct.

A big data script also has inputs and outputs. This is somewhat similar to testing backend logic in functional testing: there is no interface, everything is processed on the backend servers, and testers must clearly understand the entire processing flow, every data flow, and the input and output of every step in order to judge whether the final output is correct. The same goes for big data testing: we need to understand the function of each script, the input and output of each script, and the overall data flow in order to judge whether the big data implementation is correct.
For a data script or a piece of data-calculation logic to run correctly on a big data platform, its function must first be correct; that is the first thing we testers have to guarantee. Today I want to discuss, from the perspective of functional testing, how to test big data functionality and how to design test cases, so as to guarantee its correctness more broadly and more effectively.
1. Writing test cases
Equivalence classes and boundary values are the most common methods for writing functional test cases (between them they are estimated to cover most testing work), and they can also be used to write big data test cases. Unlike functional testing in the usual sense, however, the input here is no longer an input box; it is a database field, or a data set with a special meaning (containing multiple records).
Let's first review these two commonly used test-design methods. Equivalence class partitioning divides the input domain into subsets such that, within each subset, every input is equivalent for the purpose of revealing defects in the program; it is therefore reasonable to assume that testing one representative value of an equivalence class is equivalent to testing any other value of that class. By dividing all the input data into a few equivalence classes and taking one value from each as a test input, we can obtain good test results with a small number of representative test data. For example, if an input field accepts integers from 1 to 100, the range 1 to 100 forms one valid equivalence class, while values below 1 and above 100 form two invalid classes. Boundary value analysis supplements equivalence class partitioning: its test cases come from the boundaries of each equivalence class, such as 0, 1, 100, and 101 in the example above.
So how are these two methods used when writing big data test cases?
Take a big data script we tested earlier as an example. Its main function is to total the order amounts of a designated store for a designated day, and to calculate the store's daily profit according to the different rebate rules configured for each item (a sketch of such a script follows the list of conditions below).
First, analyze the input conditions:
1. A designated store
2. A designated day
3. Different rebates for different items at different times:
Item 1: 2016.12.6 13:00:00 to 2016.12.6 15:00:00, rebate is 5%
Item 2: 2016.12.7 00:00:00 to 2016.12.7 23:59:59, rebate is 15%
All items, outside the specified time windows, rebate is 1%
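To make this concrete, here is a minimal sketch of what such a script might look like in Hive SQL. The `orders` table, its columns, and the formula treating daily profit as order amount times rebate rate are all assumptions for illustration, not the actual script we tested:

```sql
-- Minimal sketch (Hive SQL) of the script under test; table `orders`,
-- its columns, and the profit formula are hypothetical.
-- ${hivevar:store_id} and ${hivevar:target_day} are the designated store and day.
SELECT
    store_id,
    to_date(order_time) AS order_day,
    SUM(order_amount)   AS total_amount,
    SUM(order_amount *
        CASE
            -- Item 1: 5% rebate inside its time window
            WHEN item_id = 1
                 AND order_time BETWEEN '2016-12-06 13:00:00'
                                    AND '2016-12-06 15:00:00' THEN 0.05
            -- Item 2: 15% rebate inside its time window
            WHEN item_id = 2
                 AND order_time BETWEEN '2016-12-07 00:00:00'
                                    AND '2016-12-07 23:59:59' THEN 0.15
            -- All other items and times: default 1% rebate
            ELSE 0.01
        END)            AS daily_profit
FROM orders
WHERE store_id = ${hivevar:store_id}
  AND to_date(order_time) = '${hivevar:target_day}'
GROUP BY store_id, to_date(order_time);
```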
Here an equivalence class is no longer a single input value but a condition. Data that satisfies the condition we place in a valid equivalence class; data that does not satisfy it we place in an invalid equivalence class; and the data sitting on the condition's boundary gives us our boundary values.
Use case partitioning results:
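Reconstructed from the rules above, the partitioning might look something like this (the representative values are illustrative):

| Partition | Type | Representative order data | Expected rebate |
| --- | --- | --- | --- |
| Item 1 ordered inside 12.6 13:00:00 to 15:00:00 | Valid equivalence class | Item 1, 2016.12.6 14:00:00 | 5% |
| Item 2 ordered inside 12.7 00:00:00 to 23:59:59 | Valid equivalence class | Item 2, 2016.12.7 12:00:00 | 15% |
| Any item ordered outside the windows above | Valid equivalence class | Item 1, 2016.12.6 16:00:00 | 1% |
| Order from a different store | Invalid equivalence class | Another store, any time | Excluded from the result |
| Order on a day other than the designated day | Invalid equivalence class | Designated store, another day | Excluded from the result |
| Edges of the time windows | Boundary values | 2016.12.6 13:00:00 / 15:00:00, 2016.12.7 00:00:00 / 23:59:59, and one second outside each | 5% / 15% on the edge, 1% just outside |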
Other methods of writing functional test cases, such as scenario analysis and branch coverage, can also be used to write big data test cases. No test can be divorced from the actual business: test data, or test input, taken in isolation is meaningless. We must combine different scenarios to design more comprehensive and more efficient test cases.
2. Preparing test data
Prepare different types of test data according to the test cases you have written. As with functional testing, what matters is not how much test data you have but how comprehensive its coverage is. If you prepare thousands of records but they are all of the same type, they cover only one branch of the code; only one of those records can be called effective test data, and all the rest are invalid test data.
There are several ways to prepare the test data (a sketch of the first follows the list):
1) Write your own SQL insert statements
2) Use stored procedures
3) Export data from production and import it directly into the test environment.
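For instance, hand-written inserts covering the partitions above might look like the following sketch; the `orders` table and its columns (store_id, item_id, order_time, order_amount) are the same hypothetical ones as in the earlier script:

```sql
-- One representative record per equivalence class, plus boundary values
-- (table and columns are hypothetical; store 12345 is the designated store).
INSERT INTO orders VALUES (12345, 1, '2016-12-06 14:00:00', 200.00); -- inside item 1 window: 5%
INSERT INTO orders VALUES (12345, 1, '2016-12-06 13:00:00', 100.00); -- lower boundary: 5%
INSERT INTO orders VALUES (12345, 1, '2016-12-06 15:00:00', 100.00); -- upper boundary: 5%
INSERT INTO orders VALUES (12345, 1, '2016-12-06 15:00:01', 100.00); -- one second outside: 1%
INSERT INTO orders VALUES (12345, 2, '2016-12-07 12:00:00', 300.00); -- inside item 2 window: 15%
INSERT INTO orders VALUES (12345, 3, '2016-12-06 10:00:00',  50.00); -- other item, default: 1%
INSERT INTO orders VALUES (99999, 1, '2016-12-06 14:00:00', 400.00); -- other store: must be excluded
```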
At the same time, keep the test data as consistent with the real data as possible: for example, whether a time value is accurate to the hour, to the second, or only to the date, and how many decimal places an amount retains.
3. Executing the test script and checking the results
Once the test data is ready, you can execute the test script, whether on the Hadoop platform or on some other platform. This step is just an operation, much like learning how to use a tool: once you know how to run the script, you return to the testing itself. What testers have to do is run the script with the prepared data, check whether the actual results match the expected results, and from that judge whether the script logic is correct, which is exactly the same as in our functional testing.
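Continuing with the hypothetical tables above, and assuming the script writes its result to a table called `store_daily_profit`, a minimal check could compare that output against expected values computed by hand from the test data:

```sql
-- Expected values, computed by hand from the inserted test data
-- for store 12345 on 2016-12-06:
--   total_amount = 200 + 100 + 100 + 100 + 50 = 550.00
--   daily_profit = (200 + 100 + 100) * 0.05 + (100 + 50) * 0.01 = 21.50
SELECT store_id, order_day, total_amount, daily_profit
FROM store_daily_profit
WHERE store_id = 12345
  AND order_day = '2016-12-06';
-- The returned row should show total_amount = 550.00 and daily_profit = 21.50;
-- any mismatch points to an error in the script's rebate logic.
```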
So, regardless of the type of testing, the testing process is universal and the testing methods can be carried over. As long as we have a sufficient foundation of testing theory and testing methods, we can comfortably handle all kinds of different tests.