In a data project, ensuring data quality is the most important thing.
But as a developer, I have always cared far more about the code than about the data, which is not how it should be.
In data-related projects, data quality matters far more than code.
Understanding the data is more important than optimizing the code; if data quality is guaranteed, optimizing the code is just the icing on the cake.
Responsibility is the soul of safety, and standardization is its safeguard.
Sometimes the development cycle is short and the developer is impatient and skips thorough testing, or the office is hot and dry and everyone is on edge, and code quality slips.
But none of that is the point. What matters is that we have a standardized testing procedure: whatever the circumstances, the data produced by the code must be verified, and the work is not done until it passes the test.
Standardizing our testing lets us be far more confident in our own data quality.
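As a minimal sketch of that idea (not the author's actual procedure, and with hypothetical step and check functions), every pipeline step can be wrapped so that its output must pass a check before the step counts as complete:

```python
# A minimal sketch of a standardized test gate: a step is only "done"
# once its output has been confirmed. Step names and checks are made up.

def run_step(name, step_fn, check_fn, data):
    """Run one pipeline step, then verify its output before moving on."""
    result = step_fn(data)
    if not check_fn(result):
        # Fail loudly: the work is not complete until the output passes.
        raise ValueError(f"Step '{name}' produced output that failed its check")
    return result

if __name__ == "__main__":
    clean = run_step(
        "drop_negative_amounts",
        step_fn=lambda rows: [r for r in rows if r["amount"] >= 0],
        check_fn=lambda rows: all(r["amount"] >= 0 for r in rows),
        data=[{"amount": 10}, {"amount": -3}, {"amount": 7}],
    )
    print(clean)  # [{'amount': 10}, {'amount': 7}]
```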
A small trick of mine: build a small dataset in Excel and, following my own analysis process, work out the correct output by hand, stage by stage.
Then run the code on the same data step by step and compare its output against the Excel results.
If the code passes that comparison, the accuracy of the data can be determined to a large extent.
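The sketch below illustrates that comparison for one step, assuming pandas; the dataset, column names, and file name are hypothetical, and the expected values stand in for what would normally be loaded from the hand-built Excel sheet:

```python
# Compare a step's code-generated output against hand-computed expected values.
import pandas as pd
from pandas.testing import assert_frame_equal

# Small input dataset (in practice, built by hand in Excel).
orders = pd.DataFrame({
    "user_id": [1, 1, 2, 2, 3],
    "amount":  [10.0, 5.0, 8.0, 2.0, 4.0],
})

# Step under test: total amount per user, produced by the code.
actual = (
    orders.groupby("user_id", as_index=False)["amount"]
          .sum()
          .rename(columns={"amount": "total_amount"})
)

# Expected output worked out by hand, stage by stage, in Excel.
# In a real project it could be loaded with:
#   expected = pd.read_excel("expected_totals.xlsx")
expected = pd.DataFrame({
    "user_id": [1, 2, 3],
    "total_amount": [15.0, 10.0, 4.0],
})

# The step only counts as correct if the two match.
assert_frame_equal(actual, expected, check_dtype=False)
print("Step output matches the hand-computed Excel results.")
```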
Because a big data project involves a huge volume of data, it is hard to spot problems just by looking at the data tables; with a small amount of data, problems are easy to find.
But some problems never show up on small data and are only exposed once the volume is large.
What should we do then?
Testing standardization of Big data projects