Applying ML is a highly iterative process
Idea -> code -> experiment -> idea -> ...
Each pass through the loop is used to adjust the hyperparameters.
Train/dev/test Sets
Datasets are typically divided into train/dev/test sets.
Training set: used to train the model
Hold-out cross-validation set / development (dev) set: used to compare models and tune hyperparameters
Test set: used for the final evaluation
Earlier ML problems: with data on the order of tens of thousands of examples, the usual split was 70/30 train/test or 60/20/20 train/dev/test.
Now, in the big data era: with data on the order of millions of examples, the trend is to shrink the proportion given to the dev and test sets. The dev set only needs to be large enough to tell which of several models is better, and the same reasoning applies to the test set. With a data set of 1,000,000 examples, a 98/1/1 split is a more reasonable division.
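As a rough illustration of the 98/1/1 split above, here is a minimal sketch that shuffles example indices and cuts them into three lists. The function name `split_indices`, the `ratios` parameter, and the fixed `seed` are assumptions made for this example and do not come from the course.

```python
import random

def split_indices(n_examples, ratios=(0.98, 0.01, 0.01), seed=0):
    """Shuffle indices 0..n_examples-1 and cut them into train/dev/test lists.

    `ratios` and `seed` are illustrative defaults, not values fixed by the notes.
    """
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)       # reproducible shuffle
    n_train = round(n_examples * ratios[0])
    n_dev = round(n_examples * ratios[1])
    train = indices[:n_train]
    dev = indices[n_train:n_train + n_dev]
    test = indices[n_train + n_dev:]           # whatever remains goes to test
    return train, dev, test

train, dev, test = split_indices(1_000_000)
print(len(train), len(dev), len(test))  # 980000 10000 10000
```

Even at 1%, the dev and test sets here still hold 10,000 examples each, which is the point of the smaller ratio.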
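Practical tips:
- Make sure the dev and test sets come from the same distribution, so that a model chosen on the dev set is judged against the same target on the test set. This speeds up iteration.
- The training set can be enlarged with data from other sources, such as web crawling, to obtain a larger amount of data.
- It can be acceptable to have no test set and tune on the dev set alone, when an unbiased estimate of final performance is not needed.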
Bias/variance
You can tell whether the model has a high bias or high variance problem by comparing the training set error with the dev set error.
Suppose, for example, that the Bayes error is about 0%, meaning that humans can identify the examples almost perfectly, and that the train and dev sets come from the same distribution.
| Train Set Error | Dev Set Error | Result |
| --- | --- | --- |
| 1% | 11% | High variance |
| 15% | 16% | High bias |
| 15% | 30% | High variance & high bias |
| 0.5% | 1% | Low variance & low bias |
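The four cases in the table can be reproduced with a small check like the one below. This is only a sketch: the function name `diagnose` and the 5-percentage-point threshold `tol` are assumptions chosen so the table rows come out as listed. The notes do not define a numeric cutoff.

```python
def diagnose(train_err, dev_err, bayes_err=0.0, tol=0.05):
    """Label a model from its train/dev errors (given as fractions, e.g. 0.15).

    `tol` is a hypothetical threshold: a gap larger than this is called "high".
    """
    high_bias = (train_err - bayes_err) > tol      # far from the best achievable error
    high_variance = (dev_err - train_err) > tol    # generalizes poorly from train to dev
    if high_bias and high_variance:
        return "high variance & high bias"
    if high_bias:
        return "high bias"
    if high_variance:
        return "high variance"
    return "low variance & low bias"

for train_err, dev_err in [(0.01, 0.11), (0.15, 0.16), (0.15, 0.30), (0.005, 0.01)]:
    print(train_err, dev_err, diagnose(train_err, dev_err))
```

If the Bayes error were higher, say 15%, the second row would no longer count as high bias, which is why `bayes_err` is a parameter here.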
Basic Recipe (basic guidelines)
STEP1 High bias? -> Bigger network, longer training time; (other NN architectures)
STEP2 High variance? -> More data, regularization; (other NN architectures)
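The ordering of the two steps above (address bias first, then variance) can be written as a tiny decision function. This is a sketch only: `next_step` and its boolean arguments are hypothetical names, and deciding whether bias or variance counts as "high" is left to the caller.

```python
def next_step(high_bias, high_variance):
    """Return the next action suggested by the basic recipe.

    Bias is checked first; variance is only addressed once bias is acceptable.
    """
    if high_bias:
        return "STEP1: bigger network, longer training time (or another NN architecture)"
    if high_variance:
        return "STEP2: more data, regularization (or another NN architecture)"
    return "done: low bias and low variance"

print(next_step(high_bias=True, high_variance=True))
print(next_step(high_bias=False, high_variance=True))
print(next_step(high_bias=False, high_variance=False))
```

In practice this check would be repeated after every retraining run until neither condition holds.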
In the deep learning era, there is less need to worry about the trade-off between bias and variance. With reasonable regularization, a larger network reduces bias with little increase in variance, and with a large enough network, getting more data reduces variance with almost no increase in bias. This is one of the reasons deep learning has been so effective in supervised learning.
Wunda Deep Learning Special Course 3 study notes/week1/setting up ML application