Why do we need a validation set and test set? What is the difference between them?

1 Answer


When training a model, we divide the available data into three separate sets:

The training dataset is used for fitting the model's parameters. However, the accuracy achieved on the training set is not a reliable predictor of how accurate the model will be on new samples, because the model has already seen those examples.

The validation dataset is used to measure how well the model does on examples that weren't part of the training dataset. The metrics computed on the validation data can be used to tune the hyperparameters of the model. However, every time we evaluate the model on the validation data and make decisions based on those scores, we leak information from the validation data into our model. The more evaluations, the more information is leaked. We can therefore end up overfitting to the validation data, and once again the validation score won't be a reliable predictor of the model's behaviour in the real world.

The test dataset is used to measure how well the model does on previously unseen examples. It should be used only once, after the hyperparameters have been tuned using the validation set.
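The three-way split described above can be sketched as follows. This is a minimal illustration using shuffled indices; the 60/20/20 ratio and the dataset size are illustrative choices, not prescriptions from the answer.

```python
import numpy as np

# Hypothetical dataset of 100 samples; the indices stand in for real data.
rng = np.random.default_rng(0)
indices = rng.permutation(100)

# 60/20/20 split into train / validation / test (ratio is illustrative).
train_idx = indices[:60]
val_idx = indices[60:80]
test_idx = indices[80:]

# The three sets are disjoint, so no example used for fitting or
# hyperparameter tuning ever appears in the test set.
print(len(train_idx), len(val_idx), len(test_idx))  # 60 20 20
```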

So if we omit the test set and use only a validation set, the validation score won't be a good estimate of the model's generalization performance.
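The hyperparameter-tuning role of the validation set can be sketched with a toy example: fitting polynomials of several degrees to noisy data and keeping the degree with the lowest validation error. The data, degrees, and split sizes here are all hypothetical, chosen only to illustrate the mechanism.

```python
import numpy as np

# Toy data: y = x^2 plus noise (illustrative, not from the answer above).
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 60)
y = x**2 + rng.normal(0, 0.05, 60)

# Simple train/validation split (no test set in this tuning sketch).
x_train, y_train = x[:40], y[:40]
x_val, y_val = x[40:], y[40:]

# Treat the polynomial degree as the hyperparameter to tune:
# fit on the training set, score on the validation set.
best_degree, best_err = None, float("inf")
for degree in (1, 2, 5, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    err = np.mean((np.polyval(coeffs, x_val) - y_val) ** 2)
    if err < best_err:
        best_degree, best_err = degree, err

print(best_degree, best_err)
```

Each pass through this loop uses the validation score to make a decision, which is exactly the information leakage the answer warns about; after enough such decisions, only a held-out test set gives an honest estimate.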
