6

Designing Good Validation

In a Kaggle competition, in the heat of modeling and submitting results, it may seem enough to take the results you get back from the leaderboard at face value. After all, you may think that what counts in a competition is your ranking. This is a common error, made repeatedly in competitions. In fact, you won’t know what the final (private) leaderboard looks like until after the competition has closed, and trusting the public leaderboard is not advisable because it is quite often misleading.
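To make the risk concrete, here is a minimal sketch of how picking submissions by public score can mislead. It is a toy simulation, not a description of any real competition: the 30% public split, the number of rows, the number of submissions, and the function name `simulate_leaderboard` are all assumptions chosen for illustration. Every simulated submission is a pure random guess, so its true accuracy is about 0.5.

```python
import random


def simulate_leaderboard(n_rows=2000, public_frac=0.3, n_submissions=200, seed=0):
    """Toy model: score random-guess submissions on a public/private split.

    All parameters are illustrative assumptions, not Kaggle settings.
    Returns the public and private accuracy of the submission that
    looked best on the public part.
    """
    rng = random.Random(seed)
    labels = [rng.randint(0, 1) for _ in range(n_rows)]
    n_public = int(n_rows * public_frac)

    best_public, best_private = -1.0, None
    for _ in range(n_submissions):
        # Each submission is a coin flip per row, so it has no real skill.
        preds = [rng.randint(0, 1) for _ in range(n_rows)]
        hits = [int(p == y) for p, y in zip(preds, labels)]
        public_acc = sum(hits[:n_public]) / n_public
        private_acc = sum(hits[n_public:]) / (n_rows - n_public)
        # Keep the submission that ranks highest on the public part only.
        if public_acc > best_public:
            best_public, best_private = public_acc, private_acc
    return best_public, best_private


public_acc, private_acc = simulate_leaderboard()
print(f"best public accuracy: {public_acc:.3f}")
print(f"its private accuracy: {private_acc:.3f}")
```

Under these assumptions, the submission chosen for its public score tends to look better than chance on the public rows and fall back toward 0.5 on the private rows, simply because it was selected among many tries on a small sample.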

In this chapter, we will introduce you to the importance of validation in data competitions. You will learn about:

  • What overfitting is and how a public leaderboard can be misleading
  • The dreadful shake-ups ...

