For this step, there are two possible situations: either we have a specific question in mind and need to find an appropriate dataset to answer it, or we already have a dataset whose content gives rise to a question we want to answer.
Either way, we need a specific direction before we start the data science project.
Of course, as we continue to explore a dataset, new questions and insights might come up that alter our original direction. However, it is always better to start with a clear question in mind so that the dataset can be analyzed deliberately.
As an example, we will be working with a dataset provided by Kaggle, ...