Some datasets are easy to read but complicated to process further. Take a look at the matches file we saw in Chapter 3:
Match Date;Home Team;Away Team;Result
02/06;Italy;France;2-1
02/06;Argentina;Hungary;2-1
06/06;Italy;Hungary;3-1
06/06;Argentina;France;2-1
10/06;France;Hungary;3-1
10/06;Italy;Argentina;1-0
...
Imagine you want to answer these questions:
- How many teams played?
- Which team scored the most goals?
- Which team won all matches it played?
The dataset is not structured to answer those questions easily: each row describes a match, so every team's data is split between the home and away columns, and the goals are packed into a single Result field. To answer the questions in a simple way, you first have to normalize the data, that is, convert it to a suitable format before proceeding. Let's work on it.
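To make the idea of normalizing concrete, here is a minimal Python sketch. It is only an illustration, not the approach this chapter builds. It uses just the six rows visible in the excerpt above (the real file has more), and the names `RAW`, `normalize`, `scored`, and `conceded` are made up for this example. Each match row becomes two rows, one per team, after which the three questions reduce to simple aggregations.

```python
import csv
import io
from collections import defaultdict

# Assumption: only the six rows shown in the excerpt; the real file continues.
RAW = """Match Date;Home Team;Away Team;Result
02/06;Italy;France;2-1
02/06;Argentina;Hungary;2-1
06/06;Italy;Hungary;3-1
06/06;Argentina;France;2-1
10/06;France;Hungary;3-1
10/06;Italy;Argentina;1-0
"""


def normalize(text):
    """Turn each match row into two rows, one from each team's point of view."""
    rows = []
    for match in csv.DictReader(io.StringIO(text), delimiter=";"):
        # The Result field is "home goals-away goals", e.g. "2-1".
        home_goals, away_goals = (int(g) for g in match["Result"].split("-"))
        rows.append({"date": match["Match Date"], "team": match["Home Team"],
                     "scored": home_goals, "conceded": away_goals})
        rows.append({"date": match["Match Date"], "team": match["Away Team"],
                     "scored": away_goals, "conceded": home_goals})
    return rows


rows = normalize(RAW)

# How many teams played?
teams = {row["team"] for row in rows}

# Which team scored the most goals?
goals = defaultdict(int)
for row in rows:
    goals[row["team"]] += row["scored"]
top_scorer = max(goals, key=goals.get)

# Which team won all the matches it played?
won_all = sorted(
    team for team in teams
    if all(r["scored"] > r["conceded"] for r in rows if r["team"] == team)
)

print(len(teams), top_scorer, won_all)
```

The answers this produces apply only to the six sample rows; with the complete file they may differ. The point is the shape of the data: once there is one row per team per match, none of the questions needs to look at the home and away columns separately.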