In the previous section, we successfully cleaned our data, which is now neatly accessible in the clean_df_geo DataFrame. If you ran into any problems with the data cleaning process, you can load the dataset from scratch using the clean_df_geo.tsv file provided in this chapter's support files (https://github.com/PacktPublishing/Julia-Programming-Projects/blob/master/Chapter08/data/clean_df_geo.tsv.zip). Once you have unzipped the archive, all you have to do to load it is run the following:
julia> using CSV
julia> clean_df_geo = CSV.read("clean_df_geo.tsv", delim = '\t', nullable = false)
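If you are following along with a newer CSV.jl release than the one used in this chapter, the call above may fail. As far as we know, recent versions expect the target type as a second argument and no longer accept the nullable keyword. The following is a minimal sketch under that assumption, not the chapter's original code. The path variable and the isfile guard are our own additions for illustration.

```julia
using CSV, DataFrames

# Assumed location of the unzipped file; adjust to wherever you extracted the archive.
path = "clean_df_geo.tsv"

if isfile(path)
    # Recent CSV.jl releases take the sink type (here, DataFrame) as the second argument.
    # Missing values are detected automatically, so no `nullable` keyword is passed.
    clean_df_geo = CSV.read(path, DataFrame; delim = '\t')
    println(size(clean_df_geo))  # (rows, columns) of the loaded data
else
    @warn "File not found; unzip clean_df_geo.tsv.zip first" path
end
```

Checking size(clean_df_geo) right after loading is a quick way to confirm that the file was parsed as tab-separated rather than collapsed into a single column.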
So we want to identify the areas with the highest density of businesses. One approach is to use unsupervised machine learning to ...