In my previous post, Beer Crawl, I deployed a web crawler to extract information describing over 300 craft beers in British Columbia. After further investigation, however, I realized that the dataset I had collected has a few drawbacks. Perhaps the main one is that it describes the best beers in BC, and as a result their descriptions, ratings and reviews are more or less homogeneous, as this short visualization exercise will soon show.
Initialization and data pre-processing
Let's first load the standard libraries that we will use throughout and then proceed to clean up the raw data: