Before we start querying the Redshift database, we first need to upload some sample data to it. For this scenario, we are going to use a small subset of HTTP request logs that originated from a web server at the NASA Kennedy Space Center in Florida. This data is available for public use and can be downloaded from http://ita.ee.lbl.gov/html/contrib/NASA-HTTP.html.
The log file contains the following columns:
- Host: The host making the request to the web server. This field contains either a fully qualified hostname or an IP address.
- Timestamp: The timestamp of the particular web request. The format is DAY MON DD HH:MM:SS YYYY. This timestamp uses a ...