Chapter 6: Building a 311 Data Pipeline
In the previous three chapters, you learned how to use Python, Airflow, and NiFi to build data pipelines. In this chapter, you will use those skills to create a pipeline that connects to SeeClickFix, downloads all the issues for a city, and loads them into Elasticsearch. I currently run this pipeline every 8 hours and use it as a source of open source intelligence: it lets me monitor quality-of-life issues in neighborhoods, as well as reports of abandoned vehicles, graffiti, and needles. It is also really interesting to see what kinds of things people complain to their city about. During the COVID-19 pandemic, for example, my city has seen several reports of people not social distancing at clubs. ...
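Before building the real thing, it can help to see the overall shape of such a pipeline in plain Python. The following is only a minimal sketch under stated assumptions, not the pipeline built in this chapter: the paged response layout (`issues`, `metadata.pagination.pages`), the function names `iter_issues` and `to_bulk_actions`, the index name `scf`, and the `fake_fetch` stand-in are all hypothetical and chosen for illustration. No network call is made; in practice the fetch function would wrap an HTTP request to the issue service.

```python
from typing import Callable, Iterable, Iterator


def iter_issues(fetch_page: Callable[[int], dict]) -> Iterator[dict]:
    """Yield every issue by requesting pages until the last one is reached.

    Assumes (hypothetically) that each response looks like:
    {"issues": [...], "metadata": {"pagination": {"page": 1, "pages": 3}}}
    """
    page = 1
    while True:
        body = fetch_page(page)
        for issue in body.get("issues", []):
            yield issue
        # If the page count is missing, treat the current page as the last.
        pages = body.get("metadata", {}).get("pagination", {}).get("pages", page)
        if page >= pages:
            break
        page += 1


def to_bulk_actions(issues: Iterable[dict], index: str = "scf") -> Iterator[dict]:
    """Wrap each issue in an action dict keyed by its id.

    Using the issue id as the document id means a rerun overwrites
    existing documents instead of duplicating them.
    """
    for issue in issues:
        yield {"_index": index, "_id": issue["id"], "_source": issue}


def fake_fetch(page: int) -> dict:
    """Stand-in for an HTTP call; returns two pages of made-up issues."""
    data = {
        1: [{"id": 1, "summary": "Graffiti"},
            {"id": 2, "summary": "Abandoned vehicle"}],
        2: [{"id": 3, "summary": "Needles"}],
    }
    return {
        "issues": data[page],
        "metadata": {"pagination": {"page": page, "pages": 2}},
    }


actions = list(to_bulk_actions(iter_issues(fake_fetch)))
```

The action dictionaries use the `_index`, `_id`, and `_source` keys so that they could be passed to a bulk-indexing helper such as `elasticsearch.helpers.bulk`. Keeping the fetch step as an injectable function is a design choice that makes the paging logic easy to test without a live service.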