Chapter 13. Unique Uses of Drill

As you have seen, Apache Drill is capable of querying all kinds of data, large and small, in a variety of different systems. This chapter highlights some unique use cases in which Drill has made complex analysis easy. The first example demonstrates how to use Drill’s suite of geometric functions, together with the image metadata format plug-in, to identify photos taken within a geographic region. Next, you will see a situation in which writing a format plug-in proved very helpful: working with Excel files. Finally, we cover several use cases in which analysts greatly expanded Drill’s functionality by creating UDFs.

Finding Photos Taken Within a Geographic Region

In Drill 1.14, two features were added that made this use case possible: the ability to analyze Exchangeable Image File Format (EXIF) metadata, and a collection of geographic information system (GIS) functions that, among other things, let you create polygons and search for points that fall within a defined geographic area. Drill’s GIS functionality largely follows that of PostGIS.

The first step is to extract the fields that contain the latitude and longitude from the EXIF metadata. The example that follows demonstrates how to access the geocoordinates of an image in Drill (note that not all images contain these fields):

SELECT t.GPS.GPSLatitude AS lat, t.GPS.GPSLongitude AS ...
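
Once latitude and longitude are available, deciding whether a photo falls inside a region comes down to a point-in-polygon test. The following Python sketch illustrates that idea outside of Drill, using the classic ray-casting method. It is not Drill's implementation and does not use Drill's GIS functions. The function name, the region coordinates, and the photo filenames are all hypothetical, and coordinates are treated as planar (longitude, latitude) pairs, which is a reasonable approximation only for small regions.

```python
from typing import Dict, List, Tuple

# A point is a (longitude, latitude) pair, treated as planar x/y (an assumption).
Point = Tuple[float, float]


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: cast a ray east from the point and count edge crossings.

    An odd number of crossings means the point is inside the polygon.
    Points that lie exactly on an edge are not handled specially.
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]  # wrap around to close the ring
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x-coordinate at which the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical rectangular region and photo coordinates, for illustration only.
region: List[Point] = [(-77.2, 38.8), (-76.9, 38.8), (-76.9, 39.0), (-77.2, 39.0)]
photos: Dict[str, Point] = {
    "a.jpg": (-77.04, 38.90),
    "b.jpg": (-122.42, 37.77),
}

# Keep only the photos whose coordinates fall inside the region.
matches = [name for name, coord in photos.items() if point_in_polygon(coord, region)]
```

In this sketch, `matches` contains only the photo whose coordinates lie inside the rectangle. Within Drill itself, the same filtering would be expressed in SQL with the GIS functions rather than in application code.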
