Chapter 3. Search Service

So far, given a dataset, we are able to gather the metadata needed to correctly interpret the properties and meaning of its attributes. The next challenge is this: given thousands of datasets across enterprise silos, how do we effectively locate the attributes required to develop an insight? For instance, when developing a revenue dashboard, how do we locate datasets of existing customers, the products they use, pricing and promotions, activity, usage profiles, and so on? Further, how do we locate artifacts such as metrics, dashboards, models, ETLs, and ad hoc queries that can be reused in building the dashboard? This chapter focuses on finding the relevant datasets (tables, views, schemas, files, streams, and events) and artifacts (metrics, dashboards, models, ETLs, and ad hoc queries) during the iterative process of developing insights.

A search service simplifies the discovery of datasets and artifacts. With a search service, data users express what they are looking for using keywords, wildcard searches, business terminology, and so on. Under the hood, the service does the heavy lifting of discovering sources, indexing datasets and artifacts, ranking results, enforcing access governance, and managing continuous change. Data users get back a list of the datasets and artifacts most relevant to their search query. The success criterion for such a service is a reduced time to find. Speeding up time to find significantly improves time to insight, as ...
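To make the indexing, ranking, and access-governance steps concrete, here is a minimal in-memory sketch. It is not the design this chapter goes on to describe: the `CatalogEntry` and `CatalogSearch` names, the `allowed_groups` field, the sample entries, and the scoring rule (one point per matched query term) are all assumptions chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from fnmatch import fnmatchcase
import re


@dataclass(frozen=True)
class CatalogEntry:
    """Hypothetical record for one dataset or artifact."""
    name: str
    kind: str                  # e.g. "table", "dashboard", "metric"
    description: str
    allowed_groups: frozenset  # groups permitted to see this entry


class CatalogSearch:
    """Toy search service: inverted index, wildcard terms, access filter."""

    def __init__(self):
        self._entries = {}
        self._index = defaultdict(set)  # token -> names of matching entries

    @staticmethod
    def _tokens(text):
        # Lowercase and split on anything that is not a letter or digit.
        return re.findall(r"[a-z0-9]+", text.lower())

    def add(self, entry):
        """Index an entry by the tokens in its name and description."""
        self._entries[entry.name] = entry
        for token in self._tokens(entry.name + " " + entry.description):
            self._index[token].add(entry.name)

    def search(self, query, user_groups):
        """Return visible entry names, best match first."""
        scores = defaultdict(int)
        for term in query.lower().split():
            if "*" in term or "?" in term:
                # Wildcard term: compare against every indexed token.
                tokens = [t for t in self._index if fnmatchcase(t, term)]
            else:
                tokens = [term] if term in self._index else []
            hits = set()
            for token in tokens:
                hits |= self._index[token]
            for name in hits:
                scores[name] += 1  # assumed ranking: one point per term
        groups = set(user_groups)
        visible = [n for n in scores if self._entries[n].allowed_groups & groups]
        # Highest score first; ties broken alphabetically for stable output.
        return sorted(visible, key=lambda n: (-scores[n], n))


# Illustrative catalog; these entries are invented for the example.
catalog = CatalogSearch()
catalog.add(CatalogEntry("customer_revenue_daily", "table",
                         "Daily revenue per customer", frozenset({"finance"})))
catalog.add(CatalogEntry("promo_pricing", "table",
                         "Pricing and promotions by product",
                         frozenset({"finance", "marketing"})))
catalog.add(CatalogEntry("revenue_dashboard", "dashboard",
                         "Revenue overview", frozenset({"exec"})))

results = catalog.search("revenue custom*", ["finance"])
```

In this sketch the access check runs after scoring, so a user only ever sees entries their groups are allowed to view. A production service would typically also handle continuous change (re-indexing as sources evolve) and use richer ranking signals, neither of which is modeled here.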
