Chapter 13. Template-Driven Extraction
Template-Driven Extraction (TDE), new in MarkLogic 9, lets us put values directly into the row or triple index without modifying the actual document structure. We define a template that tells MarkLogic where to look for values and how to record them. The template itself can contain transformation code, so values are transformed as they are extracted rather than by changing the documents.
One of the tricks to using TDE well is good use of collections. Each input source should go in its own collection, which lets us write templates that target each collection individually. As with other strategies for harmonizing data from different sources, the goal is to simplify queries by writing a small amount of code that maps multiple input formats to a single query format.
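The idea of one template per collection, all feeding a single query format, can be illustrated outside MarkLogic. The following is only a conceptual sketch in Python, not TDE syntax: the collection names (`source-a`, `source-b`), the field names, and the `extract_rows` helper are all hypothetical, invented for illustration.

```python
from datetime import date

# Hypothetical per-collection "templates". Each one maps a source-specific
# document shape onto the same row shape. All names here are invented.
TEMPLATES = {
    # source-a stores the date as an ISO string
    "source-a": lambda doc: {"order_date": date.fromisoformat(doc["orderDate"])},
    # source-b stores the date as separate year/month/day numbers
    "source-b": lambda doc: {
        "order_date": date(doc["placed"]["y"], doc["placed"]["m"], doc["placed"]["d"])
    },
}


def extract_rows(documents):
    """Return one harmonized row per document; the documents are not modified."""
    rows = []
    for doc in documents:
        template = TEMPLATES.get(doc["collection"])
        if template is None:
            continue  # no template targets this collection
        rows.append(template(doc["content"]))
    return rows


documents = [
    {"collection": "source-a", "content": {"orderDate": "2017-06-21"}},
    {"collection": "source-b", "content": {"placed": {"y": 2017, "m": 6, "d": 22}}},
    {"collection": "unmapped", "content": {"when": "2017-06-23"}},
]
rows = extract_rows(documents)
```

Both mapped sources end up with the same `order_date` field, so a query only needs to know one format. Supporting a further source would mean adding one more entry to `TEMPLATES`, which mirrors adding or updating a template rather than rewriting documents.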
TDE also supports doing data source harmonization in an agile way. When we need another piece of information in the index, we can simply update the template. While this still requires reindexing, keeping the original documents intact simplifies data governance.
Searching on Derived Data
Problem
You want to search on derived data, such as day of the week (Monday, Tuesday, …), but your data only has a date (2017-06-21).
Solution
Applies to MarkLogic versions 9 and higher
The solution is to use Template-Driven Extraction to put the derived information directly into the row or triple index. With that in mind, there are two parts to the solution: the template and the actual search.
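To make the two parts concrete, here is a minimal Python sketch of the underlying idea: derive the day of the week from a date at extraction time, store it alongside the document's identifier, and then search on the derived value. It is not MarkLogic code and does not use TDE syntax; the document URIs, the `date` field name, and the helper functions are assumptions made up for this illustration.

```python
from datetime import date

# date.weekday() returns 0 for Monday through 6 for Sunday.
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def day_of_week(iso_date):
    """Derive the day name from an ISO date string such as '2017-06-21'."""
    return DAY_NAMES[date.fromisoformat(iso_date).weekday()]


def build_rows(documents):
    """Part one (the 'template'): extract a row with the derived value per document."""
    return [
        {"uri": uri, "date": doc["date"], "day": day_of_week(doc["date"])}
        for uri, doc in documents.items()
    ]


def search_by_day(rows, day):
    """Part two (the search): find documents by the derived day of the week."""
    return [row["uri"] for row in rows if row["day"] == day]


# Hypothetical sample documents; they only contain a date, never a day name.
documents = {
    "/orders/1.json": {"date": "2017-06-21"},
    "/orders/2.json": {"date": "2017-06-22"},
    "/orders/3.json": {"date": "2017-06-28"},
}
rows = build_rows(documents)
wednesdays = search_by_day(rows, "Wednesday")
```

Here `day_of_week("2017-06-21")` gives `"Wednesday"`, and the search returns `/orders/1.json` and `/orders/3.json` even though neither document stores a day name. The source documents stay untouched; only the extracted rows carry the derived value.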
My sample documents ...