Chapter 11. Input Transformations
Data modeling is an iterative process with MarkLogic: we load data as is and then revise it over time. Sooner or later, we’ll need to modify the structure of the data. The recipes in this chapter show how to change documents, using tools like MarkLogic Content Pump (MLCP), REST API transforms, or CORB2, an open source bulk processing tool.
Changing Date Format
Problem
You’re loading documents, but the data have dates in a nonstandard format. Each document has multiple dates. You want to fix them during ingest.
Solution
Applies to MarkLogic versions 8 and higher
Here’s the code for an MLCP transform that will fix dates in a specified list of JSON properties in newly ingested JSON documents. This code can also be used with REST input transforms or CORB2 jobs.
Save this content in dateTransform.sjs. Add it to your modules database, as described in the documentation.
// Recurse through a JSON document, applying
// function f to any JSON property whose property
// name is in the array keys.
function applyToProperty(obj, keys, f) {
  for (var i in obj) {
    if (!obj.hasOwnProperty(i)) {
      continue;
    } else if (typeof obj[i] === 'object') {
      applyToProperty(obj[i], keys, f);
    } else if (keys.indexOf(i) !== -1) {
      obj[i] = f.call(this, obj[i]);
    }
  }
}

function fixDate(value) {
  return new Date(value).toISOString().substring(0, 10);
}

// This is the MLCP transform function. Fix any date with the
// property name(s) specified by context.transform_param. ...
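The two helper functions can be tried outside MarkLogic, for example in Node.js, before they are wired into a transform. The following is a minimal sketch rather than part of the recipe: the sample document, its property names, and the key list are hypothetical, and the helpers are repeated only so the block runs on its own.

```javascript
// Helpers repeated from the recipe so this sketch is self-contained.
function applyToProperty(obj, keys, f) {
  for (var i in obj) {
    if (!obj.hasOwnProperty(i)) {
      continue;
    } else if (typeof obj[i] === 'object') {
      // Nested objects and arrays are walked recursively.
      applyToProperty(obj[i], keys, f);
    } else if (keys.indexOf(i) !== -1) {
      obj[i] = f.call(this, obj[i]);
    }
  }
}

function fixDate(value) {
  // Keep only the YYYY-MM-DD part of the ISO 8601 string (UTC).
  return new Date(value).toISOString().substring(0, 10);
}

// Hypothetical sample document; the property names are illustrative only.
// The dates carry an explicit GMT zone so the result does not depend on
// the time zone of the machine running the code.
const doc = {
  title: 'Order 1',
  created: 'Tue, 15 Mar 2016 12:00:00 GMT',
  shipping: {
    shipped: 'Fri, 18 Mar 2016 12:00:00 GMT',
    carrier: 'ACME'
  }
};

// Rewrite only the properties named in the key list, at any depth.
applyToProperty(doc, ['created', 'shipped'], fixDate);

console.log(JSON.stringify(doc, null, 2));
```

Two caveats apply to fixDate as written. A date string without an explicit time zone is typically parsed as local time while toISOString() reports UTC, so the resulting day can shift by one depending on the server's zone. A value that cannot be parsed produces an invalid Date, and toISOString() then throws a RangeError, so it may be worth validating input before the transform runs on a large load.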