Chapter 14. Redaction

MarkLogic’s redaction feature hides information during export. A major benefit is that QA and development teams can work with data that closely resembles production, with private data removed or obscured.

MarkLogic 9 ships with several built-in redaction functions; the recipes in this chapter explore some cases not covered by those.

Redacting Credit Card Numbers, Replacing with Digits

Problem

You want to use production data for development and testing, but the data contains credit card numbers. You need to redact these values before copying the data to other environments, while making sure that their form (number of digits) remains the same.

Solution

Applies to MarkLogic versions 9 and higher

For sample data, I whipped up some JSON documents that looked like this:

{
  "name": "Charlie",
  "ccNum": "6011759130364395557"
}

As I loaded sample documents, I put them into a collection called “people.”

The goal is to replace each ccNum with a string of random digits that has the same length as the original value.
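To make that goal concrete, here is a minimal sketch in plain JavaScript, with no MarkLogic APIs involved. The names randomDigits and redactCcNum are hypothetical and exist only for illustration. The sketch builds the replacement one digit at a time using Math.random, which is fine for test data but is not a cryptographically secure source of randomness.

```javascript
'use strict';

// Hypothetical helper (illustration only): returns a string of
// n random decimal digits, built one digit per iteration.
function randomDigits(n) {
  let out = '';
  for (let i = 0; i < n; i++) {
    // Math.floor(Math.random() * 10) is an integer from 0 to 9
    out += Math.floor(Math.random() * 10);
  }
  return out;
}

// Hypothetical wrapper: replace a card number with random digits
// of the same length, so the form of the value is preserved.
function redactCcNum(ccNum) {
  return randomDigits(ccNum.length);
}

const redacted = redactCcNum('6011759130364395557');
console.log(redacted.length); // 19, the same length as the input
```

Building the string digit by digit keeps leading and trailing zeros, so the result always has exactly the requested length.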

None of the built-in redaction functions did exactly what I wanted, so I wrote a custom one. Here’s the function that implements the redaction:

'use strict';

const MAX =
  Number.MAX_VALUE.toString()
    .replace(/\d\.(\d+)e\+\d+/, "$1").length;

// Generate a string of random digits with the
// specified length
function generateNDigitStr(n) {
  // because .toString will drop trailing zeros:
  let padding = '0'.repeat(MAX);

  let more = n; // how many more characters do ...
