[MarkLogic Dev General] Suggestions for data masking

Geert Josten
Mon Mar 23 23:00:07 PDT 2015

Hi Joel,

I haven't dealt with this personally, but I could ask around. I suspect
there are numerous ways to go about this, depending on the exact
needs. The two that come to mind first:

You could create a permanent solution using Flexible Replication, which
builds on top of CPF:

You could also use the MLCP copy feature together with an MLCP transform.

You already mentioned triggers and scheduled tasks, but I think MLCP will
load faster. CPF uses triggers underneath.

Kind regards,

On 3/24/15, "Joel Wilson Gunasekaran"
<joelwilson.gunasekaran at gmail.com> wrote:

>Once in a while, we refresh the dataset in lower environments with production
>data for testing purposes.
>We have a requirement to mask all PII (personally identifiable
>information) data like email ID, phone number, etc. in lower environments
>like DEV, QA.
>We were thinking about having a one-time script that does the masking,
>which can be run when we do the data refresh.
>In addition to this, we also want an automated process that does this,
>like either a scheduled task or a trigger, to avoid any sensitive data
>being left unmasked accidentally.
>Can you please let me know if you have had to deal with similar cases and
>any suggestions?
