Walking Among the JSON Trees

by Paxton Hare

Many of us know how to recursively transform an XML structure using the typeswitch expression. In fact, the MarkLogic Docs have a great example. There is even a blog post about it by Dave Cassel. But what if you want to recursively transform JSON? I set out to write a reusable function to do just that. Allow me to "walk" you through it.

 

JSON Object Model

First let's review some basics. Understanding the JSON object model is crucial to our understanding of how to transform JSON. The MarkLogic Docs explain it quite well. It boils down to a handful of node types for JSON: document-node, object-node, array-node, number-node, boolean-node, null-node, and text. I encourage anyone working with JSON to read the Working With JSON chapter for a full understanding of the concepts. These node types will allow us to use the typeswitch expression to recursively walk our JSON objects.
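To make the node types concrete, here is a small sketch, assuming MarkLogic 8 or later. The sample object and its property names are made up for illustration; it is built with the JSON node constructors and each child is listed with its name and node kind.

```xquery
xquery version "1.0-ml";

(: Hypothetical sample object, built with MarkLogic's JSON node constructors. :)
let $tree := object-node {
  "name" : "oak",
  "height" : 12,
  "evergreen" : fn:false(),
  "owner" : null-node { },
  "branches" : array-node { "north", "south" }
}
(: List each child of the object as "property name: node kind". :)
for $n in $tree/node()
return fn:concat(fn:string(fn:node-name($n)), ": ", xdmp:node-kind($n))
```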

 

Typeswitch Expression

From the XQuery 3.0 W3C spec: "The typeswitch expression chooses one of several expressions to evaluate based on the dynamic type of an input value." A super barebones typeswitch for JSON would look like this.
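A minimal sketch of such a typeswitch, assuming MarkLogic 8 or later. The function name local:describe and the strings it returns are placeholders chosen for illustration, not part of any library.

```xquery
xquery version "1.0-ml";

(: Hypothetical helper: one case per JSON node type, each returning a label. :)
declare function local:describe($n as node()) as xs:string
{
  typeswitch ($n)
    case document-node() return "document"
    case object-node() return "object"
    case array-node() return "array"
    case number-node() return "number"
    case boolean-node() return "boolean"
    case null-node() return "null"
    case text() return "text"
    default return "something else"
};

(: Try it on a constructed array node. :)
local:describe(array-node { 1, 2 })
```

In a real transform each case would return a rebuilt node or value rather than a label.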

Given a node $n, the typeswitch will evaluate the expression in the case clause matching the node's type.

Now Add Some Recursion

Let's add some recursion to the mix.
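What follows is a sketch of a recursive walker, assuming MarkLogic 8 or later; it is one possible shape for such a function rather than the exact code of the library. The names local:walk and local:emit are hypothetical, as is the sample JSON. When $o is present we are inside an object and values are stored under the node's name; when it is absent (document root or array items) values are simply returned.

```xquery
xquery version "1.0-ml";

(: Hypothetical helper: inside an object, store $value under the name of $n
   and return the object; otherwise just return the value. :)
declare function local:emit($n as node(), $value as item(), $o as json:object?) as item()
{
  if (fn:exists($o)) then
    (map:put($o, fn:string(fn:node-name($n)), $value), $o)
  else
    $value
};

declare function local:walk($nodes as node()*, $o as json:object?) as item()*
{
  for $n in $nodes
  return
    typeswitch ($n)
      (: pass the document's child back into the walker :)
      case document-node() return
        local:walk($n/node(), ())
      (: build a fresh object and fill it from the children :)
      case object-node() return
        let $oo := json:object()
        let $_ := local:walk($n/node(), $oo)
        return local:emit($n, $oo, $o)
      (: build a fresh array from the walked items :)
      case array-node() return
        local:emit($n, json:to-array(local:walk($n/node(), ())), $o)
      (: leaf values are added as they are :)
      case number-node() return local:emit($n, $n, $o)
      case boolean-node() return local:emit($n, $n, $o)
      case null-node() return local:emit($n, $n, $o)
      case text() return local:emit($n, $n, $o)
      (: failsafe :)
      default return $n
};

let $doc := xdmp:to-json(xdmp:from-json-string(
  '{"name": "oak", "branches": [1, {"leaf": true}], "owner": null}'))
return xdmp:to-json(local:walk($doc, ()))
```

Array items are walked with an empty $o because in MarkLogic they carry the name of the array's property, so they must not be stored by name.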

So what exactly is happening here?

We iterate over the node or nodes coming in.

We then use a typeswitch to branch to code that handles each type of JSON node.

For document-node we simply pass the first child back to the function.

For object-node we construct a new object and then recursively add the key/value pairs back into it.

For array-node we create a new array and add back the recursively transformed values.

For the rest of the value types we can do the exact same thing: just add the value into our object.

And then we have our failsafe. It simply returns the node.

A few tricky parts you might notice...

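The expression that adds a value to the object, written as a sequence of a map:put call followed by $o, is tricky for two reasons. Number one is that the json:object() function is really returning a map, so we use the map:* functions to operate on it. The second reason is the way it is written as a sequence with $o. The map:put call returns the empty sequence. Sequences collapse in XQuery and thus (with $key and $value standing in for the actual arguments):

(map:put($o, $key, $value), $o)

is equivalent to

((), $o)

is equivalent to

($o)

is equivalent to

$o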

So this code is effectively returning $o after first adding a value to the map.
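The collapsing behavior is easy to check in isolation. The following is a small sketch, assuming MarkLogic 8 or later; the key "species" and the value "oak" are arbitrary.

```xquery
xquery version "1.0-ml";

let $o := json:object()
(: map:put contributes nothing to the sequence, so only $o remains :)
let $result := (map:put($o, "species", "oak"), $o)
return (
  fn:count($result),            (: number of items left in the sequence :)
  map:get($result, "species")   (: the value that map:put stored :)
)
```

The count shows that the sequence holds a single item, and the map:get call shows that the item is the map that was just updated.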

The Visitor

At this point we have a function that walks the tree and simply makes a copy of the node. But how do we make changes? We can use the Visitor Pattern to call into a function to make the changes. This will let us isolate our alteration code from our tree walking code. Here is the code with the visitor pattern in place.
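Below is a sketch of how a visitor could be wired into the walker, assuming MarkLogic 8 or later. It illustrates the pattern and is not the exact code of the library: the names local:walk, local:emit, and local:copy-visitor, the three-argument visitor signature, and the sample JSON are all assumptions.

```xquery
xquery version "1.0-ml";

(: Hypothetical helper: read the possibly edited key and value back out of
   the visitor's map and store or return them. :)
declare function local:emit($o as json:object?, $output as map:map) as item()*
{
  let $key := map:get($output, "key")
  let $value := map:get($output, "value")
  return
    if (fn:exists($o)) then (map:put($o, $key, $value), $o)
    else $value
};

declare function local:walk(
  $nodes as node()*,
  $o as json:object?,
  $visitor as function(*)
) as item()*
{
  (: Closure: build the map, hand it to the user-supplied visitor, return it. :)
  let $call-visitor := function($key as xs:string?, $value as item()) as map:map {
    let $output := map:map()
    let $_ := (map:put($output, "key", $key), map:put($output, "value", $value))
    let $_ := $visitor($key, $value, $output)
    return $output
  }
  for $n in $nodes
  (: only nodes inside an object are stored by name :)
  let $name := if (fn:exists($o)) then fn:string(fn:node-name($n)) else ()
  return
    typeswitch ($n)
      case document-node() return
        local:walk($n/node(), (), $visitor)
      case object-node() return
        let $oo := json:object()
        let $_ := local:walk($n/node(), $oo, $visitor)
        return local:emit($o, $call-visitor($name, $oo))
      case array-node() return
        let $aa := json:to-array(local:walk($n/node(), (), $visitor))
        return local:emit($o, $call-visitor($name, $aa))
      case number-node() return local:emit($o, $call-visitor($name, $n))
      case boolean-node() return local:emit($o, $call-visitor($name, $n))
      case null-node() return local:emit($o, $call-visitor($name, $n))
      case text() return local:emit($o, $call-visitor($name, $n))
      default return $n
};

(: A visitor that changes nothing: it returns () and leaves $output alone. :)
declare function local:copy-visitor(
  $key as xs:string?, $value as item(), $output as map:map
) as empty-sequence()
{
  ()
};

let $doc := xdmp:to-json(xdmp:from-json-string('{"name": "oak", "rings": [1, 2]}'))
return xdmp:to-json(local:walk($doc, (), local:copy-visitor#3))
```

Keeping the map handling inside the closure means the visitor only ever sees a key, a value, and the map it may edit.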

What's going on with this new version?

This version adds the $call-visitor closure. $call-visitor is simply a convenience function that builds a map and calls the user-supplied visitor function.

For each node type we have added a call to $call-visitor. This gives the user-supplied visitor function a chance to alter the key or value in the map. We then assign the values from the map into $key and $value and use them just like we did before.

The example visitor function above is returning () and thus is merely making a copy of the JSON node. To transform the JSON node you would want to edit the "key" and "value" entries in the $output map.

Here is an example usage that modifies the JSON during the walk.
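As an illustration of a visitor that changes things, the sketch below assumes a three-argument visitor signature (key, value, output map), which is an assumption rather than the confirmed signature of the library. The walker is not repeated; the hypothetical visitor is exercised directly against a hand-built $output map so that its effect on the "key" and "value" entries is visible.

```xquery
xquery version "1.0-ml";

(: Hypothetical visitor: upper-cases every key and doubles numeric values. :)
declare function local:visitor(
  $key as xs:string?, $value as item(), $output as map:map
) as empty-sequence()
{
  (
    if (fn:exists($key))
    then map:put($output, "key", fn:upper-case($key))
    else (),
    if ($value instance of number-node())
    then map:put($output, "value", fn:data($value) * 2)
    else ()
  )
};

(: Stand-in for what the walker would pass for a property "height" : 12. :)
let $value := number-node { 12 }
let $output := map:map()
let $_ := (map:put($output, "key", "height"), map:put($output, "value", $value))
let $_ := local:visitor("height", $value, $output)
return (map:get($output, "key"), map:get($output, "value"))
```

During a real walk the same function would be passed to the walker, which would call it once per node and read the edited entries back out of the map.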

Where Do I Get This in a Usable Form?

If you want to play around with this in QConsole, you can grab a QConsole Workspace file here. Simply open up QConsole and choose WorkSpace => Import. Then browse to the json-tree-walker.xml file that you downloaded.

I also created a GitHub project containing this library. Alternatively, you can use Joe Bryan's awesome mlpm utility to install the json-tree-walker module into your code.
