Problem

Consider the two JSON documents below:

{
    "parties": [
        {
            "id": 0,
            "role": "CN",
            "name": "name1"
        },
        {
            "id": 1,
            "role": "SH",
            "name": "name2"
        }
    ]
}
{
    "parties": [
        {
            "id": 0,
            "role": "CN",
            "name": "name2"
        },
        {
            "id": 1,
            "role": "SH",
            "name": "name1"
        }
    ]
}

Notice the anonymous “party” objects in the parties array? Is this unusual for a JSON structure? Definitely not. It is very common to have arrays of anonymous objects in JSON documents. However when querying such documents in MarkLogic, some special care is needed to account for the scoping rules.

Consider this question:

“Find the documents where name is ‘name1’ AND role is ‘CN’ for within the same anonymous object?”

We want the name and role evaluated at the same object level, we expect the answer of 1.json, but because of the anonymous objects, getting the right result is tricky as we will see.

A cts.query matching name = ‘name1’ and role = ‘CN’ will return both the documents, no matter whether the query is filtered or unfiltered, because MarkLogic’s term and range indexes have no object name to associate with the key/value pairs inside the object. Thus, MarkLogic indexes key/value pairs as if they did not exist in objects.

cts.uris("",null,cts.andQuery(
                              [
                               cts.jsonPropertyValueQuery("name","name1"),
                               cts.jsonPropertyValueQuery("role","CN" )
                              ]
                             )
        )

==> 1.json
    2.json

Solution

Applies to MarkLogic versions 9.0+

How can we formulate a query that returns the correct answer? There are two preferred options in MarkLogic that leverage its indexing capabilities for fast, accurate results.

Option 1: Enable Position Indexes

Position indexes add details to where a word is found inside the document. This information can be used by the query engine to efficiently execute positional based queries, which the anonymous object requires.

Enable ‘element word positions’ and ‘element value positions’ using Admin Console. By default, these are off to save space, but they help in cases like this. An added benefit will be faster evaluation of “NEAR” queries as well.

 

Use the jsonPropertyScopeQuery() to force object scoping.

var query1 = cts.jsonPropertyScopeQuery("parties",
cts.andQuery([cts.jsonPropertyValueQuery('name','name1'),
cts.jsonPropertyValueQuery('role','CN')]))

The impact of turning off/on the above configurations can be observed in the query plan using cts.plan.

var query1 = cts.jsonPropertyScopeQuery("parties",
cts.andQuery([cts.jsonPropertyValueQuery('name','name1'),
cts.jsonPropertyValueQuery('role','CN')]))
cts.uris("", null, query1)

==> 1.json

Note: Enabling the above position indexes increases the database size. However, the queries will be faster and accurate. The results are determined using the indexes alone. You can check this by running the query in filtered and unfiltered modes.

Option 2: Use Optic API

You can unwind the anonymous array using a TDE template as shown below.

'use strict';
declareUpdate();
var tde = require("/MarkLogic/tde.xqy");
var t1= xdmp.toJSON(
{
        "template": {
            "description": "Extract parties Information",
            "context": "/parties/name",
            "rows": [
                {
                    "schemaName": "party",
                    "viewName": "partyRole",
                    "viewLayout": "sparse",
                    "columns": [
                      {
                            "name": "uri",
                            "scalarType": "string",
                            "val":"xdmp:node-uri(.)", 
                            "nullable": false,
                            "default": "'NA'",
                            "invalidValues": "ignore"
                        }, 
                        {
                            "name": "name",
                            "scalarType": "string",
                            "val":"../name", 
                            "nullable": false,
                            "default": "'NA'",
                            "invalidValues": "ignore"
                        },    
                        {
                            "name": "role",
                            "scalarType": "string",
                            "val": "../role",
                            "nullable": false,
                            "default": "'NA'",
                            "invalidValues": "ignore"
                        }
                    ]
                }
            ]
        }
    }
);
tde.templateInsert("party-view.json", t1)

With this template in place, MarkLogic will build a table with the URI, name, and role for each of the anonymous objects. This table is then easily queried with the Optic API query shown below.

const op = require('/MarkLogic/optic');
op.fromSQL("select uri as doc from party.partyRole where name = 'name1' AND role='CN'").result()

{"doc":"1.json"}

Required Privileges: None

Required Indexes:

  • element word positions
  • element value positions

Discussion

Many enterprises will have a domain data model where a ‘type’ attribute dictates the actual behavior of the entity. Few examples would be:

  • employeeType = ‘permenant’ /‘temporary’ in a Employee entity
  • partyType = ‘customer’, ‘prospect’, ‘ex-customer’ in a Party entity
  • partyRole = ‘Consignee’ or ‘Shipper’ in a Shipment entity

It is also possible that there are multiple of such entities in the model. For example, a Shipment entity will have a ‘Consignee’ as well as a ‘Shipper’.

When such data models are serialised into JSON documents, these entities will come up as ‘anonymous array of objects’. Querying documents with a specific condition on attributes in anonymous objects is very common. (Ex. ”Find the all ‘permenant’ employees’ in the system”). Now you know how to query such documents in MarkLogic.

This website uses cookies.

By continuing to use this website you are giving consent to cookies being used in accordance with the MarkLogic Privacy Statement.