Using an Aggregation Function via REST

David Cassel
Last updated October 24, 2012

In a recent article, I showed how to build a simple Aggregation function. That’s a good start, but the next step is figuring out what we can do with it. At some point, we’ll probably want to display the values from the function to the user, maybe as part of an analytics widget. We’ll see how to use the REST API to get the values.

It’s worth noting that while I’m exploring my user-defined aggregation function, MarkLogic has a bunch of built-in aggregate functions, too. The procedure to use those is pretty similar, except that you don’t have to specify the aggregatePath request parameter or the udf attribute that you’ll see below.

Setup

The Aggregation function

The first bit of setup is to build and load the custom function. You’ll find instructions for that in my earlier post.

Data and an Index

There are a couple of ways to get what we want, but either way we’ll need some data. I’m going to use an example where I have some xs:date data in an element called posted-date. If you don’t have more interesting data at hand, you can use this to dummy up something good enough for this exercise (create a new database or use the Documents database, and make sure you have the Query Type dropdown in Query Console set to XQuery):

for $i in (1 to 1000)
return
  xdmp:document-insert(
    "/content/" || $i || ".xml",
    <doc>
      <data>{$i}</data>
      <posted-date>{ fn:current-date() - xs:dayTimeDuration("P"||xdmp:random(30)||"D")}</posted-date>
    </doc>
  )

That will produce 1000 documents with posted-date values scattered over roughly the last 30 days.

Add an element range index

Make sure you add an element range index on the posted-date element. To do this, log in to your Admin Interface (http://localhost:8001) and select Configure -> Databases -> <your database> -> Element Range Indexes. When adding the range index, use type = date, no namespace (for this example), and localname = posted-date. Leave the other values at their defaults.

REST App Server

You have data and an index; now we need a REST app server to get to it. There are many ways to set up a REST API instance. Here we are going to call the REST API from Query Console via XQuery. Open a new tab in your Query Console and run the following XQuery code:

xquery version '1.0-ml';
let $config := xdmp:quote(<rest-api xmlns="http://marklogic.com/rest-api">
  <name>DMC-Learn</name>
  <database>Documents</database>
  <port>8003</port>
</rest-api>)
let $options := 
<options xmlns="xdmp:http">
  <headers>
    <content-type>application/xml</content-type>
  </headers>
  <authentication>
   <username>admin</username>
   <password>admin</password>
  </authentication>
  <data>{$config}</data>
</options>
return xdmp:http-post('http://localhost:8002/v1/rest-apis', $options);

For the rest of this post, I’ll assume that you built the app server on port 8003. If you already have an application running on port 8003, change the port in the code above.

Search Options

The REST API is configurable so that we can get back what we want. Search results are controlled by specifying search options. Our next step is to create some search options that will work on the posted-date range index. We’ll start with the simplest options and then tweak them.

<options xmlns="http://marklogic.com/appservices/search">
  <values name="posted-date">
    <range type="xs:date" facet="false">
      <element ns="" name="posted-date"/>
    </range>
  </values>
</options>

These options tell MarkLogic that we want to get values from the posted-date index, but don’t yet make use of the aggregation function. Now we need to tell the app server about these options. We can register the options using the REST API itself, by using an HTTP PUT method. Open a new Query Console tab and run the following XQuery code:

xquery version '1.0-ml';
let $data := xdmp:quote(<options xmlns="http://marklogic.com/appservices/search">
  <values name="posted-date">
    <range type="xs:date" facet="false">
      <element ns="" name="posted-date"/>
    </range>
  </values>
</options>)
let $options := 
<options xmlns="xdmp:http">
  <headers>
    <content-type>application/xml</content-type>
  </headers>
  <authentication>
   <username>admin</username>
   <password>admin</password>
  </authentication>
  <data>{$data}</data>
</options>
return xdmp:http-put('http://localhost:8003/v1/config/query/dow-options', $options);

Change the username and password as needed. If you use something other than the admin user, you’ll need a user with at least the rest-writer role. This sends the serialized options held in $data to the REST app server with an HTTP PUT. Note that the last segment of the URI (dow-options) is the name under which the new options are stored.
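If the options need to be registered from outside Query Console, the same PUT can be issued by any HTTP client. Here is a minimal Python sketch of that idea, not the method used in this post. It assumes the app server on port 8003 uses digest authentication, and the admin/admin credentials are the placeholders from the XQuery above. The helper names build_options_request and make_digest_opener are made up for illustration.

```python
import urllib.request

# Same options document as in the XQuery example above.
OPTIONS_XML = """<options xmlns="http://marklogic.com/appservices/search">
  <values name="posted-date">
    <range type="xs:date" facet="false">
      <element ns="" name="posted-date"/>
    </range>
  </values>
</options>"""


def build_options_request(host, port, name, options_xml):
    """Build (but do not send) the PUT request that registers query options."""
    url = f"http://{host}:{port}/v1/config/query/{name}"
    return urllib.request.Request(
        url,
        data=options_xml.encode("utf-8"),
        headers={"Content-Type": "application/xml"},
        method="PUT",
    )


def make_digest_opener(base_url, username, password):
    """Return an opener that answers digest challenges (assumed auth scheme)."""
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    password_mgr.add_password(None, base_url, username, password)
    return urllib.request.build_opener(
        urllib.request.HTTPDigestAuthHandler(password_mgr)
    )


request = build_options_request("localhost", 8003, "dow-options", OPTIONS_XML)

# To send it against a running server (placeholder credentials):
# opener = make_digest_opener("http://localhost:8003/", "admin", "admin")
# opener.open(request)
```

Building the request separately from sending it makes it easy to inspect the method, URL, and headers before anything touches the server.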

Getting Values Via REST

At this point, we have enough that we can use the REST API to get the posted-date values. This link:

http://localhost:8003/v1/values/posted-date?options=dow-options

shows all values.

<values-response name="posted-date" type="xs:date"
  xmlns="http://marklogic.com/appservices/search" 
  xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <distinct-value frequency="2">2017-05-09</distinct-value>
  <distinct-value frequency="1">2017-05-10</distinct-value>
  <distinct-value frequency="1">2017-05-12</distinct-value>
  ...
</values-response>
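A client that wants to chart these values needs them as plain data. The following Python sketch shows one way to turn a response like the one above into a mapping from date to frequency. It is an illustration only: the function name parse_distinct_values is hypothetical, and the sample is a shortened copy of the response shown here.

```python
import xml.etree.ElementTree as ET

SEARCH_NS = "http://marklogic.com/appservices/search"


def parse_distinct_values(response_xml):
    """Map each distinct value to its frequency, in document order."""
    root = ET.fromstring(response_xml)
    return {
        node.text: int(node.get("frequency"))
        for node in root.iter(f"{{{SEARCH_NS}}}distinct-value")
    }


sample = """<values-response name="posted-date" type="xs:date"
  xmlns="http://marklogic.com/appservices/search"
  xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <distinct-value frequency="2">2017-05-09</distinct-value>
  <distinct-value frequency="1">2017-05-10</distinct-value>
  <distinct-value frequency="1">2017-05-12</distinct-value>
</values-response>"""

print(parse_distinct_values(sample))
# {'2017-05-09': 2, '2017-05-10': 1, '2017-05-12': 1}
```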

However, we aren’t calling the aggregation function yet. To do that, we need to change the request a bit.

Using the Aggregation Function

Now that we have a values option set up, there are two ways to apply an aggregation function to it.

Using Request Parameters

We can choose to use an aggregation function on a call-by-call basis by changing the request parameters:

http://localhost:8003/v1/values/posted-date?options=dow-options&aggregate=day-of-week&aggregatePath=native/day-of-week

Now in addition to a list of the values, we get the day-of-week function’s results:

<aggregate-result name="day-of-week">
  <map:map xmlns:map="http://marklogic.com/xdmp/map" 
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <map:entry key="0">
      <map:value xsi:type="xs:unsignedLong">8470</map:value>
    </map:entry>
    <map:entry key="5">
      <map:value xsi:type="xs:unsignedLong">4574</map:value>
    </map:entry>
    <map:entry key="3">
      <map:value xsi:type="xs:unsignedLong">9304</map:value>
    </map:entry>
    <map:entry key="6">
      <map:value xsi:type="xs:unsignedLong">8482</map:value>
    </map:entry>
    <map:entry key="1">
      <map:value xsi:type="xs:unsignedLong">6736</map:value>
    </map:entry>
    <map:entry key="2">
      <map:value xsi:type="xs:unsignedLong">2764</map:value>
    </map:entry>
    <map:entry key="4">
      <map:value xsi:type="xs:unsignedLong">8713</map:value>
    </map:entry>
  </map:map>
</aggregate-result>
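Notice that the map entries come back in no particular order, so a widget will probably want to sort them by key before displaying them. As a rough Python sketch (the name parse_day_counts is hypothetical, and the sample is a shortened copy of the result above), the map can be flattened into a dictionary ordered by key. What each key means is up to the day-of-week function itself, so the sketch keeps the keys as plain integers.

```python
import xml.etree.ElementTree as ET

MAP_NS = "http://marklogic.com/xdmp/map"


def parse_day_counts(xml_text):
    """Return {key: count} from the serialized map, sorted by key."""
    root = ET.fromstring(xml_text)
    counts = {}
    for entry in root.iter(f"{{{MAP_NS}}}entry"):
        value = entry.find(f"{{{MAP_NS}}}value")
        counts[int(entry.get("key"))] = int(value.text)
    return dict(sorted(counts.items()))


sample = """<aggregate-result name="day-of-week">
  <map:map xmlns:map="http://marklogic.com/xdmp/map"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <map:entry key="0">
      <map:value xsi:type="xs:unsignedLong">8470</map:value>
    </map:entry>
    <map:entry key="5">
      <map:value xsi:type="xs:unsignedLong">4574</map:value>
    </map:entry>
    <map:entry key="3">
      <map:value xsi:type="xs:unsignedLong">9304</map:value>
    </map:entry>
  </map:map>
</aggregate-result>"""

print(parse_day_counts(sample))
# {0: 8470, 3: 9304, 5: 4574}
```

Because the parser looks up elements by the map namespace only, it should work the same whether it is handed this fragment or the full values response that wraps it.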

Search Options for Aggregation

We can also set up search options so that we always use the aggregation function. We’ll change the options that we set up above so that our function gets called.

<options xmlns="http://marklogic.com/appservices/search">
  <values name="posted-date">
    <range type="xs:date" facet="false">
      <element ns="" name="posted-date"/>
    </range>
    <aggregate apply="day-of-week" udf="native/day-of-week" />
  </values>
</options>

We tell MarkLogic about the revised options the same way we told it in the first place: a PUT request to the same URI, which replaces the options stored there.

xquery version '1.0-ml';
let $data := xdmp:quote(<options xmlns="http://marklogic.com/appservices/search">
  <values name="posted-date">
    <range type="xs:date" facet="false">
      <element ns="" name="posted-date"/>
    </range>
    <aggregate apply="day-of-week" udf="native/day-of-week" />
  </values>
</options>)
let $options := 
<options xmlns="xdmp:http">
  <headers>
    <content-type>application/xml</content-type>
  </headers>
  <authentication>
   <username>admin</username>
   <password>admin</password>
  </authentication>
  <data>{$data}</data>
</options>
return xdmp:http-put('http://localhost:8003/v1/config/query/dow-options', $options);

Values with Aggregation

Now we can make the same call as we did above, but in addition to the values, we’ll also get the aggregation function results.

http://localhost:8003/v1/values/posted-date?options=dow-options

Aggregation without the Values

You may want to get just the results of the aggregation function without the full list of values. The REST API supports that with the view parameter. Specifying “view=aggregate” skips the full listing of the values.

http://localhost:8003/v1/values/posted-date?options=dow-options&aggregate=day-of-week&aggregatePath=native/day-of-week&view=aggregate
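The request URLs in this post differ only in their query parameters, so a client can assemble them from one small helper. This Python sketch is one way to do it. The function name values_url and its keyword arguments are made up for illustration; only the parameter names options, aggregate, aggregatePath, and view come from the URLs above.

```python
from urllib.parse import urlencode


def values_url(name, options, aggregate=None, aggregate_path=None,
               view=None, base="http://localhost:8003"):
    """Build a /v1/values URL, adding only the parameters that are set."""
    params = {"options": options}
    if aggregate:
        params["aggregate"] = aggregate
    if aggregate_path:
        params["aggregatePath"] = aggregate_path
    if view:
        params["view"] = view
    # safe="/" keeps the slash in native/day-of-week readable in the URL.
    return f"{base}/v1/values/{name}?{urlencode(params, safe='/')}"


print(values_url("posted-date", "dow-options"))
print(values_url("posted-date", "dow-options",
                 aggregate="day-of-week",
                 aggregate_path="native/day-of-week",
                 view="aggregate"))
```

The first call reproduces the plain values URL and the second reproduces the aggregate-only URL shown just above.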

Well, that’s it for this instalment of exploring the REST API.

This tutorial first appeared as a post on David's blog. Thanks to Dave for letting us republish it here!
