The revolutionary Optic API blends the relational world with rich NoSQL document features by providing the capability to perform joins and aggregates over documents. One of the enabling features, Template Driven Extraction, makes it possible to create a relational lens over documents stored in MarkLogic by using templates to specify the parts of a document that make up a row in a view. You can access that data using either SQL or the Optic API.

The Optic API uses row, triple, and/or lexicon lenses over documents, and is powered by the new row index. With the Optic API, you can use full-text document search as a filter, perform row operations to join and aggregate data, and retrieve or construct documents on output. It’s available in all of MarkLogic’s supported languages: JavaScript, XQuery, REST, Java, and soon Node.js. Each implementation adopts language-specific patterns so it feels conceptually familiar if you have relational experience and syntactically natural given your existing programming knowledge.

Here is an example of a SQL query converted to using the Optic API in JavaScript:

select UserLocation, count(QnATitle) as questionCount
from Samplestack.Contributors
inner join Samplestack.QnA on Asker=ContributorUserName
group by UserLocation
limit 50
op.fromView('Samplestack', 'Contributors')
  .joinInner(op.fromView('Samplestack','QnA'), op.on('Asker','ContributorUserName'))
  .groupBy('UserLocation', op.count('questionCount', 'QnATitle'))
  .limit(50)
  .result();

We’ll get into more detail regarding the similarities as we go along.

Why it Matters

The MarkLogic Optic API makes it possible to perform powerful NoSQL joins and aggregates across documents. Developers can use document search to retrieve rows projected from documents, join documents as columns within rows, and construct document structures dynamically — all performed efficiently within the database and accessed programmatically from your application.

NoSQL databases bias towards denormalization of data, where documents represent real world entities. This data model supports development and search — all information about an entity is conveniently in one place for efficient retrieval and update. But in some cases your entities are naturally in normalized form and it is better to keep them that way.

Or, your questions rely on the relationships across entities. The flexible Optic API allows you to use your data as-is, producing denormalized data on read for use in your applications. The API makes it possible to harness the full power of MarkLogic’s document and search capabilities using familiar syntax, incorporating common SQL concepts, regardless of the underlying shape of your data.

XQuery and JavaScript Examples

Environment Setup

For the examples in this section, you need MarkLogic 9 installed and configured. Sample queries use data from the Samplestack project. You can use Samplestack’s provided gradle scripts to secure and load the database.

To make use of Samplestack’s set up process:

  • Make sure you have git and Java 8 installed and on your path
  • Run git clone https://github.com/marklogic/marklogic-samplestack
  • cd marklogic-sampletack/appserver/java-spring
  • Create and configure the database by running ./gradlew assemble (it’s okay if some of the tests fail; note that you must have port 8006 free before setting up Samplestack)
  • Load the data set using ./gradlew dbload

By default this process will secure MarkLogic with the username admin, password admin. If you have already secured MarkLogic, modify appserver/java-spring/gradle.properties before running the gradle scripts using your own admin credentials.

Alternatively, you can view and load the raw data yourself (download and extract), by running the following script requiring MLCP:

/path/to/mlcp.sh import -host localhost -port 8000 -username admin -password admin -input_file_path seed-data -output_uri_replace ".*seed-data,''"

This will load the documents into the “Documents” database.

Example 0: Templates for Samplestack Data

We’ll start by surfacing key information from our data set by creating templates. Template Driven Extraction (TDE) indexes documents as-is in the row index for scalable aggregates and joins. Read more about template driven extraction for details on laying a relational lens over your data for use in the Optic API.

Samplestack seed data comes from Stack Overflow. The data set consists of two types of JSON documents, which are modeled on real-world entities and activities:

  • One type of document represents Question and Answer discussion threads (sample QnA doc)
  • The other document type represents those who contribute to QnA threads by asking and answering questions, like a user profile (sample Contributor doc)
QnA Document versus Contributor Document

There can be a many-to-many relationship between QnA and Contributor documents. The document data model is well suited to Samplestack, which prioritizes searchability, so users can find the QnA thread most relevant to their needs. The application also supports updates, and the flexible document model easily adapts to capture additional answers, comments, or other data that might later be part of a QnA thread.

By putting key fields, like User Reputation, into the row index, we’re adding a relational lens over our documents. For example, we can quickly aggregate Reputation across Users. Data in rows can also be efficiently joined with other distributed, yet related, data — such as QnA threads where the Users participate.

Template-Driven Extraction

The included examples make use of a few different views. Open Query Console to define lenses over the sample data.

Download and import the Samplestack-TDE workspace and follow the instructions to load the templates. Please make sure to select the samplestack database as the workspace defaults to “Documents” database.

Use JavaScript to load the user template:

'use strict'
declareUpdate();
var tde = require("/MarkLogic/tde.xqy");
// Load the user template for user profile rows
var template = xdmp.toJSON(
{
  "template":{
    "context":"/com.marklogic.samplestack.domain.Contributor",
    "rows":[
      {
        "schemaName":"Samplestack",
        "viewName":"Contributors",
        "columns":[
          {
            "name":"ContributorUserName",
            "scalarType":"string",
            "val":"userName"
          },
          {
            "name":"UserReputation",
            "scalarType":"long",
            "val":"reputation",
            "nullable": true
          },
          {
            "name":"UserLocation",
            "scalarType":"string",
            "val":"location",
            "nullable": true,
            "invalidValues":"ignore"
          },
          {
            "name":"UserURI",
            "scalarType":"string",
            "val":"xdmp:node-uri(.)"
          }
        ]
      }
    ]
  }
}
);
tde.templateInsert("/ss-ea/user-TDE.tdej", template);

Follow the same process to load QnA, Answer, and QTag templates by running the included tabs. Using tde.templateInsert will upload the provided template to the Schemas database with the tde collection and automatically trigger population of the row index based on content in MarkLogic.

In the provided workspace, you’ll also find a couple of tabs to validate and explore the views created over your documents. Here is a look at the three relational lenses we are using over our two Samplestack document types:

Now it’s time to use the Optic API to query the data with the exposed relational lenses. For the best experience, use the language driver you are most comfortable with:

  • The MarkLogic Java Client API supports Optic query construction and row result processing — see detailed Java Optic examples here.
  • In the following section, you will find a detailed overview of the Optic API, expressed in both Server-side JavaScript (SJS) and XQuery.
  • The REST API supports Optic queries and is great for those who want to build their own language wrappers. See REST API Examples farther down the page for tips on getting started.
Example 1: Row Query

To explore the Optic API’s capabilities, you can paste these examples into Query Console. Choose “JavaScript” or “XQuery” as the query type and use the “samplestack” database.

Begin using the Optic API by performing a basic query on a view over documents. Querying the view will return rows. Run the following in Query Console to Query User Rows:

'use strict'
const op = require('/MarkLogic/optic');
const minRep = 5000;
op.fromView('Samplestack', 'Contributors')
  .select(['ContributorUserName', 'UserReputation', 'UserLocation'])
  .orderBy('UserReputation')
  .where(op.gt(op.col('UserReputation'), minRep))
  .limit(25)
  .result();
xquery version "1.0-ml";
import module namespace op="http://marklogic.com/optic" at "/MarkLogic/optic.xqy";
declare default function namespace "http://marklogic.com/optic";
declare option xdmp:mapping "false";
let $min-rep := 5000
return
  from-view("Samplestack", "Contributors")
  => select(("ContributorUserName", "UserReputation", "UserLocation"))
  => order-by("UserReputation")
  => where(gt(col("UserReputation"), $min-rep))
  => limit(25)
  => result()

That’s it! You’ve just retrieved your first row set. In Query Console, you can choose to view your results as a SQL table, JSON, or text. As JSON, your results should look something like this:

{
  "Samplestack.Contributors.ContributorUserName": "souser499497@example.com", 
  "Samplestack.Contributors.UserReputation": 5001, 
  "Samplestack.Contributors.UserLocation": "Romania"
}
{
  "Samplestack.Contributors.ContributorUserName": "souser543693@example.com", 
  "Samplestack.Contributors.UserReputation": 5003, 
  "Samplestack.Contributors.UserLocation": "Warren, MI"
}
{
  "Samplestack.Contributors.ContributorUserName": "souser358163@example.com", 
  "Samplestack.Contributors.UserReputation": 5005, 
  "Samplestack.Contributors.UserLocation": "Karachi, Pakistan"
}

In this example, a query is constructed to return rows fromView “Contributors”, in the “Samplestack” schema. It is specifically retrieving the columns ContributorUserName, UserReputation, and UserLocation. Resulting columns are qualified by “SchemaName.ViewName”.

Additional functions modify the result set, and should look familiar to those with SQL experience:

  • orderBy sorts the row results by the UserReputation column
  • where filters the results to rows where the UserReputation column is greater than the specified minimum reputation level
  • limit constrains the number of rows returned (to 25 in this case)

The final step is to execute the query using result. Until this point you are only building the query plan.

Note the import of the /MarkLogic/optic JavaScript library at the beginning of the code sample, using the convention op.

The SQL equivalent for this query is provided for reference:

SELECT ContributorUserName, UserReputation, UserLocation
FROM Samplestack.Contributors
WHERE UserReputation > 5000
ORDER BY UserReputation
LIMIT 25 OFFSET 0;

You will notice the names and purposes of the Optic API operations are familiar from SQL, but have the benefit of language integration. Not only is the idiomatic, chainable function style familiar to JavaScript programmers, but the ability to incorporate variables makes data retrieval a safer, smoother experience. The use of the minRep constant in this example demonstrates the benefits of using the Optic API over using Common Table Expressions and string concatenation to submit a SQL query from JavaScript.

Take a minute to explore the other included Query tab to Query QnA Rows.

In these basic examples, each of the rows returned is contained within a single document. The following examples introduce increasingly more powerful capabilities, including the opportunity to return rows that span documents.

Example 2: Aggregates and Grouping

Use the MarkLogic Optic API to conveniently perform aggregate functions on values across documents. In this example, several operations are performed to get a sense of basic statistics about the Samplestack User data.

Run the following query to aggregate user stats:

'use strict'
const op = require('/MarkLogic/optic');

op.fromView('Samplestack', 'Contributors')
.groupBy(null, [
op.count('userCount', 'contributorusername'),
op.min('minReputation', 'userReputation'),
op.avg('aRep', 'userReputation'),
op.max('maxReputation', 'userReputation'),
op.count('locationCount', 'userlocation')
])
.select(['userCount', 'minReputation',
op.as('avgReputation', op.math.trunc(op.col('aRep'))),
'maxReputation',
op.as('locationPercent',
op.fn.formatNumber(op.xs.double(
op.divide(op.col('locationCount'), op.col('userCount'))),'##%'))
])
.result();
xquery version "1.0-ml";
import module namespace op="http://marklogic.com/optic" at "/MarkLogic/optic.xqy";
import module namespace ofn="http://marklogic.com/optic/expression/fn" at "/MarkLogic/optic/optic-fn.xqy";
import module namespace omath="http://marklogic.com/optic/expression/math" at "/MarkLogic/optic/optic-math.xqy";
import module namespace oxs="http://marklogic.com/optic/expression/xs" at "/MarkLogic/optic/optic-xs.xqy";

declare option xdmp:mapping "false";

(:
Get basic stats about some of the numeric values in Samplestack:
* Users - count, reputation range, sparseness of location
:)

op:from-view("Samplestack", "Contributors")
=> op:group-by((), (
op:count("userCount", "contributorusername"),
op:min("minReputation", "userReputation"),
op:avg("aRep", "userReputation"),
op:max("maxReputation", "userReputation"),
op:count("locationCount", "userlocation")
))
=> op:select(("userCount", "minReputation",
op:as("avgReputation", omath:trunc(op:col("aRep"))),
"maxReputation",
op:as("locationPercent", ofn:format-number(oxs:double(
op:divide(op:col("locationCount"), op:col("userCount"))), "##%"))
))
=> op:result()

The first parameter in the groupBy is empty to groupBy all rows, based on no particular column nor value. A set of aggregates are defined to count the users, calculate the minimum/maximum/average values for Reputation, and compute the fraction of users who have specified a location in their profile. All of these aggregate functions are in the Optic library and the operations must be qualified. In the select statement, further processing is done to format the output: average reputation is truncated and the location fraction is displayed as a percentage.

The Optic API has access to MarkLogic’s other function libraries such as math and fn. Calls to built-in libraries must be prefixed using the convention op so calls are deferred and executed with the rest of the query plan as optimized by the engine. When passing column values to built-in libraries as parameters, use the col identifier, especially to distinguish from strings.

The next query performs similar operations in order to aggregate QnA stats. One difference from the previous query is that a grouping parameter – column name ‘Accepted’ – is applied to the groupBy statement to separate summary statistics for questions that have an accepted answer from those questions that do not have an accepted answer.

'use strict'
const op = require('/MarkLogic/optic');

op.fromView('Samplestack', 'QnA')
.groupBy('Accepted', [
op.count('questionCount', 'QnATitle'),
op.min('earliestDate', 'QAskDate'),
op.max('latestDate', 'QAskDate'),
op.avg('avgNumAnswers', 'AnswerCount'),
op.avg('avgQuestionVotes', 'QVoteCount')
])
.result();
xquery version "1.0-ml";
import module namespace op="http://marklogic.com/optic" at "/MarkLogic/optic.xqy";
declare option xdmp:mapping "false";

(:
Get basic stats about some of the numeric values in Samplestack:
* QnA Threads - count, date range, % answered, avg # answers, vote count/itemtally
:)

op:from-view("Samplestack", "QnA")
=> op:group-by("Accepted", (
op:count("questionCount", "QnATitle"),
op:min("earliestDate", "QAskDate"),
op:max("latestDate", "QAskDate"),
op:avg("avgNumAnswers", "AnswerCount"),
op:avg("avgQuestionVotes", "QVoteCount")
))
=> op:result()
Example 3: Row Joins

This example joins Question & Answer information with related user data to return a richer result. That is, more data elements (columns) are returned than are stored in the QnA document itself – this is denormalized data-on-read, in action!

Run the following query to join rows in order to answer the question, “What is the reputation of the asker for the highest voted questions?”:

'use strict'
const op = require('/MarkLogic/optic');

const qs = op.fromView('Samplestack','QnA')
const users = op.fromView ('Samplestack', 'Contributors')

qs.orderBy(op.desc('QVoteCount'))
.limit(10)
.joinInner(users, op.on('Asker','ContributorUserName'))
.orderBy(op.desc('QVoteCount'))
.select(['QnATitle', 'QVoteCount', 'Asker', 'UserReputation'])
.result();
xquery version "1.0-ml";
import module namespace op="http://marklogic.com/optic" at "/MarkLogic/optic.xqy";
declare option xdmp:mapping "false";

(:
Join 2 types of rows - Users & Questions
What is the reputation of the asker for the highest voted questions?
:)

let $qs := op:from-view("Samplestack", "QnA")
let $users := op:from-view("Samplestack", "Contributors")
return $qs
=> op:order-by(op:desc("QVoteCount"))
=> op:limit(10) (: only interested in the top 10 :)
=> op:join-inner($users, op:on("Asker", "ContributorUserName"))
=> op:order-by(op:desc("QVoteCount"))
=> op:select(("QnATitle", "QVoteCount", "Asker", "UserReputation"))
=> op:result()

The questions (QnA) view is joined with the users (Contributors) view on user name– which is called “Asker” and “ContributorUserName” in the respective source views. This value is used as a key to join related information. The inner join yields one output row set that concatenates one left row and one right row for each match between the keys in the left and right row sets. An inner join is a type of filter. If there is no user profile row corresponding to the question document, fewer than 10 results will be returned.

The Optic API is purposefully flexible about the ability to specify the sequence and frequency of operations. orderBy is deliberately used twice in this example. Because we are only interested in the 10 most voted questions, we can filter (order+limit) so we only join on the rows of interest. Because relational joins are not guaranteed to preserve order, we sort the final output. desc specifies the sort order. You can try removing the second `op:order-by`/`op.orderBy` to see how it affects the final result.

SQL to Optic 101

List available tables

Almost all activities with a database start with exploration of what tables are available and how those tables look like.

Here’s an example of that activity in SQL:

SHOW DATABASES
USE database_name
SHOW TABLES
DESC table_name

There is currently no direct counterpart for displaying the columns of a table. But the below query will provide a good background regarding available views:

xdmp.invokeFunction(
  function() {
    return fn.collection('http://marklogic.com/xdmp/tde')
  }
  ,
  {'database':xdmp.schemaDatabase()}
)
  function() {
    fn:collection('http://marklogic.com/xdmp/tde')
  }
  ,
  <options xmlns="xdmp:eval">
    <database>{xdmp:schema-database()}</database>
  </options>
)
Explore sample records
SELECT * 
FROM samplestack.contributors 
LIMIT 10
'use strict' 
const op = require('/MarkLogic/optic'); 

op.fromView("samplestack", "contributors") 
  .limit(10) 
  .result()
xquery version "1.0-ml";
import module namespace op="http://marklogic.com/optic" at "/MarkLogic/optic.xqy";

op:from-view("samplestack", "contributors")
  =>op:limit(10)
  =>op:result()

The sample result would be as follows:

Exploring sample records results

In JSON:

[
  {
    "Samplestack.Contributors.ContributorUserName": "souser1730908@example.com", 
    "Samplestack.Contributors.UserReputation": 1108, 
    "Samplestack.Contributors.UserLocation": "London, Canada", 
    "Samplestack.Contributors.UserURI": "/com.marklogic.samplestack.domain.Contributor/sou1730908.json"
  }
,
... 
]

Hrm… what if I want those column names to be the key? That is, without the schema and table prefix. I don’t really want my client to keep handling the schema and table names with each request.

Using Optic API:

'use strict'
const op = require('/MarkLogic/optic');
op.fromView("samplestack", "contributors", "")
  .limit(10)
  .result()
xquery version "1.0-ml";
import module namespace op="http://marklogic.com/optic" at "/MarkLogic/optic.xqy";

op:from-view("samplestack", "contributors","")
  =>op:limit(10)
  =>op:result()

Resulting in:

In JSON:

[
  {
    "ContributorUserName": "souser1730908@example.com", 
    "UserReputation": 1108, 
    "UserLocation": "London, Canada", 
    "UserURI": "/com.marklogic.samplestack.domain.Contributor/sou1730908.json"
  }
,
... 
], 
...
]

Now that we have some basic syntax going, let’s explore the data some more.

Some Basic Queries

SELECT contributorusername, userreputation 
FROM samplestack.contributors 
WHERE userlocation = 'London' 
LIMIT 10
const op = require('/MarkLogic/optic');xquery version "1.0-ml";

op.fromView("samplestack", "contributors", "")
  .where(op.eq(op.col('userlocation'), 'London'))
  .select((op.col('contributorusername'), 'userreputation'))
  .limit(10)
  .result()
import module namespace op="http://marklogic.com/optic" at "/MarkLogic/optic.xqy"; 

op:from-view("samplestack", "contributors", "")
  =>op:where(op:eq(op:col('userlocation'), 'London'))
  =>op:select((op:col('contributorusername'), 'userreputation'))
  =>op:limit(10)
  =>op:result()

The sample result would be:

Results of running basic queries

Some things that are worth noting above:

  1. We need to specify fields using ‘op:col’ (or equivalent: ‘op:view-col’, ‘op:schema-col’) on where clauses.
  2. Select clauses can use ‘op:col’ or simple string that matches the view’s columns.
  3. Column names, when used on the various clauses are case insensitive. But the result will always use character case as specified in the template by default.
Multiple and grouped queries
SELECT * 
FROM samplestack.contributors 
WHERE userlocation = 'London' AND userreputation <= 1000 OR userlocation = 'London, Canada' AND userreputation >= 1000 
LIMIT 10
const op = require('/MarkLogic/optic');
op.from-view("samplestack", "contributors", "")
  .where(
    op.or([
      op.and([
        op.eq(op.col('userlocation'), 'London')
        ,
        op.le(op.col('userreputation'), 1000)
      ])
      ,
      op.and([
        op.eq(op.col('userlocation'), 'London, Canada')
        ,
        op.ge(op.col('userreputation'), 1000)
      ])
    ])
  )
.limit(10)
.result()
xquery version "1.0-ml";
import module namespace op="http://marklogic.com/optic" at "/MarkLogic/optic.xqy";
op:from-view("samplestack", "contributors", "")
  =>op:where(
    op:or((
      op:and((
        op:eq(op:col('userlocation'), 'London')
        ,
        op:le(op:col('userreputation'), 1000)
      ))
      ,
      op:and((
        op:eq(op:col('userlocation'), 'London, Canada')
        ,
        op:ge(op:col('userreputation'), 1000)
      ))
    ))
  )
=>op:limit(10)
=>op:result()

The sample result would be:

Result of multiple and grouped queries

Now that we have a good grasp of our data, let’s try something a little bit more challenging. Let’s try to look for the top ‘asker’ per month.

Function on queries
SELECT asker, month, count(1) as qcount 
FROM (SELECT asker, year(qaskdate)|| '-' || right('0'||month(qaskdate), 2) as month   FROM samplestack.qna)
GROUP BY asker, month 
ORDER BY qcount desc 
LIMIT 10
const op = require('/MarkLogic/optic');
op.fromView("samplestack", "qna")
  .groupBy(
    [
      op.col('asker')
      ,
      /*
       * NOTE: we do not need to do a separate ‘require’
       * supported fn, math and cts functions are part of the sjs library module.
       */
      op.as('month', op.fn.formatDateTime(op.col('qaskdate'), '[Y0001]-[M01]'))
    ]
    ,
    op.count('qcount')
  )
.orderBy(op.desc('qcount'))
.limit(10)
.result()
xquery version "1.0-ml";
(: The base module required to use Optic API :)
import module namespace op="http://marklogic.com/optic"
    at "/MarkLogic/optic.xqy";
(: The module required for functions, e.g. fn:concat, fn:format-dateTime, etc. :)
import module namespace ofn="http://marklogic.com/optic/expression/fn"
    at "/MarkLogic/optic/optic-fn.xqy";
(: The module required for type-casting values, e.g. xs:long, xs:dateTime :)
import module namespace oxs="http://marklogic.com/optic/expression/xs"
    at "/MarkLogic/optic/optic-xs.xqy";
(: 
: A similar set of libraries is available for math, sem and other namespaces
: Please check your <install>/Modules/MarkLogic/optic/ folder for a complete list.
:)

op:from-view("samplestack", "qna")
  =>op:group-by(
    (
      op:col('asker')
      ,
      op:as('month', ofn:format-dateTime(op:col('qaskdate'), '[Y0001]-[M01]'))
    )
    ,
    op:count('qcount')
  )
=>op:order-by(op:desc('qcount'))
=>op:limit(10)
=>op:result()

The above query makes use of supported functions in MarkLogic. XQuery does not support child modules, so you need to do multiple imports. SJS, on the other hand, embeds these functions as part of the Optic library module. These functions can be referenced in a similar fashion, for example fn.formatDateTime is referenced as op.fn.formatDateTime.

Note: Do NOT try to use the XQuery modules on SJS. It will not function well.

The list of supported SQL functions is available here while the list of supported Optic API functions is available here.

version "1.0-ml";
(: The base module required to use Optic API :)
import module namespace op="http://marklogic.com/optic"
        at "/MarkLogic/optic.xqy";
(: The module required for functions, e.g. fn:concat, fn:format-dateTime, etc. :)
import module namespace ofn="http://marklogic.com/optic/expression/fn"
    at "/MarkLogic/optic/optic-fn.xqy";
(: The module required for type-casting values, e.g. xs:long, xs:dateTime :)
import module namespace oxs="http://marklogic.com/optic/expression/xs"
    at "/MarkLogic/optic/optic-xs.xqy";

Let’s limit this to those having more than 10 questions.

SELECT asker, month, count(1) as qcount
FROM (
  SELECT asker, year(qaskdate)|| '-' || right('0'||month(qaskdate), 2) as month
  FROM samplestack.qna
)
GROUP BY asker, month
HAVING qcount >= 10
ORDER BY qcount DESC
LIMIT 10
const op = require('/MarkLogic/optic');
op.fromView("samplestack", "qna")
  .groupBy(
    [
      op.col('asker')
      ,
      op.as('month', op.fn.formatDateTime(op.col('qaskdate'), '[Y0001]-[M01]'))
    ]
    ,
    op.count('qcount')
  )
.where(op.ge(op.col('qcount'), 10))
.orderBy(op.desc('qcount'))
.limit(10)
.result()
op:from-view("samplestack", "qna")
    =>op:group-by(
      (
        op:col('asker')
        ,
        op:as('month', ofn:format-dateTime(op:col('qaskdate'), '[Y0001]-[M01]'))
      )
      ,
      op:count('qcount')
    )
=>op:where(op:ge(op:col('qcount'), 10))
=>op:order-by(op:desc('qcount'))
=>op:limit(10)
=>op:result()

It is important to note that the ‘where’ clause appears after the ‘group by’ clause. The earlier clause (modify plan) will dictate the data (row set) available to the succeeding clause, i.e.:

const op = require('/MarkLogic/optic');

op.fromView("samplestack", "qna")
  .groupBy(
    [
      op.col('asker')
      ,
      op.as('month', op.fn.formatDateTime(op.col('qaskdate'), '[Y0001]-[M01]'))
    ]
    ,
    op.count('qcount')
  )
  .where(op.and(
    op.ge(op.col('qcount'), 10)
    ,
    //(. THIS NEXT LINE SHOULDN'T GO HERE .)
    op.eq(op.fn.yearFromDateTime(op.col('qaskdate')), 2012)
  ))
  .orderBy(op.desc('qcount'))
  .limit(10)
  .result()
xquery version "1.0-ml";
import module namespace op="http://marklogic.com/optic" at "/MarkLogic/optic.xqy";

op:from-view("samplestack", "qna")
  =>op:group-by(
    (
      op:col('asker')
      ,
      op:as('month', ofn:format-dateTime(op:col('qaskdate'), '[Y0001]-[M01]'))
    )
    ,
    op:count('qcount')
  )
  =>op:where(op:and((
    op:ge(op:col('qcount'), 10)
    ,
   (: THIS NEXT LINE SHOULDN'T GO HERE :)
    op:eq(ofn:year-from-dateTime(op:col('qaskdate')), 2012)
  )))
  =>op:order-by(op:desc('qcount'))
  =>op:limit(10)
  =>op:result()

The above will not work since ‘qaskdate’ is no longer part of the rowset specified by the ‘group-by’ modifier. To limit the result to a particular constraint, the query would have to be adjusted as follows:

const op = require('/MarkLogic/optic');

op.fromView("samplestack", "qna")
  //(. THIS NEXT LINE SHOULD GO HERE INSTEAD .)
  .where(op.eq(op.fn.yearFromDateTime(op.col('qaskdate')), 2012))
  .groupBy(
    [
      op.col('asker')
      ,
      op.as('month', op.fn.formatDateTime(op.col('qaskdate'), '[Y0001]-[M01]'))
    ]
    ,
    op.count('qcount')
  )
  .where(op.ge(op.col('qcount'), 10))
  .orderBy(op.desc('qcount'))
  .limit(10)
  .result()
xquery version "1.0-ml";
import module namespace op="http://marklogic.com/optic" at "/MarkLogic/optic.xqy";

op:from-view("samplestack", "qna")
   (: THIS NEXT LINE SHOULD GO HERE INSTEAD :)
  =>op:where(op:eq(ofn:year-from-dateTime(op:col('qaskdate')), 2012))
  =>op:group-by(
    (
      op:col('asker')
      ,
      op:as('month', ofn:format-dateTime(op:col('qaskdate'), '[Y0001]-[M01]'))
    )
    ,
    op:count('qcount')
  )
  =>op:where(op:ge(op:col('qcount'), 10))
  =>op:order-by(op:desc('qcount'))
  =>op:limit(10)
  =>op:result()

All the columns of ‘samplestack.qna’ are available to the initial ‘where’ plan modifier and the succeeding ‘group-by’ plan modifier.

Sub-queries

But how do I now limit the count to users that come from ‘London’?

SELECT asker, month, count(1) as qcount
FROM (
  SELECT asker, year(qaskdate)|| '-' || right('0'||month(qaskdate), 2) as month
  FROM samplestack.qna
  WHERE asker in (
    SELECT contributorusername
    FROM samplestack.contributors
    WHERE userlocation='London'
  )
)
GROUP BY asker, month
ORDER BY qcount desc
LIMIT 10

Unfortunately, such queries cannot be directly ported to Optic API. What we can do is to adjust the approach.

SELECT asker, month, count(1) as qcount
FROM (
  SELECT asker, year(qaskdate)|| '-' || right('0'||month(qaskdate), 2) as month
  FROM samplestack.qna
  LEFT JOIN samplestack.contributors 
    ON contributors.contributorusername = qna.asker
  WHERE userlocation='London'
)
GROUP BY asker, month
ORDER BY qcount desc
LIMIT 10
const op = require('/MarkLogic/optic');

op.fromView("samplestack", "qna")
  .joinInner(op.fromView("samplestack", "contributors"), null, op.eq(op.col('contributorusername'), op.col('asker')))
  .where(op.eq(op.col('userlocation'), 'London'))
  .groupBy(
    [
      op.col('asker')
      ,
      op.as('month', op.fn.formatDateTime(op.col('qaskdate'), '[Y0001]-[M01]'))
    ]
    ,
    op.count('qcount')
  )
.orderBy(op.desc('qcount'))
.limit(10)
.result()
version "1.0-ml";
import module namespace op="http://marklogic.com/optic" at "/MarkLogic/optic.xqy";
op:from-view("samplestack", "qna")
  =>op:join-inner(op:from-view("samplestack", "contributors"), (), op:eq(op:col('contributorusername'), op:col('asker')))
  =>op:where(op:eq(op:col('userlocation'), 'London'))
  =>op:group-by(
    (
      op:col('asker')
      ,
      op:as('month', ofn:format-dateTime(op:col('qaskdate'), '[Y0001]-[M01]'))
    )
    ,
    op:count('qcount')
  )
=>op:order-by(op:desc('qcount'))
=>op:limit(10)
=>op:result()

Going Beyond just SQL

One major limitation of SQL is the inability to search for a particular word across all fields. Optic API supports use of ‘cts:query’ objects to be used in the ‘where’ modifier:

const op = require('/MarkLogic/optic');

op.fromView("samplestack", "contributors")
  .where(cts.wordQuery('london'))
  .limit(10)
  .result()
version "1.0-ml";
import module namespace op="http://marklogic.com/optic" at "/MarkLogic/optic.xqy";

op:from-view("samplestack", "contributors")
  =>op:where(cts:word-query('london'))
  =>op:limit(10)
  =>op:result()

It is important to note, though, that this word search will apply to the entire document and not just the fields involved in the view.

You can also sort the results using `score` to take advantage of MarkLogic’s search engine capabilities:

'use strict';
const op = require('/MarkLogic/optic');

let docSearch = op.fromSearch(cts.jsonPropertyWordQuery('displayName', ['adam', 'alan']))
let viewSearch = op.fromView('Samplestack', 'Contributors', null, op.fragmentIdCol("$$sourceId"))
docSearch
  .joinInner(viewSearch, op.on('fragmentId', '$$sourceId'))
  .orderBy(op.desc('score'))
  .result()
xquery version "1.0-ml";
import module namespace op="http://marklogic.com/optic"
        at "/MarkLogic/optic.xqy";

let $doc-search := op:from-search(cts:json-property-word-query('displayName', ('adam', 'alan')))
let $view-search := op:from-view('Samplestack', 'Contributors', (), op:fragment-id-col("$$sourceId"))
return $doc-search
  => op:join-inner($view-search, op:on('fragmentId', '$$sourceId'))
  => op:order-by(op:desc('score'))
  => op:result()

This particular example allows us to retrieve the `contributor` row for a user whose displayName contains the word ‘adam’ or ‘alan’. Note that this displayName property is not part of the view, but part of the source document. Notice that the result now includes a `score` column:

Example with Score Column

Information about the source document can also be included as part each row:

const op = require('/MarkLogic/optic');

op.fromView("samplestack", "contributors", "", op.fragmentIdCol("$$sourceId"))
  .joinDocUri('source_uri', op.fragmentIdCol('$$sourceId'))
  .joinDoc('source_doc', op.fragmentIdCol('$$sourceId'))
  .where(op.eq(true, op.fn.matches(op.xpath('source_doc', '//aboutMe'),'man')))
  .select([
    'source_uri'
    ,
    'source_doc'
    ,
    op.as('displayName', op.xpath('source_doc', '//displayName'))
    ,
    op.as('aboutMe', op.xpath('source_doc', '//aboutMe'))
  ])
  .limit(10)
  .result()
xquery version "1.0-ml";
import module namespace op="http://marklogic.com/optic" at "/MarkLogic/optic.xqy";

op:from-view("samplestack", "contributors", "", op:fragment-id-col("$$sourceId"))
  =>op:join-doc-uri('source_uri', op:fragment-id-col('$$sourceId'))
  =>op:join-doc('source_doc', op:fragment-id-col('$$sourceId'))
  =>op:where(op:eq(true(), ofn:matches(op:xpath('source_doc', '//aboutMe'),'man')))
  =>op:select((
    'source_uri'
    ,
    'source_doc'
    ,
    op:as('displayName', op:xpath('source_doc', '//displayName'))
    ,
    op:as('aboutMe', op:xpath('source_doc', '//aboutMe'))
  ))
  =>op:limit(10)
  =>op:result()

`$$sourceId` is just a column name to reference the fragmentId of each row found. This fragmentId can then be referenced and used as a column to retrieve the source fragment. Additionally, this fragmentId can also be used to join the active row with other rows coming from the same fragment. By default, this means document. So if a single document results in multiple rows, e.g. `answers` in our current test data, then those rows are likely to return the same fragmentId. More information about fragments is available in the Fragments chapter of the Administrator’s Guide.

Optic API is not limited to views. We can also make use of existing indexes. Referencing the data we have in this article, we can now produce the same result using with and without the Optic API.

XQuery without Optic API:

let result = [];

let tuples = cts.valueTuples(
  [
    cts.elementReference(xs.QName('age'))
    ,
    cts.elementReference(xs.QName('gender'))
  ],
  ["item-frequency", "frequency-order"]
)
for (let tuple of tuples){
  let query = cts.andQuery([
    cts.elementValueQuery(
      xs.QName('age')
      , 
      xs.string(tuple[0])
    )
    ,
    cts.elementValueQuery(
      xs.QName('gender')
      , 
      tuple[1]
    )
  ])
  let aggregate = cts.avgAggregate(
      cts.elementReference(xs.QName( 'height'))
      , null, query
    )
  result.push(fn.stringJoin([
    xs.string(tuple[0]),
    tuple[1],
    aggregate != null ? xs.string(aggregate) : null
  ], '-'))
}
result
xquery version "1.0-ml";

for $tuple in cts:value-tuples(
  (
    cts:element-reference(xs:QName('age'))
    ,
    cts:element-reference(xs:QName('gender'))
  ),
  ("item-frequency", "frequency-order"))
let $query  := cts:and-query((
  cts:element-value-query(
    xs:QName('age')
    , 
    xs:string($tuple[1])
  )
  ,
  cts:element-value-query(
    xs:QName('gender')
    , 
    xs:string($tuple[2])
  )
))
return $tuple[1] 
  || " - " || $tuple[2] 
  || " - " || cts:avg-aggregate(
    cts:element-reference(xs:QName( 'height'))
    , (), $query
  )

Using Optic API:

const op = require('/MarkLogic/optic');

op.fromLexicons(
    {
      "age": cts.elementReference(xs.QName("age")),
      "gender": cts.elementReference(xs.QName("gender")),
      "height": cts.elementReference(xs.QName("height"))
    }
    ,
    "person"
  )
  .groupBy(["age", "gender"], op.avg('averageHeight', op.col('height')))
  .limit(10)
  .result() 
xquery version "1.0-ml";
import module namespace op="http://marklogic.com/optic" at "/MarkLogic/optic.xqy";

op:from-lexicons(
  map:new((
    map:entry("age", cts:element-reference(xs:QName("age")))
    ,
    map:entry("gender", cts:element-reference(xs:QName("gender")))
    ,
    map:entry("height", cts:element-reference(xs:QName("height")))
  ))
  ,
  "person"
)
  =>op:group-by(("age", "gender"), op:avg('averageHeight', op:col('height')))
  =>op:limit(10)
  =>op:result()

Your final results should look something like this:

Final Optic API results displaying name and eye color

Optic API can also be used against literals and triples.

Learn More

Optic API Resources

Explore all technical resources related to the Optic API and how it can be used in MarkLogic.

Getting Started with Optic

Get started with the Optic API with thoroughly explained sample queries from the most basic to multi-model.

Search Multi-Model Data

Pull data of different shapes into a MarkLogic Data Hub and use the Optic API to view and query the data.

Learn More

Optic API Resources

Explore all technical resources related to the Optic API and how it can be used in MarkLogic.

Getting Started with Optic

Learn about the Optic API, how it works and when to use it. See samples queries explained, from basic to more advanced.

Search Multi-Model Data

Pull data of different shapes into a MarkLogic Data Hub and use the Optic API to view and query the data.

This website uses cookies.

By continuing to use this website you are giving consent to cookies being used in accordance with the MarkLogic Privacy Statement.