You have used TDE to extract data from documents into the row index, but there are additional pieces of information you want to return with a particular query.
Applies to MarkLogic versions 9+
Consider a schema called
content with a view called
books. The view has columns
price. The documents also have an element called
summary that you decided not to put in the view, as it won’t be used very often. For this query, however, you want to include it in the results.
const op = require('/MarkLogic/optic'); const docId = op.fragmentIdCol('docId'); op.fromView('content', 'books', null, docId) .where(op.gt(op.col('price'), 20)) .joinDoc('doc', docId) .select([ 'title', 'author', 'price', op.as('summary', op.xpath(op.col('doc'), '/doc/summary')) ]) .result();
import module namespace op="https://marklogic.com/optic" at "/MarkLogic/optic.xqy"; let $doc-id := op:fragment-id-col('docId') return op:from-view('content', 'books', (), $doc-id) => op:where(op:gt(op:col('price'), 20)) => op:join-doc('doc', $doc-id) => op:select(( 'title', 'author', 'price', op:as('summary', op:xpath(op:col('doc'), '/doc/summary/fn:string()')) )) => op:result()
The Optic API is useful for executing relational operations on values; however, sometimes we need to pull in additional data from the source documents. It makes sense to pull scalar values into the row index, where we can do calculations and aggregations on them, but content with substructure, such as XML with markup, doesn’t really benefit. The example presented here is relatively simple, but supplementing the results with content from documents can be done with much more complex queries, involving joins, aggregates, or other operations.
The op.fromView call identifies the schema and view that we’ll draw data from. We also use op.fragmentIdCol to let Optic know that we want to work with the source documents. In this case, I’ve used the name “docId”, but the name isn’t meaningful as we won’t be using it once we get our results.
Before we actually join the documents to our rows, we should filter down to just those rows we need, in order to avoid reading more documents from disk than is necessary. In the recipe above, only books with prices greater than 20 are included.
Having told Optic what data to work with, .joinDoc tells Optic to use the
docId to include the document content in a column called
Finally, the select specifies what columns to return in the result set. The as clause tells Optic to make a
summary column based on an XPath statement, run against the document associated with the current row.
When using this technique, we’ll see columns from the row index, as well as the specified additional data from the documents, in this case the summary. It’s important to remember that to get the summary, MarkLogic had to load the entire document, then use XPath to select just a part of it. This is similar to any other search where we return just part of search result. To ensure this performs well, use a where clause to reduce the number of documents that need to be retrieved. Consider using paging if you still get a lot of results. If the documents are large, considering adding the additional data to the row index instead of retrieving it separately.