Problem

You want to test or develop Template Driven Extraction (TDE) templates and SPARQL queries on a small scale, using only Query Console and files on disk.

Solution

Applies to MarkLogic versions 9+

1. Validate a template from disk:
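
A minimal sketch of the validation call, assuming the template sits at the same path used in the examples below and that tde:validate accepts the node returned by xdmp:document-get. The result should be a map reporting whether the template is valid:

```xquery
xquery version "1.0-ml";
(: Assumption: the template is stored at the path used in the later examples. :)
let $tde := xdmp:document-get ('/Users/chamlin/tmp/sub-tde.xml')
(: Validate the on-disk template without inserting it into a database. :)
return tde:validate ($tde)
```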

2. Extract triple data from a document sub.xml using template sub-tde.xml:

let $tde := xdmp:document-get ('/Users/chamlin/tmp/sub-tde.xml')
let $xml := xdmp:document-get ('/Users/chamlin/tmp/sub.xml')
return tde:node-data-extract ($xml, $tde)

Notice that the output is JSON, with the triples keyed by the URI of the document you passed in (here, its path on disk):

{
"/Users/chamlin/tmp/sub.xml": [
  {
    "triple": {
      "subject": "https://marklogic.com/support/subscription#111", 
      "predicate": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type", 
      "object": "https://marklogic.com/support/subscription"
    }
  }, 
... etc. ...

3. Extract triple data from a document sub.xml using template sub-tde.xml and return triples:

let $tde := xdmp:document-get ('/Users/chamlin/tmp/sub-tde.xml')
let $xml := xdmp:document-get ('/Users/chamlin/tmp/sub.xml')
(: Look up the array stored under the document's key, then unwrap it into a sequence. :)
let $triples := tde:node-data-extract ($xml, $tde)=>map:get ('/Users/chamlin/tmp/sub.xml')=>json:array-values()
return $triples

4. Generalize to extract from multiple documents using multiple templates and return triples:

xquery version "1.0-ml";
let $doc-uris := ('/Users/chamlin/tmp/sub.xml', '/Users/chamlin/tmp/sub2.xml')
let $docs := $doc-uris ! xdmp:document-get (.)
let $tde-uris := ('/Users/chamlin/tmp/sub-tde.xml', '/Users/chamlin/tmp/sub-tde2.xml')
let $tdes := $tde-uris ! xdmp:document-get (.)
let $triples :=
    let $extract := tde:node-data-extract ($docs, $tdes)
    for $key in map:keys ($extract)
    return map:get ($extract, $key)=>json:array-values()
return $triples

5. Using the extracted triples in an in-memory data store, run test queries in SPARQL:

let $tde := xdmp:document-get ('/Users/chamlin/tmp/sub-tde.xml')
let $xml := xdmp:document-get ('/Users/chamlin/tmp/sub.xml')
let $triples := tde:node-data-extract ($xml, $tde)=>map:get ('/Users/chamlin/tmp/sub.xml')=>json:array-values()
let $sparqlall := "select ?s ?p ?o where {?s ?p ?o}"
return 
    sem:sparql($sparqlall, (), (), sem:in-memory-store($triples))
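
Once the select-all query returns what you expect, the same setup can run a narrower query. The sketch below is illustrative only: it assumes the object IRI shown in the sample output of example 2 and matches any triple with that object, so adjust the IRI to whatever your own template produces.

```xquery
xquery version "1.0-ml";
(: Same setup as example 5; only the SPARQL string differs. :)
let $tde := xdmp:document-get ('/Users/chamlin/tmp/sub-tde.xml')
let $xml := xdmp:document-get ('/Users/chamlin/tmp/sub.xml')
let $triples := tde:node-data-extract ($xml, $tde)=>map:get ('/Users/chamlin/tmp/sub.xml')=>json:array-values()
(: Assumption: the object IRI is copied from the sample output in example 2. :)
let $sparql := "select ?s ?p where { ?s ?p <https://marklogic.com/support/subscription> }"
return
    sem:sparql($sparql, (), (), sem:in-memory-store($triples))
```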

Required Privileges:

  • http://marklogic.com/xdmp/privileges/qconsole
  • http://marklogic.com/xdmp/privileges/xdmp-document-get
  • http://marklogic.com/xdmp/privileges/sem-sparql

Discussion

Using Query Console and a few on-disk documents makes small-scale testing and development of TDE extraction and SPARQL queries as quick as edit, save, and run.

  • Example 1 shows validation of an on-disk TDE template.
  • Example 2 shows extraction of triple data from a document on disk. The result is returned as JSON, with the triples keyed by the document you passed in.
  • Example 3 shows how to convert the returned JSON to triples. map:get is used with the document key to get the array of triples, and json:array-values unwraps it. In Query Console the triples will be displayed in Turtle format, which you may find more compact and readable than the JSON.
  • Example 4 shows how to extract triples from more than one document with more than one template. The result map has one key per document, so the triples must be collected from each key.
  • Example 5 shows how to use the extracted triples in an in-memory store. You can then use SPARQL to query the extracted triples and return a solution.
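
The contents of sub.xml and sub-tde.xml are not shown in the examples above. The sketch below uses hypothetical inline stand-ins for both, so the extraction can be tried without any files on disk. The element names (subscription, id) and the template body are assumptions chosen to produce a triple shaped like the sample output in example 2; they are not the original files.

```xquery
xquery version "1.0-ml";
(: Hypothetical stand-in for sub.xml; the element names are assumptions. :)
let $xml := document {
  <subscription>
    <id>111</id>
  </subscription>
}
(: Hypothetical stand-in for sub-tde.xml: one rdf:type triple per subscription. :)
let $tde :=
  <template xmlns="http://marklogic.com/xdmp/tde">
    <context>/subscription</context>
    <triples>
      <triple>
        <subject><val>sem:iri(concat('https://marklogic.com/support/subscription#', id))</val></subject>
        <predicate><val>sem:iri('http://www.w3.org/1999/02/22-rdf-syntax-ns#type')</val></predicate>
        <object><val>sem:iri('https://marklogic.com/support/subscription')</val></object>
      </triple>
    </triples>
  </template>
(: Returns a map of extracted data keyed by document, as in example 2. :)
return tde:node-data-extract ($xml, $tde)
```

Because the document here is constructed in memory rather than read from disk, the key in the result map will differ from the file path used in the other examples; map:keys on the result shows the key that was actually used.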

Note: To use SPARQL features, a license that includes the Semantics Option is required.

Learn More

TDE Documentation

Explore the documentation on how to get started with Template Driven Extraction in the Application Developer’s Guide.

TDE Technical Resources

Explore all the resources related to Template Driven Extraction (TDE), and learn more about how it can be used in MarkLogic.

MarkLogic Data Integration

Use your development skills to integrate data from silos to create an operational data hub in this hands-on course.
