Drilling in with XPath

Sometimes you don't want to fetch whole messages, just parts of them, and in those cases you can use XPath to specify what part of a message you want. The following query gets the first email and returns its subject element:
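A minimal sketch of such a query, assuming each email is stored as a document whose root element is `<message>` (a name chosen here for illustration) with a `<subject>` element somewhere beneath it:

```xquery
(: Assumption: emails are <message> documents; adjust the name to your data. :)
(: Run in MarkLogic, where a leading / searches the whole database. :)
(/message)[1]//subject
```

The parentheses build the sequence of all messages, `[1]` keeps the first, and the trailing step selects its subject.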

This does the same for the first ten mails:
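One way to write this, again assuming a hypothetical `<message>` root element, is to filter on position rather than on a single index:

```xquery
(: Assumption: emails are <message> documents; the name is illustrative. :)
(: position() le 10 keeps the first ten items of the parenthesized sequence. :)
(/message)[position() le 10]//subject
```

`fn:subsequence(/message, 1, 10)//subject` would be an equivalent formulation.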

This returns the subjects as strings instead of XML elements, by executing the string() function on each subject:
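As a sketch under the same assumption (a hypothetical `<message>` root element), `string()` can be appended as the final step of the path so that it runs once per subject:

```xquery
(: Assumption: emails are <message> documents; the name is illustrative. :)
(: string() is evaluated with each <subject> as its context item. :)
(/message)[position() le 10]//subject/string()
```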

This returns ten paragraphs that contain URLs. They are the first ten the database happens to return, so which ten you get is effectively arbitrary:
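A sketch of what such a query might look like; `<para>` and `<url>` are the element names used in this tutorial, while the `<message>` root is an assumption:

```xquery
(: Assumption: emails are <message> documents; the name is illustrative. :)
(: Collect every <para> with a <url> child, then keep the first ten. :)
(/message//para[url])[position() le 10]
```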

The double slash means any depth under the parent is fine. The [url] predicate says the <para> element has to have a <url> child.

Why are we using parentheses so often? It's good practice when extracting a subset of items from a sequence. In XPath, the following query doesn't ask for one paragraph; it asks for all first paragraphs. It will return about 5,000,000 paragraphs, the first paragraph from every email, and take a very long time to execute (and yes, smiley faces are how you surround comments in XQuery):
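The unparenthesized form can be sketched as follows, assuming a hypothetical `<message>` root element:

```xquery
(: Slow: [1] binds to the para step, so this selects every <para> :)
(: that is the first <para> child of its parent, across all emails. :)
/message//para[1]
```

Because the predicate applies to the last step only, it is evaluated separately under each parent element rather than once over the whole result.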

That's powerful, but when you want just one paragraph, you use parentheses. The following query returns the first item across all paragraphs. It executes close to instantly.
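A sketch of the parenthesized form, under the assumption that emails are stored with a `<message>` root element:

```xquery
(: Assumption: emails are <message> documents; the name is illustrative. :)
(: The parentheses form one sequence of all paragraphs; [1] picks one item. :)
(/message//para)[1]
```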
