Controlling output options

by Evan Lenz

MarkLogic natively supports both XQuery and XSLT, and both languages use the same (XPath) data model. In the XPath data model, XML is represented as an abstract tree of nodes. The abstract nature of the model means that certain details about XML are not included, such as a document's encoding, its DOCTYPE declaration, whether you delimit attribute values with quotes or apostrophes, etc. But when it comes time to output the result of your program (XML, or more often, HTML), the resulting tree needs to be serialized, and all those gritty details about how to represent the XPath tree as a stream of bytes need to be resolved. How do you control (at least some of) these details?
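
Before answering, here is a minimal sketch of what "abstract" means in practice. It parses two strings that differ only in lexical details (attribute quoting, and an empty-element tag versus a start/end tag pair) and compares the resulting trees. The variable names and sample markup are made up for illustration; xdmp:unquote is MarkLogic's string-to-document parsing function.

```xquery
xquery version "1.0-ml";

(: Two serializations that differ only in details the data model ignores :)
let $a := xdmp:unquote("<p class='x'/>")
let $b := xdmp:unquote('<p class="x"></p>')

(: Both parse to the same abstract tree, so this should return true :)
return fn:deep-equal($a, $b)
```

Because the two trees are indistinguishable once parsed, nothing in the tree itself can tell the serializer which form to write out; that choice has to come from output options.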

In XSLT, you have some control right at the language level (assuming your processor is the one responsible for serializing the result, as MarkLogic is). You can use the <xsl:output> element to control the output. For example, the following output declaration tells the XSLT processor to output its result in ASCII encoding, with extra indentation for better readability, using the HTML "output method" (e.g., so br elements appear as <br> instead of <br/> or <br></br>).

<xsl:output encoding="us-ascii" method="html" indent="yes"/>

The XSLT 2.0 spec lists the full range of output options. But what about XQuery? As it turns out, even though they use the same data model and serialization concerns are equally well-defined for both languages, XQuery doesn't include built-in language support for controlling these options. Fortunately though, XQuery provides a generic extension mechanism for declaring processor-specific options, and even more fortunately, MarkLogic provides the exact options you need. Here's how you'd make the same explicit determination in your XQuery code:

declare option xdmp:output "method=html";
declare option xdmp:output "encoding=us-ascii";
declare option xdmp:output "indent=yes";
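
To put those declarations in context, here is a minimal sketch of a complete main module that an HTTP app server could serve. The page content is invented for illustration, and the exact bytes you get back may vary with your MarkLogic version and app server defaults.

```xquery
xquery version "1.0-ml";

(: Serialization options for this module's result :)
declare option xdmp:output "method=html";
declare option xdmp:output "encoding=us-ascii";
declare option xdmp:output "indent=yes";

(: Hypothetical sample page; only the option declarations come from the article :)
<html>
  <head><title>Output options demo</title></head>
  <body>
    <p>First line<br/>second line, with a non-ASCII character: &#233;</p>
  </body>
</html>
```

With the HTML output method, the br element should be written as <br>, and because the encoding is us-ascii, the é is expected to come out escaped rather than as a raw non-ASCII byte.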

Oftentimes, the default output options will serve you just fine. For example, XSLT, unless you specify otherwise, will automatically use the HTML output method when the document element of the result is <html> or <HTML> (in no namespace).

There's another way you can control the output options in MarkLogic without having to make code edits (and this feature is new in MarkLogic 5). You can define the defaults at the app server level, as shown below.

[Screenshot: the Output Options Configuration page for an HTTP app server in the MarkLogic Admin interface. The serialization parameters listed are: output sgml character entities (shown as "none"), output encoding (shown as "UTF-8"), output method (shown as "default"), output byte order mark (shown as "default"), output cdata section namespace uri, output cdata section localname, output doctype public, and output doctype system.]

You can find this screen in the left-hand menu for your app server:

[Screenshot: the Admin interface navigation tree under Configure: Groups > Default > App Servers > (your HTTP server), whose sub-items include Namespaces, Schemas, Request Blackouts, and Output Options.]

Thus concludes this week's random tip!

Comments

  • I think 'fine this screen' should be 'find this screen'