Using The MarkLogic XQuery JSP Tag Library

Note: The JSP Tag Library was built, and this tutorial was written, in 2004, before XCC was supported.

This tutorial will show you how to use the MarkLogic Java Server Pages (JSP) Tag Library to access MarkLogic from within a J2EE web application. The MarkLogic tag library will work with any JSP container such as Tomcat or Jetty, as well as EJB application servers like WebLogic, WebSphere or JBoss.

The following assumes that you are familiar with Java, Java Servlets and JSP, and that you understand how to build and deploy J2EE web applications.

Introduction

While MarkLogic Server is unique in many ways, the role it plays in the IT ecosystem is typically that of a datasource, a generic name for a system component that stores and retrieves data.

MarkLogic Server includes a built-in HTTP server, which is sufficient to execute XQuery scripts and output HTML directly to the user’s browser. But there may be cases where it’s preferable to access content from within a web application that’s running in a J2EE container. Reasons for this include combining the result with data from other datasources, further processing the result to apply business rules written in Java, and generating XQuery scripts dynamically at runtime from session-specific information.

MarkLogic Server comes bundled with a Java connectivity package called XDBC that is modeled after the standard JDBC package. It provides a connection-oriented mechanism by which Java code can issue queries to a server and receive the results over the network.

The JSP Tag Library is built on top of XDBC and simplifies J2EE integration by letting a JSP talk to a MarkLogic Server using only standard JSP tag syntax. It’s open source and available free of charge from the developer network at https://developer.marklogic.com. The code is made available under the Apache 2.0 license.

Custom JSP Tag Libraries

The MarkLogic JSP tags are modeled after the JSP Standard Tag Library (JSTL), both to respect the prevailing conventions and to make it easier to interoperate with the JSTL tags. The tags in the SQL component of the JSTL were the direct inspiration for the MarkLogic tags. They are similar in design philosophy but differ in significant ways.

To get started, let’s look at how custom JSP tags are hooked into a web application and then we’ll look at some usage examples.

A custom JSP tag library consists of a set of Java classes, usually packaged into a Java Archive (JAR) file, and a descriptor file that tells the JSP container how those classes map to individual tags in a JSP page. The Java classes are written against the appropriate J2EE tag APIs and are the code that is invoked when those custom tags are processed by the JSP container.

The container maps tags encountered in a JSP to specific classes by means of a Tag Library Descriptor (TLD) file. The TLD contains information about the tag syntax, such as which attributes are allowed, and the names of the implementing Java classes.

The TLD for a custom tag library (there may be multiple tag libraries in use at once) is named in the web application’s web.xml file, which is the overall descriptor for the web application.

Working backwards now, you need to add a clause like this to your project’s web.xml, so the container can find the TLD:

<taglib>
    <taglib-uri>https://marklogic.com/jsp/taglib</taglib-uri>
    <taglib-location>/WEB-INF/taglib/marklogicxquery.tld</taglib-location>
</taglib>

This defines a URI which identifies the TLD to the container when that URI is referenced in a JSP. This URI is arbitrary — it need not be a real web address — it’s only used as a unique identifier string. The second item is the path, relative to the web application, of the actual TLD descriptor file.

Place the TLD (included in the MarkLogic JSP distribution) under the WEB-INF directory at the indicated path. It can go anywhere under the web application’s root as long as it’s where the taglib-location element declares it to be.

Next, make sure the JAR file with the implementation classes is in the WEB-INF/lib directory so that the container can find them at run time.

Finally, use the custom tags in a JSP. For a JSP to use a custom tag library, it must declare that it wants to do so. This is done at the top of the JSP with a taglib declaration, like this:

<%@ taglib uri="https://marklogic.com/jsp/taglib" prefix="xq" %>

The uri attribute must match the taglib-uri value in the taglib definition in the web.xml file. The prefix can be anything you like; it defines the XML namespace prefix that will identify this tag library within the scope of the JSP page. The xq prefix will be used throughout this tutorial, but you can pick something else if you prefer.

Here is a complete JSP example:

<%@ taglib uri="https://marklogic.com/jsp/taglib" prefix="xq" %>

<xq:setDataSource host="localhost" port="8003"
   user="someuser" password="secret"/>

<html><head><title>xq:query example 1</title></head>
<body>
My favorite fruits:
<ul>
<xq:execute>
    <xq:query>
        for $i in ("apple", "pear", "orange", "guava")
          return <li>{ $i }</li>
    </xq:query>
</xq:execute>
</ul>
</body>
</html>

This JSP outputs an HTML page with a bullet list that looks like this:

My favorite fruits:

apple
pear
orange
guava

We’ll use this example through the rest of this tutorial, showing how to achieve the same result using different approaches.

The XQ Tags

Let’s take a look now at the specifics of the MarkLogic XQuery JSP tag library, which I’ll refer to from here on as the XQ tags.

Because results are returned over a network connection and may be arbitrarily large, the tags in the XQ tag library operate in either streaming or buffered mode, depending on the attributes specified in an xq:execute or xq:result tag (more on those shortly).

The result of an XQuery execution is a sequence of zero or more result items, each of which may be any of the types defined by the XQuery spec. When the result is streaming, the items must be accessed sequentially and are only available one at a time.

In buffered mode, the full result is read from the XDBC connection and stored in a named container variable. There is no further dependence on the connection, so buffered results may be retained in the container as long as needed and forwarded to other JSPs or to servlets as desired. The items in the result may also be accessed randomly and/or repeatedly.
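The difference between the two modes can be pictured in plain Java. The following is only a rough sketch: the class and method names are hypothetical and are not part of the XQ tag library or XDBC, and an Iterator of strings stands in for the items arriving over the connection.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Illustrative only: hypothetical names, not part of the XQ tag library or XDBC. */
final class ResultModes {

    /** Streaming: each item is visited once, in order, and is not kept afterwards. */
    static String renderStreaming(Iterator<String> items) {
        StringBuilder html = new StringBuilder();
        while (items.hasNext()) {
            html.append("<li>").append(items.next()).append("</li>");
        }
        return html.toString();
    }

    /** Buffered: drain the source into a list that no longer depends on the source. */
    static List<String> buffer(Iterator<String> items) {
        List<String> copy = new ArrayList<>();
        while (items.hasNext()) {
            copy.add(items.next());
        }
        return copy;
    }

    public static void main(String[] args) {
        List<String> fruits = Arrays.asList("apple", "pear", "orange", "guava");

        // Streaming consumption: one sequential pass over the items.
        System.out.println(renderStreaming(fruits.iterator()));

        // Buffered consumption: random and repeated access is possible.
        List<String> buffered = buffer(fruits.iterator());
        System.out.println(buffered.get(2));
        System.out.println(buffered.size());
    }
}
```

The streaming method needs memory for only one item at a time, while the buffered list grows with the size of the result. That is the trade-off to weigh when choosing a mode.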

Getting Connected

Let’s first look at how to establish a connection to a MarkLogic Server instance. The XQ tags use the concept of a datasource from which connections are obtained. There are two XQ tags related to datasources: xq:setDataSource and xq:unSetDataSource.

The xq:setDataSource tag is given the information needed to establish a server connection and stores an opaque object that will be used later by the xq:execute tag. If the name of a container variable to set is not provided, an internal default is used. The scope is page by default. Some examples:

<xq:setDataSource host="localhost" port="8003" user="joe" password="hush"/>
<xq:setDataSource var="myconn" host="... />
<xq:setDataSource scope="session" host="... />
<xq:setDataSource dataSource="techpubsdb" scope="request"/>
<xq:setDataSource>
    <xq:host>localhost</xq:host>
    <xq:user configParameter="cisusername"/>
</xq:setDataSource>

The first example above is the typical way of setting up an XDBC datasource with hard-coded values, using the default variable name and scope to store the datasource object. The second example gives an explicit variable name to use while the third uses the default name but stores the datasource object in session scope.

The fourth example gives the name of a pre-existing datasource (a JNDI lookup key) that should be used rather than creating a new one. Using named datasources allows the connection information to be set externally rather than in the JSP body. The drawback is that a JNDI-compliant naming service must be available at runtime.

The xq:setDataSource tag can also accept nested tags to specify the connection information, as in the fifth example. Each nested tag has the same name as the corresponding attribute. These tags take two forms: the parameter value as the body of the tag, or a configParameter attribute naming a web application parameter (defined in web.xml) whose value should be used. This is an alternate way of externalizing connection information without requiring a JNDI service.

<xq:unSetDataSource/>
<xq:unSetDataSource var="myconn" scope="session" />

The xq:unSetDataSource tag is used to clear a variable containing an XDBC datasource object. Standard tags in the JSTL can be used to do this, if you know the name of the variable. This tag is provided so that you can clear a datasource that was set to the default internal name (which is intentionally kept hidden).

If datasource objects are stored in page or request scope, it’s usually not necessary to clear them, since the variable will soon fall out of scope anyway. But a longer-lived datasource may be holding a connection pool, so it’s a good idea to clear it when you’re done if it’s in session or application scope.

Naming datasources should only be necessary if there is a chance that more than one will be active at once. In most cases you can use the default internal name, which means you won’t need to name a datasource on the xq:execute tag.

Executing Queries

Now that we know how to specify where the server is, let’s look at how to talk to it. The xq:execute tag encapsulates sending a query to the server and receiving the result. Here are a couple of simple examples:

<xq:execute var="result">
    <xq:query>"Hello World"</xq:query>
</xq:execute>

<xq:execute>
    <xq:query>"Hello World"</xq:query>
</xq:execute>

The first example, where the var attribute is provided, is buffered. When the tag finishes execution, the variable result will contain a Java object of type Result (more on that later) which holds the result of the query. No output will be generated by the xq:execute tag, so it effectively disappears from the page.

The second example streams its output (replacing the xq:execute tag in the page with the result of the query) because no result variable name was given.

If you want to do more with the query result than dump it to the output, an xq:result tag may be nested within xq:execute (it must follow xq:query). But the xq:result tag may only be used when streaming. It is an error to nest an xq:result tag if the enclosing xq:execute tag also has a var attribute (which implies buffered mode).

Using the xq:result tag allows you to decorate the result data with other markup. It’s a looping tag whose body is executed once for each item in the result. To tell the xq:result tag where the result data should go, you can use the xq:streamItem tag to emit a string representation of the current result item (there are other ways, which we’ll look at later).

That’s a lot of description. Here’s an example to show how it works:

<%@ taglib uri="https://marklogic.com/jsp/taglib" prefix="xq" %>

<xq:setDataSource host="localhost" port="8003"/>

<html><head><title>xq:query example 2</title></head>
<body>
My favorite fruits:
<ul>
<xq:execute>
    <xq:query>
        ("apple", "pear", "orange", "guava")
    </xq:query>
    <xq:result>
       <li><xq:streamItem/></li>
    </xq:result>
</xq:execute>
</ul>
</body>
</html>

This is a revision of the fruity example above. But in this one, the HTML markup tags have been removed from the XQuery code so that it now returns a simple sequence of strings. Each result item in the sequence is nested in an <li> tag in the JSP rather than mixing them into the XQuery script and having the server generate them. The resulting page is identical, but the data and presentation markup have been decoupled.

Playing Well With Others

Both the xq:execute tag and the xq:result tag accept a var attribute. This attribute names a variable to be set as a part of query processing. The XQ tag library defines two Java interfaces that are used by these tags. They are Result (set by xq:execute) and ResultItem (set by xq:result). Both are in the com.marklogic.jsptaglib.xquery.common package:

public interface Result
{
    int getSize();
    ResultItem [] getItems();
    ResultItem getItem (int index);
}

public interface ResultItem
{
    int getIndex();
    boolean isNode();
    Object getObject();
    String getString();
    org.w3c.dom.Document getW3cDom() throws XDBCException;
    org.jdom.Document getJDom() throws XDBCException;
    Reader getReader();
}
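From Java code, such as a servlet that receives a buffered result, these interfaces might be consumed along the following lines. This is a minimal sketch under stated assumptions: the two interfaces are trimmed local copies (the DOM, JDOM and Reader accessors are left out so the example compiles on its own), InMemoryResult and ResultPrinter are hypothetical helpers, and zero-based indexing for getItem is assumed rather than confirmed by the library.

```java
import java.util.Arrays;
import java.util.List;

/** Trimmed local copy of ResultItem; the DOM, JDOM and Reader accessors are omitted. */
interface ResultItem {
    boolean isNode();
    String getString();
}

/** Local copy of Result using the method names from the listing above. */
interface Result {
    int getSize();
    ResultItem[] getItems();
    ResultItem getItem(int index);
}

/** Hypothetical in-memory Result, standing in for what xq:execute would store. */
final class InMemoryResult implements Result {
    private final ResultItem[] items;

    InMemoryResult(List<String> values) {
        items = new ResultItem[values.size()];
        for (int i = 0; i < items.length; i++) {
            final String value = values.get(i);
            items[i] = new ResultItem() {
                public boolean isNode() { return false; }   // plain strings, not XML nodes
                public String getString() { return value; }
            };
        }
    }

    public int getSize() { return items.length; }
    public ResultItem[] getItems() { return items.clone(); }
    // Assumption: indexes are zero-based in this stand-in.
    public ResultItem getItem(int index) { return items[index]; }
}

final class ResultPrinter {
    /** Joins the string form of every item, using random access by index. */
    static String join(Result result, String separator) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < result.getSize(); i++) {
            if (i > 0) {
                out.append(separator);
            }
            out.append(result.getItem(i).getString());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Result result = new InMemoryResult(Arrays.asList("apple", "pear", "orange", "guava"));
        System.out.println(join(result, ", "));
    }
}
```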

Let’s look at an example of using the var attribute with xq:execute. This is “fully buffered mode”: the entire query result is read and buffered in memory before control returns to the JSP, and an instance of Result is stored in the named container variable:

<%@ taglib uri="https://marklogic.com/jsp/taglib" prefix="xq" %>
<%@ taglib uri="https://sun.com/jstl/c" prefix="c" %>

<xq:setDataSource host="localhost" port="8003"/>
<xq:execute var="result">
    <xq:query>
        ("apple", "pear", "orange", "guava")
    </xq:query>
</xq:execute>

<html><head><title>xq:query example 3</title></head>
<body>
My favorite fruits:
<ul>
    <c:forEach var="item" items="${result.items}">
        <li><c:out value="${item.string}"/></li>
    </c:forEach>
</ul>
</body>
</html>

This example makes use of the JSTL control tags to access the Result object buffering the query result. The execution of the query has also been moved to the top of the JSP so that it occurs before any other markup is evaluated. The result variable is accessed later by the c:forEach tag.

Using this processing model, the buffered result could just as easily have been stored in a request scope variable and passed on to another JSP or a servlet for formatting. Decoupling query execution from presentation markup makes it easier to follow a Model-View-Controller architecture.

The ResultItem interface provides several useful accessor methods that can be called from Java code to retrieve the data in various ways. Those accessors also constitute bean properties that can be referenced by JSP 2.0 Expression Language constructs as in the example code above.
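To make the link between accessors and bean properties concrete, here is a simplified sketch of the naming convention an expression such as ${item.string} relies on. It is an assumption-laden illustration, not how any particular container is implemented: FruitItem and BeanProperty are hypothetical names, and real containers typically use the JavaBeans introspection machinery rather than a hand-rolled lookup like this one.

```java
import java.lang.reflect.Method;

/** Hypothetical bean that mimics three of the ResultItem accessors. */
final class FruitItem {
    private final int index;
    private final String value;

    FruitItem(int index, String value) {
        this.index = index;
        this.value = value;
    }

    public int getIndex() { return index; }
    public boolean isNode() { return false; }
    public String getString() { return value; }
}

final class BeanProperty {
    /**
     * Resolves a property name to a getter by convention:
     * "string" maps to getString(), and "node" falls back to isNode().
     */
    static Object read(Object bean, String property) {
        String suffix = Character.toUpperCase(property.charAt(0)) + property.substring(1);
        try {
            Method getter;
            try {
                getter = bean.getClass().getMethod("get" + suffix);
            } catch (NoSuchMethodException e) {
                // Boolean properties conventionally use an "is" prefix.
                getter = bean.getClass().getMethod("is" + suffix);
            }
            return getter.invoke(bean);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("No readable property: " + property, e);
        }
    }

    public static void main(String[] args) {
        FruitItem item = new FruitItem(0, "guava");
        System.out.println(read(item, "string"));
        System.out.println(read(item, "node"));
    }
}
```

By the same convention, getIndex() would be reachable as the index property and isNode() as the node property.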

Semi-Buffered

When the xq:execute tag is buffered (var is specified), a nested xq:result tag may not be used. This is because the xq:result tag iterates over each item in the result stream. When xq:execute is buffered, the items have already been read and stored in a Result object instance.

However, it’s often useful to access result items through the ResultItem API from within the body of an xq:result tag. If you specify a var attribute on an xq:result tag, then on each iteration of the tag that variable will be set to a ResultItem instance that holds the content of the current result item. In this case each item in the result stream is buffered, one at a time, for the duration of that iteration, then discarded.

Here’s the fruit example using the semi-buffered idiom:

<%@ taglib uri="https://marklogic.com/jsp/taglib" prefix="xq" %>
<%@ taglib uri="https://sun.com/jstl/c" prefix="c" %>

<xq:setDataSource host="localhost" port="8003"/>

<html><head><title>xq:query example 4</title></head>
<body>
My favorite fruits:
<ul>
<xq:execute>
    <xq:query>
        ("apple", "pear", "orange", "guava")
    </xq:query>
    <xq:result var="item">
        <li><c:out value="${item.string}"/></li>
    </xq:result>
</xq:execute>
</ul>
</body>
</html>

Here you can see that the body of the xq:result tag is identical to that of the c:forEach in the previous example. Both are looping tags and both use the ResultItem interface in the same way. But because the ResultItem is short-lived when semi-buffering, we’ve had to return to the pattern of doing query execution inline rather than before the main page evaluation.
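The per-item buffering described in this section can also be pictured in plain Java. The sketch below is illustrative only and makes several assumptions: SemiBuffered, bufferOne and forEachItem are hypothetical names rather than the tag library’s implementation, and an Iterator of Reader objects stands in for the items arriving on the result stream.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

/** Illustrative only: hypothetical names, not the tag library's implementation. */
final class SemiBuffered {

    /** Reads one item fully into memory; only this item is held at a time. */
    static String bufferOne(Reader item) {
        try {
            StringBuilder text = new StringBuilder();
            char[] chunk = new char[1024];
            int n;
            while ((n = item.read(chunk)) != -1) {
                text.append(chunk, 0, n);
            }
            return text.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Buffers each streamed item in turn, hands it to the body, then lets it go. */
    static int forEachItem(Iterator<Reader> stream, Consumer<String> body) {
        int count = 0;
        while (stream.hasNext()) {
            String current = bufferOne(stream.next());
            body.accept(current);   // plays the role of the xq:result tag body
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        List<Reader> stream = new ArrayList<>();
        for (String fruit : Arrays.asList("apple", "pear", "orange", "guava")) {
            stream.add(new StringReader(fruit));
        }
        StringBuilder page = new StringBuilder();
        forEachItem(stream.iterator(),
                item -> page.append("<li>").append(item).append("</li>\n"));
        System.out.print(page);
    }
}
```

Because each buffered item is released after its iteration, peak memory use is bounded by the largest single item rather than by the whole result.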

Future Enhancements

The XQ tag library also includes a tag that we haven’t talked about: xq:variable. This tag will be used to pass parameters as XQuery external variables. As of this writing MarkLogic Server does not yet support this capability but it is planned for a future release.

There will also be support in the xq:execute tag to name a server-resident module, similar to a stored procedure, rather than passing a query script from the client to the server.

As soon as these features become available in the server, the XQ tags will be updated to make use of them.
