Punctuation in XPath, part 3: "@" and ".."

by Evan Lenz

We've seen what "." means (part 1) and what "/" means (part 2). And we've seen that we can use any expression (parenthesizing it as necessary) as a step in a path expression. In this third installment on XPath punctuation, we'll learn about the most common type of step in a path expression, the "axis step", and how it can be abbreviated.

All path expression steps can be categorized into two kinds:

  • filter expressions, and
  • axis steps.

A filter expression is a primary expression (variable reference, literal, context item expression, function call, or parenthesized expression) followed by zero or more predicates. An axis step is a step that selects nodes along a specific axis (also followed by zero or more predicates). This is best learned by example.

Here's an example document to work from, using XQuery to bind it to a variable:

declare variable $doc := document {
  <doc>
    <items>
      <meta id="firstGroup"/>
      <item>1</item>
      <item>2</item>
    </items>
    <items>
      <meta id="secondGroup"/>
      <item>3</item>
      <item>4</item>
    </items>
  </doc>
};
 
declare variable $items := $doc/doc/items/item;

Consider the following four-step expression. This selects the "id" attribute of each <meta> child of the parent of the first node in $items:

$items[1]/../meta/@id

The first step, "$items[1]", is a filter expression. The remaining three steps are all axis steps. In fact, all three make use of some XPath syntactic sugar. The following expanded expression is what the above expression is short for:

$items[1]/parent::node()/child::meta/attribute::id

We see three different axes in use here: "parent", "child", and "attribute". It's never necessary to use "attribute::" or "child::" explicitly, since "@" is always short for "attribute::" and since "child::" is the default when you omit the axis specifier. (The one exception is the lesser-used attribute() and schema-attribute() node tests, for which "attribute::" is the default; either way, you can always use the short form.) Most often, you won't explicitly use "parent::" either. However, it's sometimes useful to test the parent's name or node type, using an expression like "parent::foo" or "parent::document-node()".
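If you'd like to check this for yourself, here is a small sketch that compares the short and long forms and then tries both kinds of parent test. It repeats the sample document (compacted onto fewer lines) so that it can run on its own; everything else comes straight from the expressions above.

```xquery
(: Same sample document as above, compacted so the query runs on its own. :)
declare variable $doc := document {
  <doc>
    <items><meta id="firstGroup"/><item>1</item><item>2</item></items>
    <items><meta id="secondGroup"/><item>3</item><item>4</item></items>
  </doc>
};

declare variable $items := $doc/doc/items/item;

(
  (: the short and long forms select the very same attribute node :)
  $items[1]/../meta/@id is $items[1]/parent::node()/child::meta/attribute::id,

  (: the value of the selected attribute :)
  string($items[1]/../meta/@id),

  (: a name test on the parent axis: the parent is an <items> element :)
  exists($items[1]/parent::items),

  (: a kind test on the parent axis: the parent is not a document node :)
  exists($items[1]/parent::document-node())
)
```

This should return true, "firstGroup", true, and false, in that order. The "is" operator compares node identity, so the first result tells us the two paths arrive at the same node rather than merely at equal values.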

Short detour: axis step evaluation

There are in fact 13 XPath axes altogether, and you can use any of them in their expanded form:

  • self::
  • child::
  • attribute::
  • namespace:: ("deprecated" in XPath 2.0 and disallowed in XQuery)
  • descendant::
  • descendant-or-self::
  • following::
  • following-sibling::
  • parent::
  • ancestor::
  • ancestor-or-self::
  • preceding::
  • preceding-sibling::
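As a small taste only, here is a sketch that tries a handful of these axes against the sample document, using the second <item> as the context node. The $ctx variable is just a name I've picked for illustration, and the document is repeated in compact form so the query stands alone.

```xquery
declare variable $doc := document {
  <doc>
    <items><meta id="firstGroup"/><item>1</item><item>2</item></items>
    <items><meta id="secondGroup"/><item>3</item><item>4</item></items>
  </doc>
};

declare variable $items := $doc/doc/items/item;

(: context for every step below: the second <item>, whose value is 2 :)
let $ctx := $items[2]
return (
  string-join($ctx/preceding-sibling::*/name(), " "),  (: "meta item" :)
  string-join($ctx/following::item/string(), " "),     (: "3 4" :)
  string-join($ctx/ancestor::*/name(), " "),           (: "doc items" :)
  string-join($ctx/ancestor-or-self::*/name(), " "),   (: "doc items item" :)
  string-join($ctx/self::item/string(), " ")           (: "2" :)
)
```

Note that the names come back in document order even for the reverse axes ("ancestor" and "preceding-sibling"), because the result of an axis step is always delivered in document order.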

I won't go into what sets of nodes each axis selects (you might try experimenting with this handy online XPath axes visualizer to learn some by example), but suffice it to say that the evaluation of an axis step conceptually occurs in three stages, corresponding to the three parts of an axis step:

  1. Axis specifier (one of the above 13 axes or an abbreviation)
  2. Node test (either a name or kind test, such as foo, *, or node())
  3. Predicate (zero or more of them)

The three parts in the axis step of the following expression are "descendant::", "item", and "[4]":

$doc/descendant::item[4]

$doc provides the context node, and from there, the axis step is evaluated (at least conceptually) as follows:

  1. Get all nodes along the axis (descendant)
  2. Of those, filter out the ones not matching the node test (item)
  3. Of those, filter out the ones not matching the predicate ([4])

Using XQuery, you could simulate this conceptual process by explicitly breaking the evaluation down into three stages of filtering. The following expression is a long-winded way of selecting the same nodes, just to make my point:

let $matches-axis      := $doc/descendant::node(),  (: everything on axis :)
    $matches-node-test := $matches-axis/self::item, (: filtered :)
    $matches-predicate := $matches-node-test[4]     (: filtered again :)
return $matches-predicate
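To confirm that the staged version really does select the same node, here is a sketch that compares it with the one-step form. The variable names are my own, and the sample document is repeated in compact form so the query runs by itself. It also includes a tempting but different one-liner, to show why the node-test stage needs to finish before the predicate is applied.

```xquery
declare variable $doc := document {
  <doc>
    <items><meta id="firstGroup"/><item>1</item><item>2</item></items>
    <items><meta id="secondGroup"/><item>3</item><item>4</item></items>
  </doc>
};

(: the ordinary axis step :)
let $short := $doc/descendant::item[4]

(: the staged version, folded into one expression with parentheses :)
let $staged := ($doc/descendant::node()/self::item)[4]

(: NOT equivalent: here [4] is applied separately for each descendant,
   and the self axis never holds more than one node :)
let $per-node := $doc/descendant::node()/self::item[4]

return (
  $short is $staged,   (: true :)
  string($short),      (: "4" :)
  count($per-node)     (: 0 :)
)
```

The parentheses in $staged (or the separate variables in the longer version) are what let the predicate count positions across the whole filtered sequence rather than within each individual "self::" step.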

Summary

Now that you have a general idea of what an axis step is, here are all the axis-related syntax shortcuts in XPath:

This:             is short for this:              [notes or exceptions]

@                 attribute::
..                parent::node()
foo               child::foo                      where "foo" is any node test except attribute(...) or schema-attribute(...)
attribute(foo)    attribute::attribute(foo)       where "attribute(foo)" is any attribute(...) or schema-attribute(...) node test
//                /descendant-or-self::node()/
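Each of these shortcuts can be checked against the sample document. The sketch below pairs every short form with its expansion and tests whether both select exactly the same nodes. The local:same-nodes function is a hypothetical helper written for this illustration, not something built in, and the document is repeated in compact form so the query is self-contained.

```xquery
declare variable $doc := document {
  <doc>
    <items><meta id="firstGroup"/><item>1</item><item>2</item></items>
    <items><meta id="secondGroup"/><item>3</item><item>4</item></items>
  </doc>
};

(: hypothetical helper: true if both sequences hold exactly the same nodes,
   compared by identity rather than by value :)
declare function local:same-nodes($a as node()*, $b as node()*) as xs:boolean {
  empty($a except $b) and empty($b except $a)
};

let $meta := $doc/doc/items[1]/meta
return (
  local:same-nodes($meta/@id,           $meta/attribute::id),
  local:same-nodes($meta/..,            $meta/parent::node()),
  local:same-nodes($doc/doc/items,      $doc/child::doc/child::items),
  local:same-nodes($meta/attribute(id), $meta/attribute::attribute(id)),
  local:same-nodes($doc//item,          $doc/descendant-or-self::node()/item)
)
```

All five comparisons should come back true.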

That last one ("//") has its own quirks and gotchas. I'll be covering it in another article in this series, so stay tuned.
