[XQZone General] passing sequences to xdmp:eval( )
Howard Katz
howardk at fatdog.com
Wed Dec 7 09:17:45 PST 2005
I have a series of boolean-returning query tests that are stored as strings,
and I need to be able to apply these tests against a variety of node
sequences, getting back the nodes that passed. Since the queries are
strings, I'm using xdmp:eval(). This works OK, but I have to use a slight
workaround to satisfy eval(), and I'm concerned about performance when I
scale up to run the tests against very large sequences, on the order of
hundreds of thousands of nodes.
Assume that one of the stored tests I want to run is 'starts-with( ., "ca" )'.
If I were applying this test directly against the sequence
( <a>cat</a>, <b>dog</b>, <c>catalog</c> ) (i.e., not eval'ing it), I could say:
    let $seq := ( <a>cat</a>, <b>dog</b>, <c>catalog</c> )
    return
        $seq[ starts-with( ., "ca" ) ]
and I'd get back the node sequence, ( <a>cat</a>, <c>catalog</c> ). So far
so good.
If the same query is now stored as a string and I'm using eval(), I'd
similarly like to be able to say:
    define function eval-node-test( $nodes as element()+, $test as xs:string )
        as element()*
    {
        let $query := concat( "define variable $seq as element() external ",
                              "$seq[ ", $test, " ]" )
        return
            xdmp:eval( $query, ( xs:QName("seq"), $nodes ) )
    }

    let $nodes := ( <a>cat</a>, <b>dog</b>, <c>catalog</c> )
    return
        eval-node-test( $nodes, "starts-with( ., 'ca' )" )
This won't work, however, because the $nodes argument passed in
'( xs:QName("seq"), $nodes )' can't be a sequence, only a singleton. This
means that in order to use eval(),
(1) I have to wrap the $nodes sequence in a temporary <temp-root/> wrapper,
and
(2) construct the last part of the query inside concat() as
"$seq/*[ ", $test, ... rather than "$seq[ ", $test, ...
In other words,
1) To satisfy eval(), I have to hoist all my nodes into a temporarily
constructed super-element, and
2) dereference every one of them again inside my query
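
For concreteness, here is a rough sketch of the workaround described above. It assumes the same "define" syntax and the same xdmp:eval() external-variable form as the earlier example; the function name eval-node-test-wrapped is only illustrative, and I haven't benchmarked it.

```xquery
define function eval-node-test-wrapped( $nodes as element()+, $test as xs:string )
    as element()*
{
    (: Hoist the whole sequence into a single wrapper element, so that the
       external variable is bound to a singleton. :)
    let $wrapper := <temp-root>{ $nodes }</temp-root>
    (: The query steps back down to the wrapper's children before filtering. :)
    let $query := concat( "define variable $seq as element() external ",
                          "$seq/*[ ", $test, " ]" )
    return
        xdmp:eval( $query, ( xs:QName("seq"), $wrapper ) )
}

let $nodes := ( <a>cat</a>, <b>dog</b>, <c>catalog</c> )
return
    eval-node-test-wrapped( $nodes, "starts-with( ., 'ca' )" )
```

Note that the element constructor copies the nodes it encloses, so what comes back are copies of the matching nodes rather than the originals, and that copying is part of the cost I'm asking about.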
Since I potentially need to be able to run these tests against hundreds of
thousands of nodes, I'm concerned about performance. Is that concern
justified? And if it is, is there a more efficient way of doing this?
Howard