[MarkLogic Dev General] sorting efficiency question

Jason Hunter jhunter at marklogic.com
Mon Dec 18 12:23:53 PST 2006


Charles Blair wrote:
> hello all. i need to do a 3-level sort, somewhat as follows:
> 
> let $docs := for $i in fn:distinct-values($results//titleName),
>                   $j in fn:distinct-values($results//titleTypeName),
>                    $k in fn:distinct-values($results//titlePhotoSequenceNumber)
>              let $p := $results[//titleName = $i and
>                                 //titleTypeName = $j and
>                                 //titlePhotoSequenceNumber = $k]
>              order by $i, $j, $k
>              return $p
> 
> the above works, and is easy enough to write, but it's not terribly
> speedy at returning results. does someone have, or can someone
> suggest, a better approach? 


Hi Charles,

I suspect the slowness comes less from the sorting than from all the 
distinct-values() calls and from the fact that you're not using any 
searchable XPath expressions.  I expect that by using lexicons, a new 
feature in 3.1, and some searchable XPaths you could gain a terrific 
speed boost.

In more detail, I'd use lexicons to quickly get the distinct values of 
each title element, then use searchable XPaths to get the matching 
results for each combination in the cartesian product.  Or you may be 
able to pass a cts:query to each lexicon call, letting you loop over 
the titleName values and limit the titleTypeName values returned to 
those that occur with each titleName, but this is only possible if your 
schema is such that there's just one of each of these elements per 
fragment.  The looping approach will be better if the numbers of 
distinct names, type names, and photo sequence numbers are large, since 
the cartesian product would then have a*b*c combinations, which could 
be excessive.
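
As a rough, untested sketch of the looping approach, something like the 
following might work.  It assumes string-typed element range indexes on 
titleName, titleTypeName, and titlePhotoSequenceNumber (the lexicon 
calls need them), one of each element per fragment, and that the 
documents to search can be reached through fn:doc(); adjust all of that 
to your own setup.

```xquery
(: Sketch only.  Assumes string range indexes on all three elements
   and exactly one of each element per fragment. :)
for $name in cts:element-values(xs:QName("titleName"))
let $name-q := cts:element-value-query(xs:QName("titleName"), $name)

(: Only the type names that co-occur with this titleName. :)
for $type in cts:element-values(xs:QName("titleTypeName"), (), (), $name-q)
let $type-q := cts:and-query((
    $name-q,
    cts:element-value-query(xs:QName("titleTypeName"), $type)))

(: Only the sequence numbers that co-occur with this name and type. :)
for $seq in cts:element-values(xs:QName("titlePhotoSequenceNumber"),
                               (), (), $type-q)
return
  (: fn:doc() is an assumption; use whatever expression selects
     your documents. :)
  cts:search(fn:doc(), cts:and-query((
    $type-q,
    cts:element-value-query(xs:QName("titlePhotoSequenceNumber"), $seq))))
```

The lexicon calls return their values in ascending order by default, so 
the nested loops should already produce results ordered by name, then 
type, then sequence number without an explicit order by.  With string 
indexes the sequence numbers sort as strings ("10" before "2"), so a 
numeric index may suit that element better.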

Feel free to contact us for some consulting if you need help with any of 
this.

-jh-



