[MarkLogic Dev General] using a collation for sorting volume numbers

Alan Darnell alan.darnell at utoronto.ca
Mon Jun 4 15:59:05 PDT 2007


James,

You're right -- the data is mixed. For example, an issue might be s2 or
pt2, and a volume might be a roman numeral. The cast fails on these
values, but the collation-based sort seems to treat them as text and
keeps going.
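
One possible way to keep the cast for the values that are numeric, while still letting strings like pt2 or a roman numeral through, is to test each value with "castable as" before casting. This is only a rough sketch in standard XQuery 1.0: the sample <articles> data is made up for illustration, and I have not measured how it performs against the collation-based sort.

```xquery
xquery version "1.0";

(: Hypothetical sample data standing in for the real documents. :)
let $articles :=
  <articles>
    <article><volume>10</volume></article>
    <article><volume>2</volume></article>
    <article><volume>IV</volume></article>
    <article><volume>1</volume></article>
  </articles>
for $a in $articles/article
let $v := fn:normalize-space($a/volume)
let $numeric := $v castable as xs:decimal
order by
  (: Key 1: put numeric values in one group, everything else after them. :)
  if ($numeric) then 0 else 1,
  (: Key 2: within the numeric group, sort by numeric value. :)
  if ($numeric) then xs:decimal($v) else (),
  (: Key 3: within the non-numeric group, sort as plain text. :)
  if ($numeric) then () else $v
return $v
```

With the sample data this should return the numeric volumes in numeric order (1, 2, 10) followed by IV, instead of raising an error on the value that cannot be cast.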

Alan


On 4-Jun-07, at 6:52 PM, James A. Robinson wrote:

>> I have a set of documents that have volume and issue numbers
>> represented as strings (e.g. 1, 10, 2, 23).  If I use the following
>> collation, I get a nice numeric sort without having to pad or
>> otherwise normalize the data, but performance suffers.  Is there an
>> index I can add that would speed up the kind of sorting that this
>> collation needs to do?
>>
>> for $i in cts:search(doc()//article-title,"water")
>> order by $i/root()//volume collation "http://marklogic.com/collation/en/MO"
>> return
>> $i/root()//volume
>
> I was curious about the use of a collation -- are your values actually
> numeric values? If so, I'd have assumed a way to do this sort of
> sorting would be something like
>
>   for $d in $set
>   order by xs:decimal($d/volume), xs:decimal($d/issue)
>   return ...
>
> and so forth.  Is this naive?  I understand why it wouldn't work for
> things like '10a' or something, but for pure numbers, even if they are
> padded with leading zeros, I had assumed a cast would work.
>
> Jim
>
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> James A. Robinson                       jim.robinson at stanford.edu
> Stanford University HighWire Press      http://highwire.stanford.edu/
> +1 650 7237294 (Work)                   +1 650 7259335 (Fax)


