[MarkLogic Dev General] Sorting on aggregation in ML4.2-7

Danny Sokolsky Danny.Sokolsky at marklogic.com
Thu May 23 16:14:21 PDT 2013


Is there always one <value> per metadata?  If so, you might be able to fragment on metadata, put a range index on value, and then do something like this:

cts:sum-aggregate(cts:element-reference(xs:QName("value")), (),
  cts:element-word-query(xs:QName("country"), "usa"))

This should give you the sum of all values in fragments matching that cts:query.
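
As far as I know, cts:sum-aggregate and cts:element-reference were added in releases after 4.2, so on 4.2-7 you may need a different route to the same number. A rough sketch, assuming the same setup (fragment on metadata, a numeric range index such as int on value) and assuming the "item-frequency" option is accepted by your release: read the distinct values from the range index and weight each one by how often it occurs.

xquery version "1.0-ml";
(: Sketch only: sum of <value> over fragments matching the country query,
   computed from the range index rather than from the documents.
   Assumes a numeric range index on <value>. :)
let $query := cts:element-word-query(xs:QName("country"), "usa")
return
  fn:sum(
    for $v in cts:element-values(xs:QName("value"), (), ("item-frequency"), $query)
    (: cts:frequency gives the number of occurrences of $v in the matches :)
    return $v * cts:frequency($v)
  )

Because the query is resolved unfiltered against the indexes, this only lines up per country if each metadata element is its own fragment.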

Do you want to sort them by value or by country?  Either way, if you have a range index on whichever one you want to sort on, you should be able to do that using the range index optimization that I mentioned earlier (a FLWOR expression with a positional predicate and an order by on a key that has a range index).

-Danny

From: general-bounces at developer.marklogic.com [mailto:general-bounces at developer.marklogic.com] On Behalf Of Abhishek53 S
Sent: Thursday, May 23, 2013 12:36 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Sorting on aggregation in ML4.2-7
Importance: High

Hi Danny,

You are right, I need to sort on the sum of all value items. Performance is not acceptable with a for loop or cts:search, since 1.5 million documents need to be sorted. I was planning to keep the sum of the values in a separate field, but the issue is more complex: I need to compute the sum dynamically, for example the sum of all <value/> where <country> is USA.

Can any type of document model change help here?

Thanks
Abhishek Srivastav
Tata Consultancy Services
Cell:- +91-9883389968
Mailto: abhishek53.s at tcs.com
Website: http://www.tcs.com
____________________________________________
Experience certainty. IT Services
Business Solutions
Consulting
____________________________________________

-----general-bounces at developer.marklogic.com wrote: -----
To: MarkLogic Developer Discussion <general at developer.marklogic.com>
From: Danny Sokolsky <Danny.Sokolsky at marklogic.com>
Sent by: general-bounces at developer.marklogic.com
Date: 05/23/2013 02:07PM
Subject: Re: [MarkLogic Dev General] Sorting on aggregation in ML4.2-7
Hi Abhishek,

First off, I am not totally sure what you mean by "summation value of one field", but I am guessing you mean to sum up all of the value elements under your content element.  To start, the naive way to do this is to just do a FLWOR expression with an order by, something like this:

xquery version "1.0-ml";
let $node := <content>
<id>C1</id>
<metadata>
<country>USA</country>
<value>200</value>
</metadata>
<metadata>
<country>CAN</country>
<value>400</value>
</metadata>
<metadata>
<country>AUS</country>
<value>300</value>
</metadata>
</content>
let $node2 := <content>
<id>C1</id>
<metadata>
<country>USA</country>
<value>200</value>
</metadata>
<metadata>
<country>CAN</country>
<value>400</value>
</metadata>
<metadata>
<country>AUS</country>
<value>500</value>
</metadata>
</content>

for $x in ($node, $node2)
order by fn:sum(fn:data($x//value)) descending
return $x

That might give an expanded tree cache full error (XDMP-EXPNTREECACHEFULL); I am not sure, it depends on how much memory you have.

If that is not fast enough (or fills up the cache), you can try adding another element to each document that holds the sum of its value elements, then put a numeric-typed range index on that element. The order by will then be fast if you take a subset of the results.  Something like:

(: $nodes stands for your content in the database, e.g. fn:collection()/content :)
(for $x in $nodes
order by $x/sum-of-values
return $x)[1 to 10]
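
To populate that element, something along these lines might work. This is only a sketch: sum-of-values is the made-up element name from the example above, the batch size of 1000 is arbitrary, and it assumes every content document is a root content element as in your sample.

xquery version "1.0-ml";
(: Sketch only: add a precomputed <sum-of-values> child to content
   documents that do not have one yet. Re-run until nothing is left;
   updating 1.5 million documents in a single transaction is unlikely
   to be practical. :)
for $c in (fn:collection()/content[fn:not(sum-of-values)])[1 to 1000]
return
  xdmp:node-insert-child(
    $c,
    <sum-of-values>{ fn:sum($c/metadata/value) }</sum-of-values>)

The stored sum would also have to be refreshed whenever a value in the document changes, otherwise the sort order goes stale.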

So there are a few places to start.

-Danny

From: general-bounces at developer.marklogic.com [mailto:general-bounces at developer.marklogic.com] On Behalf Of Abhishek53 S
Sent: Thursday, May 23, 2013 10:54 AM
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] Sorting on aggregation in ML4.2-7
Importance: High


Hi All,

I need to sort on the sum of one field's values in MarkLogic 4.2-7.

Below is how our content is organized. The sort needs to be on the sum of all value elements, across 1.5 million documents.

Please advise.

<content>
<id>C1</id>
<metadata>
<country>USA</country>
<value>200</value>
</metadata>
<metadata>
<country>CAN</country>
<value>400</value>
</metadata>
<metadata>
<country>AUS</country>
<value>300</value>
</metadata>
</content>


Abhishek Srivastav
