[MarkLogic Dev General] Query Speed

Jonathan Cook - FM&T Jonathan.Cook2 at bbc.co.uk
Sat Aug 11 01:55:10 PDT 2012


I'd also suggest following this tutorial on how to fine-tune your queries:
http://joncook.github.com/blog/2012/02/12/evaluating-mark-logic-xquery-performance/

It also has some useful links to MarkLogic resources.

Jon


-----Original Message-----
From: general-bounces at developer.marklogic.com on behalf of Michael Blakeley
Sent: Fri 10/08/2012 22:13
To: Irvine, Joseph [USA]
Cc: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Query Speed

The distinct-values function doesn't use indexes, and the predicate in your averageFrequency expression may not do what you want it to do. For future reference, you can search the list archives. For example, http://marklogic.markmail.org/search/?q=distinct-values might have been illuminating, and may still be useful for background.

Anyway, instead of distinct-values and avg, use these functions, which use range indexes:

http://docs.marklogic.com/5.0doc/docapp.xqy#display.xqy?fname=http://pubs/5.0doc/apidoc/SearchBuiltins.xml&category=SearchBuiltins&function=cts:element-values

http://docs.marklogic.com/5.0doc/docapp.xqy#display.xqy?fname=http://pubs/5.0doc/apidoc/SearchBuiltins.xml&category=SearchBuiltins&function=cts:frequency

-- Mike
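
A minimal sketch of how those two functions might be combined for the fruit example. It assumes element range indexes exist on fruitName (string, default collation) and daysToMarket (a numeric type such as int); neither index is confirmed in the thread. The frequency-weighted average is one possible reading of the advice above, not necessarily what Mike had in mind. With the default fragment-frequency behaviour, cts:frequency counts the fragments that contain each value, so this also assumes one daysToMarket per document.

```xquery
xquery version "1.0-ml";

(: Assumption: element range indexes on fruitName (string) and
   daysToMarket (numeric). Without them these calls raise an error. :)
for $fruit in cts:element-values(xs:QName("fruitName"))
(: Distinct daysToMarket values, limited to documents for this fruit. :)
let $days :=
  cts:element-values(
    xs:QName("daysToMarket"), (), (),
    cts:element-range-query(xs:QName("fruitName"), "=", $fruit))
(: Each distinct value is returned once, so weight it by its frequency
   to recover the average over all matching records. :)
let $count := sum(for $d in $days return cts:frequency($d))
let $total := sum(for $d in $days return $d * cts:frequency($d))
return
  <item>
    <fruitName>{$fruit}</fruitName>
    <averageFrequency>{
      (: Guard against a fruit with no daysToMarket values. :)
      if ($count gt 0) then $total div $count else ()
    }</averageFrequency>
  </item>
```

Both lexicon calls are resolved from the range indexes, so the documents themselves should not need to be read from disk.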

On 10 Aug 2012, at 13:48 , Irvine, Joseph [USA] wrote:

> Hello,
>
> The following query, for example, with roughly 6500 records took 55.34 seconds according to the MarkLogic performance monitor.
>
> I changed the names and scenario, but kept the exact same template. Assume that I have 6500 records. Each records the name of a fruit and how many days it took to get from the field to the supermarket. Fruits repeat across records: Bananas show up in one record as taking 5 days to get to Destination A, and in another as taking 12 days to get to Destination B. I want a list of all of the fruits that appear in the 6500 records, along with the average time each fruit took. We might find, for example, that Bananas reach their destinations more quickly because they are air-freighted, while Watermelons take longer because they are shipped by ground. That is basically what we are looking for.
>
> QUERY:
>
> for $d in distinct-values(/Data/fruitName)
> return
> <item>
>       <fruitName>{$d}</fruitName>
>       <averageFrequency>{avg(/qdaData/daysToMarket[/qdaData/fruitName = $d])}</averageFrequency>
> </item>
>
> That seems highly excessive to me. Each record is composed of just 30 XML fields. All 6500 records together take up less than 30 MB of disk space. The server I am on has 24 GB of RAM. I have all of the fields above set up as indexed database elements within MarkLogic. Pretend the following is a rough version of one of the XML files:
>
> <Data>
>       <fruitName>Banana</fruitName>
>       <daysToMarket>5</daysToMarket>
>       <finalDestination>Phoenix</finalDestination>
>       <origin>Argentina</origin>
>       <weight>5.5</weight>
>       ... etc
> </Data>
>
> Thanks,
> Joseph Irvine
>
>
> -----Original Message-----
> From: general-bounces at developer.marklogic.com [mailto:general-bounces at developer.marklogic.com] On Behalf Of Michael Blakeley
> Sent: Friday, August 10, 2012 1:13 PM
> To: MarkLogic Developer Discussion
> Subject: [External] Re: [MarkLogic Dev General] Query Speed
>
> It sounds likely that the .NET queries are not quite the same as the AB queries. If you can identify a slow query, post it to the list and you may get some useful tuning advice.
>
> -- Mike
>
> On 10 Aug 2012, at 09:52 , Irvine, Joseph [USA] wrote:
>
>> Hello,
>>
>> I have a basic .NET client that connects to a MarkLogic database with XCC.
>>
>> I built it around a database that had 1000 XML documents ingested at the time. The queries were quick, almost instantaneous, and the program ran smoothly as a result. I have since increased the size of the database to 20,000 XML documents. While the forms I have through the Application Builder still respond incredibly quickly, the .NET queries are incredibly slow, some taking several minutes to run, as opposed to the near-instantaneous speed I had before.
>>
>> Is there some way to increase the speed for external queries?
>>
>> Thank you,
>> Joseph Irvine
>> _______________________________________________
>> General mailing list
>> General at developer.marklogic.com
>> http://developer.marklogic.com/mailman/listinfo/general
>
