[MarkLogic Dev General] results returning faster when using string instead of integer in predicate

semerau at hotmail.com semerau at hotmail.com
Fri Oct 21 12:28:33 PDT 2011


That's similar to my understanding too, yet in my tests the performance was the same except for the three-level compound predicate. Maybe that's just my particular data and test, and does not reflect other cases.

From: wthompson at jonesmcclure.com
To: general at developer.marklogic.com
Date: Fri, 21 Oct 2011 19:07:57 +0000
Subject: Re: [MarkLogic Dev General] results returning faster when using string instead of integer in predicate

My understanding is that it’s the compound predicate that we’re supposed to avoid, equating to something like another join in the logic.

 
-W
 


From: general-bounces at developer.marklogic.com [mailto:general-bounces at developer.marklogic.com]
On Behalf Of semerau at hotmail.com

Sent: Thursday, October 20, 2011 10:33 AM

To: general at developer.marklogic.com

Subject: Re: [MarkLogic Dev General] results returning faster when using string instead of integer in predicate


 

Ok that makes sense.



Also, slightly different, but interesting to me, is the speed results from compound predicates:





/image[sizes[size/@w = "98"]]



~ PT0.016S





/image[sizes[size[@w = "98"]]]



~ PT0.031S





Either having a third-level compound predicate or having an attribute as the first step in the predicate makes the difference in this case. I'm not sure which.





> From: mike at blakeley.com

> Date: Thu, 20 Oct 2011 10:27:40 -0700

> To: general at developer.marklogic.com

> Subject: Re: [MarkLogic Dev General] results returning faster when using string instead of integer in predicate

> 

> Consider this:

> 

> attribute w { "123.0" } = 123

> => true

> 

> If the RHS 123 were converted to string, this would return false because "123.0" != "123". So 123 isn't being converted to string. Rather, every @w must atomize to a numeric type. That seems to be a little more expensive than atomization to string.
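
The comparison rule described above can be sketched outside MarkLogic. The following is only a rough Python emulation of how an XPath 2.0 general comparison treats untyped attribute values, assuming a sample document in which one @w is written as "123.0" for illustration. The helper name general_eq is hypothetical, and nothing here reflects MarkLogic's actual implementation or its timings.

```python
import xml.etree.ElementTree as ET

# Sample document; the "123.0" value is an assumption chosen to show the difference.
DOC = """<thing>
  <sizes>
    <size w="123.0"/>
    <size w="456"/>
  </sizes>
</thing>"""


def general_eq(untyped_values, literal):
    """Emulate XPath 2.0 `=` between untyped attribute values and a literal.

    Numeric literal: each untyped value is cast to a double before comparing.
    String literal: each untyped value is compared as a string.
    The comparison is existential: true if any value matches.
    """
    if isinstance(literal, (int, float)):
        return any(float(v) == float(literal) for v in untyped_values)
    return any(v == literal for v in untyped_values)


root = ET.fromstring(DOC)
# Rough equivalent of the path sizes/size/@w relative to <thing>.
widths = [size.get("w") for size in root.findall("./sizes/size")]

numeric_match = general_eq(widths, 123)    # "123.0" is cast to 123.0, which equals 123
string_match = general_eq(widths, "123")   # "123.0" and "123" differ as strings
print(numeric_match, string_match)
```

The numeric branch has to parse every attribute value before it can compare, while the string branch compares the stored text directly, which is consistent with the numeric predicate doing slightly more work per value.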

> 

> -- Mike

> 

> On 20 Oct 2011, at 10:21 , semerau at hotmail.com wrote:

> 

> > Given a few thousand XML files in the DB that look like this:

> > 

> > <thing>

> > <sizes>

> > <size w="123"/>

> > <size w="456"/>

> > </sizes>

> > </thing>

> > 

> > with no schema being used, I get different speed results depending on whether I use a string or integer value in the predicate:

> > 

> > /thing[sizes/size/@w = 123]

> > 

> > ~ PT0.031S (average)

> > 

> > 

> > /thing[sizes/size/@w = "123"]

> > 

> > ~ PT0.016S (average)

> > 

> > 

> > Is this because the values in the XML file are stored as strings (codepoint values), so when the predicate uses an integer it must be converted into a string? Or is there something else going on?

> > 

> > thanks,

> > Ryan 

> > _______________________________________________

> > General mailing list

> > General at developer.marklogic.com

> > http://developer.marklogic.com/mailman/listinfo/general

> 
