[MarkLogic Dev General] Performance of fn:exists(fn:doc($uri))

Michael Blakeley mike at blakeley.com
Wed Aug 29 12:34:40 PDT 2012


No, you can't do that safely because cts:uri-match won't take a read-lock. You are opening yourself up to a race condition. And in some circumstances it will be slower than the recommended technique. There seems to be a popular idea that cts:uri-match() is always fastest, but that is not always true.

The recommended technique is probably the fastest way to guarantee a new, unique URI. If you are going through the process of inserting a new document, this technique adds very little extra work. The document-insert itself always has to look for an existing document, because it might be replacing an existing document or it might be inserting a new document. It always has to write-lock the URI. So the extra exists() call merely repeats the URI lookup, which is cheap because it will be cached for the xdmp:document-insert call, and also gets a read-lock before xdmp:document-insert gets the write lock. In the vanishingly rare event that xdmp:random() produces an existing URI, this extra work is repeated - but is still quite cheap.
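
As a rough sketch of how the pieces fit together (assuming the function is declared in the local: namespace of a main module, and with <doc/> standing in as a placeholder for the real document), the lookup and the insert run in the same update transaction:

```xquery
xquery version "1.0-ml";

(: Sketch only: local:choose-uri and the <doc/> payload are illustrative names. :)
declare function local:choose-uri() as xs:string
{
  let $uri := fn:concat("/document-", xdmp:random(), ".xml")
  return
    (: In an update transaction, fn:doc() takes a read lock on $uri. :)
    if (fn:exists(fn:doc($uri))) then local:choose-uri() else $uri
};

(: Same transaction: the read lock on the chosen URI is upgraded to a write lock. :)
xdmp:document-insert(local:choose-uri(), <doc/>)
```

Because the module calls xdmp:document-insert, it is evaluated as an update, so the fn:doc() lookup holds its read lock until the transaction commits.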

-- Mike

On 29 Aug 2012, at 12:29, William Merritt Sawyer wrote:

> If you have the uri-lexicon turned on you can use cts:uri-match(fn:concat("/document-", xdmp:random(), ".xml"))
>  
> From: general-bounces at developer.marklogic.com [mailto:general-bounces at developer.marklogic.com] On Behalf Of Danny Sinang
> Sent: Wednesday, August 29, 2012 12:33 PM
> To: MarkLogic Developer Discussion
> Subject: Re: [MarkLogic Dev General] Performance of fn:exists(fn:doc($uri))
>  
> Thanks Geert.
>  
> I tried fn:exists(fn:doc($uri)) in CQ before your response came in and found it to be fast.
>  
> The locking / prevention of duplicate IDs is discussed in http://markmail.org/message/mm5vtacpdzwfy44j .
>  
> Regards,
> Danny
> 
> On Wed, Aug 29, 2012 at 2:23 PM, Geert Josten <geert.josten at dayon.nl> wrote:
> Hi Danny,
>  
> Performance should be easy to measure. Call the function from within QConsole a number of times and request profile output. Do the same while using xdmp:exists instead of fn:exists. That function works only on (partially) searchable expressions, because it doesn’t retrieve the actual content. It won’t create a read-lock either, but I’m not sure why you want one. It won’t prevent duplicate IDs from being generated in concurrent requests.
>  
> Kind regards,
> Geert
>  
> From: general-bounces at developer.marklogic.com [mailto:general-bounces at developer.marklogic.com] On Behalf Of Danny Sinang
> Sent: Wednesday, August 29, 2012 19:11
> To: general
> Subject: [MarkLogic Dev General] Performance of fn:exists(fn:doc($uri))
>  
> Hi,
>  
> ML support suggested we do this to generate a unique ID for our documents:
>  
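> (: A user-defined function in a main module needs a non-reserved namespace; local: is used here. :)
> declare function local:choose-uri() as xs:string
>     {
>        let $uri := fn:concat("/document-", xdmp:random(), ".xml")
>        (: Retry with a fresh random number if a document already exists at $uri. :)
>        return if (fn:exists(fn:doc($uri))) then local:choose-uri() else $uri
>     };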
>  
> My question is, will the call to fn:exists(fn:doc($uri)) be fast, considering that we now have 8 million documents?
>  
> The fn:exists(fn:doc($uri)) call is needed to obtain a read lock, which will be upgraded to a write lock when xdmp:document-insert is called.
>  
> Regards,
> Danny
>  
>  
> 
> _______________________________________________
> General mailing list
> General at developer.marklogic.com
> http://developer.marklogic.com/mailman/listinfo/general
> 


