[MarkLogic Dev General] Performance of fn:exists(fn:doc($uri))

Ryan Dew ryan.j.dew at gmail.com
Thu Aug 30 06:47:09 PDT 2012


It seems like it would be better to focus on using something more likely to be unique than xdmp:random rather than focusing on read locks. 

http://maxdewpoint.blogspot.com/2012/08/generate-unique-ids-for-collision.html

-Ryan Dew 

On Aug 29, 2012, at 11:25 PM, Geert Josten <geert.josten at dayon.nl> wrote:

> Hi Mike,
> 
> Not quite sure, but the conflict occurs when the uri doesn't exist yet, so
> there would be nothing to lock. Does that still create a read-lock?
> 
> And in case the uri does exist, wouldn't this create potentially a lot of
> unnecessary read-locks (in case it takes a lot of attempts to find an
> unused uri)?
> 
> Kind regards,
> Geert
> 
> -----Original message-----
> From: general-bounces at developer.marklogic.com
> [mailto:general-bounces at developer.marklogic.com] On Behalf Of Michael Blakeley
> Sent: Wednesday, August 29, 2012 21:35
> To: MarkLogic Developer Discussion
> Subject: Re: [MarkLogic Dev General] Performance of
> fn:exists(fn:doc($uri))
> 
> No, you can't do that safely because cts:uri-match won't take a
> read-lock. You are opening yourself up to a race condition. And in some
> circumstances it will be slower than the recommended technique. There
> seems to be a popular idea that cts:uri-match() is always fastest, but
> that is not always true.
> 
> The recommended technique is probably the fastest way to guarantee a new,
> unique URI. If you are going through the process of inserting a new
> document, this technique adds very little extra work. The document-insert
> itself always has to look for an existing document, because it might be
> replacing an existing document or it might be inserting a new document. It
> always has to write-lock the URI. So the extra exists() call merely
> repeats the URI lookup, which is cheap because it will be cached for the
> xdmp:document-insert call, and also gets a read-lock before
> xdmp:document-insert gets the write lock. In the vanishingly rare event
> that xdmp:random() produces an existing URI, this extra work is repeated -
> but is still quite cheap.
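> 
> A minimal sketch of the technique described above, assuming a main
> module in which the URI lookup and the insert run in the same update
> statement. The local: prefix and the sample document body are
> illustrative assumptions; the helper otherwise mirrors the choose-uri()
> function from the original question.
> 
> ```xquery
> xquery version "1.0-ml";
> 
> (: Hypothetical helper: same logic as choose-uri() from the original
>    question, declared in the local: namespace for use in a main module. :)
> declare function local:choose-uri() as xs:string
> {
>   let $uri := fn:concat("/document-", xdmp:random(), ".xml")
>   return
>     (: Per the explanation above, this lookup read-locks the URI
>        when it runs inside an update transaction. :)
>     if (fn:exists(fn:doc($uri))) then local:choose-uri() else $uri
> };
> 
> (: The lookup and the insert share one statement, hence one transaction,
>    so the insert can upgrade the read lock to a write lock. :)
> let $uri := local:choose-uri()
> return (
>   (: Illustrative document body, not taken from the thread. :)
>   xdmp:document-insert($uri, <doc><created>{fn:current-dateTime()}</created></doc>),
>   $uri
> )
> ```
> 
> If choose-uri() ran in a separate transaction from the insert, the lock
> would be released in between, so this sketch only makes sense when both
> calls stay in the same request.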
> 
> -- Mike
> 
> On 29 Aug 2012, at 12:29 , William Merritt Sawyer wrote:
> 
>> If you have the uri-lexicon turned on you can use
> cts:uri-match(fn:concat("/document-", xdmp:random(), ".xml"))
>> 
>> From: general-bounces at developer.marklogic.com
> [mailto:general-bounces at developer.marklogic.com] On Behalf Of Danny Sinang
>> Sent: Wednesday, August 29, 2012 12:33 PM
>> To: MarkLogic Developer Discussion
>> Subject: Re: [MarkLogic Dev General] Performance of
> fn:exists(fn:doc($uri))
>> 
>> Thanks Geert.
>> 
>> I did try fn:exists(fn:doc($uri))  on CQ before your response came in
> and found it to be fast.
>> 
>> The locking / prevention of duplicate id's is discussed in
> http://markmail.org/message/mm5vtacpdzwfy44j  .
>> 
>> Regards,
>> Danny
>> 
>> On Wed, Aug 29, 2012 at 2:23 PM, Geert Josten <geert.josten at dayon.nl>
> wrote:
>> Hi Danny,
>> 
>> Performance should be easy to measure. Call the function from within
> QConsole x number of times and request profile output. Do the same while
> using xdmp:exists instead of fn:exists. That function works only on
> (partially) searchable expressions, because it doesn't retrieve the actual
> content. It won't create a read-lock either, but I'm not sure why you want
> one. It won't prevent duplicate IDs from being generated in concurrent
> requests.
>> 
>> Kind regards,
>> Geert
>> 
>> From: general-bounces at developer.marklogic.com
> [mailto:general-bounces at developer.marklogic.com] On Behalf Of Danny Sinang
>> Sent: Wednesday, August 29, 2012 19:11
>> To: general
>> Subject: [MarkLogic Dev General] Performance of
> fn:exists(fn:doc($uri))
>> 
>> Hi,
>> 
>> ML support suggested we do this to generate a unique ID for our
> documents :
>> 
>> declare function choose-uri() as xs:string
>>    {
>>       let $uri := fn:concat("/document-", xdmp:random(), ".xml")
>>       return if (fn:exists(fn:doc($uri))) then choose-uri() else $uri
>>    };
>> 
>> My question is, will the call to fn:exists(fn:doc($uri)) be fast,
> considering that we now have 8 million documents?
>> 
>> The fn:exists(fn:doc($uri)) call is needed to obtain a read lock, which
> will be upgraded to a write lock when xdmp:document-insert is called.
>> 
>> Regards,
>> Danny
>> 
>> 
>> 
>> _______________________________________________
>> General mailing list
>> General at developer.marklogic.com
>> http://developer.marklogic.com/mailman/listinfo/general
>> 
>> 
