[MarkLogic Dev General] Performance of fn:exists(fn:doc($uri))

Danny Sinang d.sinang at gmail.com
Thu Aug 30 08:15:59 PDT 2012


Hi Mike,

When I call addAsset (see code below), will the read lock that happens
inside xutils:choose-uri() be upgraded to a write lock by the time
xdmp:document-insert() is called inside updateAsset()?

Regards,
Danny

declare function xutils:choose-uri() as xs:string
{
   let $uri := xutils:buildUri(xdmp:random(), "asset")
   return
          if (fn:exists(fn:doc($uri))) then
              xutils:choose-uri()
          else
              $uri
};

declare function xutils:assets-uuid() {
    let $uri := xutils:choose-uri()
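    let $assetId :=
            (: $uri is already an xs:string, so strip the directory and
               extension from it directly; fn:base-uri() expects a node :)
            fn:replace(fn:replace(
            $uri,
            "/assets/","")
            ,"\.xml","")
    return $assetId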
};

declare function addAsset($doc as element(asset), $user as xs:string) {
let $assetId := xutils:assets-uuid()
 return
updateAsset($assetId, $doc, $user)
};

declare function updateAsset($assetId as xs:string, $doc as element(asset),
$user as xs:string) {
        (: cut out some code for brevity :)

let $insert := xdmp:document-insert($assetUri, $assetDoc,
xdmp:default-permissions(), vars:getCollections("assets"), 0,
vars:forest-ids("assets"))
 return $assetId
};

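A minimal sketch of the pattern being asked about, with the URI check and the insert in one update request. It assumes, per the quoted reply below, that fn:exists(fn:doc($uri)) takes a read lock on the URI and that xdmp:document-insert then takes the write lock in the same transaction. The local: functions and the "/assets/" URI layout are hypothetical stand-ins for xutils:choose-uri() and updateAsset(), not the real library code.

```xquery
xquery version "1.0-ml";

(: Hypothetical stand-in for xutils:choose-uri(). :)
declare function local:choose-uri() as xs:string
{
  let $uri := fn:concat("/assets/", xdmp:random(), ".xml")
  return
    (: The existence check happens in the same request as the insert below. :)
    if (fn:exists(fn:doc($uri))) then local:choose-uri()
    else $uri
};

(: Hypothetical stand-in for addAsset()/updateAsset(). :)
declare function local:add-asset($doc as element(asset)) as xs:string
{
  let $uri := local:choose-uri()
  let $_ := xdmp:document-insert($uri, $doc)
  return $uri
};

local:add-asset(<asset><title>example</title></asset>)
```

Because the request calls xdmp:document-insert, it should run as an update, so the check and the insert share one transaction.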

On Thu, Aug 30, 2012 at 11:05 AM, Michael Blakeley <mike at blakeley.com> wrote:

> These are URI locks, not fragment locks. The URI doesn't have to exist in
> order to create the lock. The point is to guarantee read-consistency for
> the update, so that the if-then-else expression operates reliably.
>
> The case where the URI does exist would be vanishingly rare, since
> xdmp:random() returns a 64-bit pseudo-random unsigned long. You could test
> the cost by using a smaller random space, if you were interested. But you
> can't simply drop the read locks without sacrificing the guarantee of
> uniqueness. So if you do end up taking extra read locks, they are quite
> necessary.
>
> To put it more simply: how are you going to guarantee the uniqueness of
> the URI, if not by checking to see if it exists?
>
> -- Mike
>
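
The suggestion above to test the cost with a smaller random space could be sketched roughly as follows. The $max cap of 99, the "/document-" URI pattern reused from the original question, and the <test/> payload are illustrative assumptions. Run it repeatedly and collisions become more frequent as the space fills up.

```xquery
xquery version "1.0-ml";

(: Same technique as choose-uri(), but xdmp:random() is capped at $max
   so that existing URIs are hit often enough to measure the retries. :)
declare function local:choose-uri($max as xs:unsignedLong) as xs:string
{
  let $uri := fn:concat("/document-", xdmp:random($max), ".xml")
  return
    (: Caution: this recurses without end once every URI in the space exists. :)
    if (fn:exists(fn:doc($uri))) then local:choose-uri($max)
    else $uri
};

let $uri := local:choose-uri(99)
return (
  xdmp:document-insert($uri, <test/>),
  $uri,
  (: Time spent in this request so far, including any retries. :)
  xdmp:elapsed-time()
)
```

Only one document is inserted per request, because documents inserted earlier in the same transaction would not be visible to fn:doc() and a repeated URI would conflict.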
> On 29 Aug 2012, at 22:25 , Geert Josten wrote:
>
> > Hi Mike,
> >
> > Not quite sure, but the conflict occurs when the uri doesn't exist yet, so
> > there would be nothing to lock. Does that still create a read-lock?
> >
> > And in case the uri does exist, wouldn't this create potentially a lot of
> > unnecessary read-locks (in case it takes a lot of attempts to find an
> > unused uri)?
> >
> > Kind regards,
> > Geert
> >
> > -----Original message-----
> > From: general-bounces at developer.marklogic.com
> > [mailto:general-bounces at developer.marklogic.com] On behalf of Michael Blakeley
> > Sent: Wednesday, 29 August 2012 21:35
> > To: MarkLogic Developer Discussion
> > Subject: Re: [MarkLogic Dev General] Performance of
> > fn:exists(fn:doc($uri))
> >
> > No, you can't do that safely because cts:uri-match won't take a
> > read-lock. You are opening yourself up to a race condition. And in some
> > circumstances it will be slower than the recommended technique. There
> > seems to be a popular idea that cts:uri-match() is always fastest, but
> > that is not always true.
> >
> > The recommended technique is probably the fastest way to guarantee a new,
> > unique URI. If you are going through the process of inserting a new
> > document, this technique adds very little extra work. The document-insert
> > itself always has to look for an existing document, because it might be
> > replacing an existing document or it might be inserting a new document. It
> > always has to write-lock the URI. So the extra exists() call merely
> > repeats the URI lookup, which is cheap because it will be cached for the
> > xdmp:document-insert call, and also gets a read-lock before
> > xdmp:document-insert gets the write lock. In the vanishingly rare event
> > that xdmp:random() produces an existing URI, this extra work is repeated -
> > but is still quite cheap.
> >
> > -- Mike
> >
> > On 29 Aug 2012, at 12:29 , William Merritt Sawyer wrote:
> >
> >> If you have the uri-lexicon turned on you can use
> > cts:uri-match(fn:concat("/document-", xdmp:random(), ".xml"))
> >>
> >> From: general-bounces at developer.marklogic.com
> > [mailto:general-bounces at developer.marklogic.com] On Behalf Of Danny Sinang
> >> Sent: Wednesday, August 29, 2012 12:33 PM
> >> To: MarkLogic Developer Discussion
> >> Subject: Re: [MarkLogic Dev General] Performance of
> > fn:exists(fn:doc($uri))
> >>
> >> Thanks Geert.
> >>
> >> I did try fn:exists(fn:doc($uri))  on CQ before your response came in
> > and found it to be fast.
> >>
> >> The locking / prevention of duplicate id's is discussed in
> > http://markmail.org/message/mm5vtacpdzwfy44j  .
> >>
> >> Regards,
> >> Danny
> >>
> >> On Wed, Aug 29, 2012 at 2:23 PM, Geert Josten <geert.josten at dayon.nl>
> > wrote:
> >> Hi Danny,
> >>
> >> Performance should be easy to measure. Call the function from within
> > QConsole x number of times and request profile output. Do the same while
> > using xdmp:exists instead of fn:exists. That function works only on
> > (partially) searchable expressions, because it doesn't retrieve the actual
> > content. It won't create a read-lock either, but I'm not sure why you want
> > one. It won't prevent duplicate IDs from being generated in concurrent
> > requests.
> >>
> >> Kind regards,
> >> Geert
> >>
> >> From: general-bounces at developer.marklogic.com
> > [mailto:general-bounces at developer.marklogic.com] On behalf of Danny Sinang
> >> Sent: Wednesday, 29 August 2012 19:11
> >> To: general
> >> Subject: [MarkLogic Dev General] Performance of
> > fn:exists(fn:doc($uri))
> >>
> >> Hi,
> >>
> >> ML support suggested we do this to generate a unique ID for our
> > documents :
> >>
> >> declare function choose-uri() as xs:string
> >>    {
> >>       let $uri := fn:concat("/document-", xdmp:random(), ".xml")
> >>       return if (fn:exists(fn:doc($uri))) then choose-uri() else $uri
> >>    };
> >>
> >> My question is, will the call to fn:exists(fn:doc($uri)) be fast,
> > considering that we now have 8 million documents ?
> >>
> >> The fn:exists(fn:doc($uri)) call is needed to obtain a read lock, which
> > will be upgraded to a write lock when xdmp:document-insert is called.
> >>
> >> Regards,
> >> Danny
> >>
> >>
> >>
> >> _______________________________________________
> >> General mailing list
> >> General at developer.marklogic.com
> >> http://developer.marklogic.com/mailman/listinfo/general