[MarkLogic Dev General] Performance of fn:exists(fn:doc($uri))

Danny Sinang d.sinang at gmail.com
Thu Aug 30 14:33:48 PDT 2012


I wish I could use cts:frequency(), but assetId is also used in other
locations / elements.

I need to limit my duplicate search to the /asset/assetId element.
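
One possible way around this, sketched under assumptions the thread does
not confirm: on a MarkLogic release that supports path range indexes
(MarkLogic 6 or later), a string path range index on /asset/assetId would
let cts:values and cts:frequency look only at values found at that path.
The index itself is hypothetical here; it has to be configured on the
database before the query will run.

```xquery
xquery version "1.0-ml";

(: Assumption: a path range index of type string exists on /asset/assetId.
   The reference below resolves only if that index is configured. :)
for $value in cts:values(cts:path-reference("/asset/assetId"))
let $frequency := cts:frequency($value)
(: keep only values that occur in more than one fragment :)
where $frequency > 1
return fn:concat($value, ":", $frequency)
```

cts:frequency reports fragment frequency by default, so a value is listed
when it occurs in more than one fragment. That matches the
duplicate-across-documents case as long as each asset is its own document.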

Regards,
Danny

On Thu, Aug 30, 2012 at 5:21 PM, William Merritt Sawyer <
william.sawyer at ldschurch.org> wrote:

> I would think cts:frequency would be faster. It would be something like
> this:
>
> for $value in cts:element-values(xs:QName("assetId"))
> let $frequency := cts:frequency($value)
> where $frequency > 1
> return fn:concat($value, ":", $frequency)
>
> -Will
>
> From: general-bounces at developer.marklogic.com [mailto:
> general-bounces at developer.marklogic.com] On Behalf Of Danny Sinang
> Sent: Thursday, August 30, 2012 2:49 PM
> To: MarkLogic Developer Discussion
> Subject: Re: [MarkLogic Dev General] Performance of
> fn:exists(fn:doc($uri))
>
> Hi Geert,
>
> I was referring to an element value that's in more than one document.
>
> I tried this, but it took so long that it timed out:
>
> for $assetId in cts:element-values(xs:QName("assetId"))
> let $count := xdmp:estimate(cts:search(/asset,
>     cts:element-value-query(xs:QName("assetId"), $assetId)))
> return
>   if ($count > 1) then
>     fn:concat($assetId, ", ", $count)
>   else
>     ()
>
> Any way to improve on this?
>
> Regards,
> Danny
>
> On Thu, Aug 30, 2012 at 3:21 PM, Geert Josten <geert.josten at dayon.nl>
> wrote:
>
> Hi Danny,
>
> Are you talking about duplicate URIs? That is normally not possible. If
> you mean some element value that occurs in more than one document, do
> something like this:
>
> xdmp:estimate(cts:search(doc(),
> cts:element-value-query(xs:QName("myelem"), "myid")))
>
> Kind regards,
> Geert
>
> From: general-bounces at developer.marklogic.com [mailto:
> general-bounces at developer.marklogic.com] On Behalf Of Danny Sinang
> Sent: Thursday, August 30, 2012 21:13
> To: MarkLogic Developer Discussion
> Subject: Re: [MarkLogic Dev General] Performance of
> fn:exists(fn:doc($uri))
>
> What I meant was, what's the fastest way to check for documents with
> duplicate IDs?
>
> On Thu, Aug 30, 2012 at 3:00 PM, Danny Sinang <d.sinang at gmail.com> wrote:
>
> Can anyone recommend a fast way to check for duplicate IDs?
>
> On Thu, Aug 30, 2012 at 2:06 PM, Geert Josten <geert.josten at dayon.nl>
> wrote:
>
> I didn't know that read locks are URI locks, not fragment locks. That
> sounds excellent; I should have known earlier.
>
> And now the internal code MarkLogic uses to generate IDs for all its
> internal objects makes much more sense too.
>
> Mike wrote:
> > To put it more simply: how are you going to guarantee the uniqueness of
> > the URI, if not by checking to see if it exists?
>
> I can think of only one other way: taking a write lock on a fixed URI
> (or several fixed URIs), for example by always doing an
> xdmp:document-insert('/assets/lock', <x/>) before deriving a new URI. But
> that slows down creation processes, likely more than the read-lock
> approach does. :-/
>
> That leaves perhaps only one thing that needs attention. If you already
> have many documents, the likelihood that random comes up with an ID that
> already exists increases, and so does the average number of attempts
> needed to find an unused number. Luckily the range of random is very
> large (20 digits), so you need a very large number of documents to even
> get close to 1/100000 of the space.
>
> :)
>
> Grtz,
> Geert
> _______________________________________________
> General mailing list
> General at developer.marklogic.com
> http://developer.marklogic.com/mailman/listinfo/general