[MarkLogic Dev General] Performance of fn:exists(fn:doc($uri))

Danny Sinang d.sinang at gmail.com
Thu Aug 30 13:49:14 PDT 2012


Hi Geert,

I was referring to an element value that occurs in more than one document.

I tried this, but it took so long that it timed out:

for $assetId in cts:element-values(xs:QName("assetId"))
let $count := xdmp:estimate(
  cts:search(/asset,
    cts:element-value-query(xs:QName("assetId"), $assetId)))
return
  if ($count > 1)
  then fn:concat($assetId, ", ", $count)
  else ()


Is there any way to improve on this?

Regards,
Danny
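
One possible faster variant, offered as a sketch and not a tested answer: cts:element-values already requires an element range index on assetId, so the per-value count can be read from that lexicon with cts:frequency instead of running one xdmp:estimate per value. This assumes the default fragment-frequency behaviour, so the count is the number of fragments (not necessarily documents) containing the value, and it drops the /asset restriction of the query above.

```xquery
xquery version "1.0-ml";

(: Assumption: an element range index exists on assetId,
   as the cts:element-values call in the original query implies. :)
for $assetId in cts:element-values(
  xs:QName("assetId"), (), ("fragment-frequency"))
(: cts:frequency reads the count that came back with the lexicon value,
   so no additional search is run per value. :)
let $count := cts:frequency($assetId)
where $count > 1
return fn:concat($assetId, ", ", $count)
```

If the restriction to asset documents matters, cts:element-values also accepts a cts:query as its fourth argument to constrain which fragments contribute values.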

On Thu, Aug 30, 2012 at 3:21 PM, Geert Josten <geert.josten at dayon.nl> wrote:

> Hi Danny,
>
>
>
> Are you talking about duplicate URIs? That is normally not possible. If
> you mean some element value that occurs in more than one document, do
> something like this:
>
>
>
> xdmp:estimate(cts:search(doc(),
> cts:element-value-query(xs:QName('myelem'), 'myid')))
>
>
>
> Kind regards,
>
> Geert
>
>
>
> *From:* general-bounces at developer.marklogic.com [mailto:
> general-bounces at developer.marklogic.com] *On Behalf Of* Danny Sinang
> *Sent:* Thursday, 30 August 2012 21:13
> *To:* MarkLogic Developer Discussion
> *Subject:* Re: [MarkLogic Dev General] Performance of
> fn:exists(fn:doc($uri))
>
>
>
> What I meant was, what's the fastest way to check for documents with
> duplicate IDs?
>
> On Thu, Aug 30, 2012 at 3:00 PM, Danny Sinang <d.sinang at gmail.com> wrote:
>
> Can anyone recommend a fast way to check for duplicate IDs?
>
> On Thu, Aug 30, 2012 at 2:06 PM, Geert Josten <geert.josten at dayon.nl>
> wrote:
>
> I didn't know that read locks are URI locks rather than fragment locks.
> That sounds excellent; I should have known that earlier.
>
> And now the internal code MarkLogic uses to generate IDs for all its
> internal objects makes much more sense too.
>
> Mike wrote:
> > To put it more simply: how are you going to guarantee the uniqueness of
> > the URI, if not by checking to see if it exists?
>
> I can think of only one other way: taking a write lock on a fixed URI
> (or several fixed URIs), for example by always calling
> xdmp:document-insert('/assets/lock', <x/>) before deriving a new URI. But
> that slows down creation processes, likely more than the read-lock
> approach does. :-/
>
> That leaves perhaps only one thing that needs attention. If you already
> have many documents, the likelihood that random comes up with an ID that
> already exists increases, so the average number of attempts needed to
> find an unused number increases over time too. Luckily the range of
> random is very large (20 digits), so you need a very large number of
> documents to even get close to 1/100000 of the space.
>
> :)
>
> Grtz,
> Geert
> _______________________________________________
> General mailing list
> General at developer.marklogic.com
> http://developer.marklogic.com/mailman/listinfo/general
>
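
The read-lock approach discussed in the quoted messages could look roughly like the sketch below. The /assets/ prefix, the .xml suffix, the function name local:unused-uri and the <asset/> payload are placeholders for illustration and are not taken from the thread. The sketch relies on the point made above that, inside an update transaction, fn:doc($uri) takes a read lock on the URI itself even when no document exists there yet, so a concurrent transaction should not be able to claim the same URI between the existence check and the insert.

```xquery
xquery version "1.0-ml";

(: Hypothetical helper: draw random candidates until one is unused. :)
declare function local:unused-uri() as xs:string
{
  (: xdmp:random() with no argument returns an xs:unsignedLong. :)
  let $uri := fn:concat("/assets/", xdmp:random(), ".xml")
  return
    if (fn:exists(fn:doc($uri)))
    then local:unused-uri()  (: collision: try another random value :)
    else $uri
};

(: The xdmp:document-insert call makes this module an update,
   which is what causes the fn:doc check above to take read locks. :)
xdmp:document-insert(local:unused-uri(), <asset/>)
```

With a sparsely used random space the recursion almost always returns on the first attempt, which matches the observation in the thread that retries only become likely with a very large number of documents.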