[MarkLogic Dev General] xpath string construction

Robert Koberg rob at koberg.com
Tue Oct 14 11:29:02 PDT 2008


What I don't get is why come up with a unique identifier scheme when  
you already have URIs. If the problem is trying to avoid the same doc  
getting updated from 2 different directions at the same time, then  
wouldn't ML's transaction management be the solution?

-Rob


On Oct 14, 2008, at 2:18 PM, Geert Josten wrote:

> Fun to see how such a seemingly simple question can generate so much  
> response. :-)
>
> Eric,
>
> One simple question: do you need unique ids per document, or across  
> the whole database? I was assuming the former, thinking of your  
> scaffolding application.
>
> Kind regards,
> Geert
>
>>
>
>
> Drs. G.P.H. Josten
> Consultant
>
>
> http://www.daidalos.nl/
> Daidalos BV
> Source of Innovation
> Hoekeindsehof 1-4
> 2665 JZ Bleiswijk
> Tel.: +31 (0) 10 850 1200
> Fax: +31 (0) 10 850 1199
> KvK 27164984
>
>
>> From: general-bounces at developer.marklogic.com
>> [mailto:general-bounces at developer.marklogic.com] On Behalf Of
>> Wayne Feick
>> Sent: dinsdag 14 oktober 2008 20:02
>> To: General Mark Logic Developer Discussion
>> Subject: Re: [MarkLogic Dev General] xpath string construction
>>
>> Hi Eric,
>>
>> A disadvantage of sequential ids is that you can end up read
>> locking all of your documents in order to find the current
>> max id. You can address this partially by moving the next id
>> into a separate document, but that document can still become
>> a bottleneck if you have a high insertion rate. You could
>> also address this by creating a range index on the id and
>> using cts:element-values() or cts:element-attribute-values()
>> to find the max.
>>
>> By switching to random ids, you get better parallelism since
>> our indexes can quickly determine if the id is already in use
>> and will lock at most one document (or 0 if your existing id
>> search is unfiltered). There is still a vanishingly small
>> probability that two competing threads would allocate the
>> same random id at the same moment in time, but that is
>> improbable enough to be ignored.
>>
>> Wayne.
>>
>>
>>
>> On Tue, 2008-10-14 at 13:07 -0400, Eric Palmitesta wrote:
>>
>>
>>      Wow, thanks for the reply, Michael.  I'll probably be using some
>>      variation of one of your examples.
>>
>>      Michael Blakeley wrote:
>>> Many people ask about sequential ids. It is possible to model an id
>>> sequence as a database document. But as with RDBMS sequences, there are
>>> serialization penalties. I don't see the advantage of sequential ids, so
>>> I rarely, if ever, use this approach.
>>
>>      Assuming the recursive check isn't feasible (it doesn't scale well), the
>>      advantage of sequential ids is being able to sleep at night knowing
>>      collisions are simply impossible, and are not reliant on a 'good-enough'
>>      random() function.  I'm nit-picking of course, I'm sure random() is
>>      fine.  :)
>>
>>      Cheers,
>>
>>      Eric
>>      _______________________________________________
>>      General mailing list
>>      General at developer.marklogic.com
>>      http://xqzone.com/mailman/listinfo/general
>>
>>
>

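To put a rough number on the "vanishingly small probability" Wayne mentions in the quoted thread, here is a minimal sketch in Python (not XQuery, and not MarkLogic code). It assumes 64-bit ids drawn uniformly at random, which the thread does not specify, and uses the standard birthday-bound approximation. The names collision_probability and allocate_id are hypothetical, and the in-memory set only stands in for the index lookup Wayne describes.

```python
import math
import secrets


def collision_probability(n_ids, bits=64):
    """Birthday-bound approximation of the chance that any two of
    n_ids uniformly random ids of the given bit width coincide."""
    space = 2 ** bits
    # 1 - exp(-n(n-1)/(2N)), written with expm1 to keep precision
    # when the probability is tiny.
    return -math.expm1(-n_ids * (n_ids - 1) / (2 * space))


def allocate_id(in_use, bits=64):
    """Draw random ids until one is unused, then record and return it.

    `in_use` is a plain set standing in for an index lookup; it is an
    assumption for illustration, not a MarkLogic API.
    """
    while True:
        candidate = secrets.randbits(bits)
        if candidate not in in_use:
            in_use.add(candidate)
            return candidate


if __name__ == "__main__":
    ids = set()
    for _ in range(1000):
        allocate_id(ids)
    print(len(ids), "ids allocated")
    print("collision chance for 1e6 ids:", collision_probability(10**6))
```

The check-and-retry loop touches only the candidate id, which is the parallelism argument made above. A shared sequential counter would instead have to be read and updated by every insert. Note that this sketch is single-threaded, so it does not model the race between two competing threads.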

