[MarkLogic Dev General] xpath string construction

Eric Palmitesta eric.palmitesta at utoronto.ca
Wed Oct 15 06:41:24 PDT 2008


Good morning all!  Sorry to cause such a stir.  Upon reading your 
responses, I feel you've gotten the wrong idea, which is probably due to 
communication failure on my part.

My idea of sequential ids is one 'special' document, for example 
/id.xml, which contains nothing but <id>42</id>, and an id() function 
which exclusive-locks the file, yanks 42 out, increments it, replaces 
the text node with 43, and unlocks the file.  My environment is 
read-heavy, write-light, so although write operations which require a 
unique id would touch this file, I don't think it would be an awful 
bottleneck.  This guarantees unique ids without ever having to worry 
about collisions.
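
To make the idea concrete, here is a rough sketch of the read-increment-write 
cycle. It is an in-memory analogy in Python, not MarkLogic code: the 
IdCounter class and the allocate_concurrently() helper are hypothetical 
names, a threading.Lock stands in for the exclusive lock on /id.xml, and I 
am assuming id() hands back the value it read (42) and leaves 43 behind.

```python
import threading


class IdCounter:
    """In-memory stand-in for the /id.xml document (hypothetical name)."""

    def __init__(self, start=42):
        self._value = start
        # Plays the role of the exclusive lock on the id document.
        self._lock = threading.Lock()

    def next_id(self):
        # Read, increment, and write back while holding the lock, so no
        # two callers can ever observe the same value.
        with self._lock:
            current = self._value
            self._value = current + 1
            return current


def allocate_concurrently(counter, workers=8, per_worker=1000):
    """Draw ids from several threads at once and return all of them."""
    ids = []
    ids_lock = threading.Lock()

    def work():
        local = [counter.next_id() for _ in range(per_worker)]
        with ids_lock:
            ids.extend(local)

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return ids
```

Every caller queues on the same lock, which is where the serialization cost 
comes from. With few writers the wait should be short, but it grows with 
the insertion rate.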

Of course, the counter-argument is that since it's a write-light 
environment, the chances of using random() and lightning striking twice, 
as Michael put it, are infinitesimally small.  I don't truly have a 
problem with using random ids, I'm just saying it's worth noting that it 
is *impossible* for lightning to strike twice with sequential ids.
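
For a sense of how small "infinitesimally small" is, the usual birthday 
approximation gives a rough estimate. This is a sketch under assumptions 
that are mine, not anything stated in this thread: ids are drawn uniformly 
at random from a 64-bit space, and collision_probability() is a 
hypothetical helper name.

```python
import math


def collision_probability(n_ids, id_bits=64):
    """Approximate chance that at least two of n_ids random ids collide.

    Assumes ids are drawn uniformly from a space of 2**id_bits values.
    """
    space = 2 ** id_bits
    # Birthday approximation: p ~= 1 - exp(-n(n-1) / (2N)).
    # expm1 keeps precision when the probability is very small.
    return -math.expm1(-n_ids * (n_ids - 1) / (2 * space))


if __name__ == "__main__":
    for n in (10**3, 10**6, 10**9):
        print(n, collision_probability(n))
```

Under those assumptions the estimate stays tiny for any document count I am 
likely to reach, though it is never exactly zero, which is the whole 
distinction I am making.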

Eric

Wayne Feick wrote:
> Hi Eric,
> 
> A disadvantage of sequential ids is that you can end up read locking all 
> of your documents in order to find the current max id. You can address 
> this partially by moving the next id into a separate document, but that 
> document can still become a bottleneck if you have a high insertion 
> rate. You could also address this by creating a range index on the id 
> and using cts:element-values() or cts:element-attribute-values() to find 
> the max.
> 
> By switching to random ids, you get better parallelism since our indexes 
> can quickly determine if the id is already in use and will lock at most 
> one document (or 0 if your existing id search is unfiltered). There is 
> still a vanishingly small probability that two competing threads would 
> allocate the same random id at the same moment in time, but that is 
> improbable enough to be ignored.
> 
> Wayne.
> 
> 
> 
> On Tue, 2008-10-14 at 13:07 -0400, Eric Palmitesta wrote:
>> Wow, thanks for the reply, Michael.  I'll probably be using some 
>> variation of one of your examples.
>>
>> Michael Blakeley wrote:
>> > Many people ask about sequential ids. It is possible to model an id 
>> > sequence as a database document. But as with RDBMS sequences, there are 
>> > serialization penalties. I don't see the advantage of sequential ids, so 
>> > I rarely, if ever, use this approach.
>>
>> Assuming the recursive check isn't feasible (it doesn't scale well), the 
>> advantage of sequential ids is being able to sleep at night knowing 
>> collisions are simply impossible, and are not reliant on a 'good-enough' 
>> random() function.  I'm nit-picking of course, I'm sure random() is 
>> fine.  :)
>>
>> Cheers,
>>
>> Eric
>> _______________________________________________
>> General mailing list
>> General at developer.marklogic.com <mailto:General at developer.marklogic.com>
>> http://xqzone.com/mailman/listinfo/general
> 

