[MarkLogic Dev General] xpath string construction
rob at koberg.com
Wed Oct 15 06:57:18 PDT 2008
To me, this is the same as locking the file, except that you may be
letting someone waste time editing a doc, only to lose their changes
if their copy is not up to date. As you say, it is rare, but just wait
until you hear from someone who spends 10 minutes editing a file only
to see all the work lost.
On Oct 15, 2008, at 9:41 AM, Eric Palmitesta wrote:
> Good morning all! Sorry to cause such a stir. Upon reading your
> responses, I feel you've gotten the wrong idea, which is probably
> due to communication failure on my part.
> My idea of sequential ids is one 'special' document, for example
> /id.xml, which contains nothing but <id>42</id>, and an id() function
> which exclusive-locks the file, yanks 42 out, increments it,
> replaces the text node with 43, and unlocks the file. My
> environment is read-heavy, write-light, so although write operations
> which require a unique id would touch this file, I don't think it
> would be an awful bottleneck. This guarantees unique ids without
> ever having to worry about collisions.
> Of course, the counter-argument is that since it's a write-light
> environment, the chances of using random() and lightning striking
> twice, as Michael put it, are infinitesimally small. I don't truly
> have a problem with using random ids, I'm just saying it's worth
> noting that it is *impossible* for lightning to strike twice with
> sequential ids.
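The id() function described above can be sketched roughly as follows. This is a minimal Python illustration, not MarkLogic code: the class SequentialIdSource and the helper allocate are hypothetical names, an in-memory integer stands in for the /id.xml document, and a threading.Lock stands in for the exclusive document lock. It also assumes id() returns the value it read (42) and stores the incremented one (43), which the description leaves open.

```python
import threading


class SequentialIdSource:
    """In-memory stand-in for the single counter document (e.g. /id.xml)."""

    def __init__(self, start=42):
        self._value = start            # plays the role of <id>42</id>
        self._lock = threading.Lock()  # plays the role of the exclusive lock

    def next_id(self):
        # Every caller serializes on this lock. That is what makes
        # collisions impossible, and also what makes it a bottleneck
        # when the insertion rate is high.
        with self._lock:
            current = self._value
            self._value = current + 1
            return current


def allocate(source, count, out):
    """Request `count` ids from `source` and append them to `out`."""
    for _ in range(count):
        out.append(source.next_id())


# Eight competing writers, each requesting 1000 ids.
source = SequentialIdSource(start=42)
results = []
threads = [
    threading.Thread(target=allocate, args=(source, 1000, results))
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Every id handed out is distinct no matter how the writers interleave, but each request has to wait its turn on the one lock.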
> Wayne Feick wrote:
>> Hi Eric,
>> A disadvantage of sequential ids is that you can end up read
>> locking all of your documents in order to find the current max id.
>> You can address this partially by moving the next id into a
>> separate document, but that document can still become a bottleneck
>> if you have a high insertion rate. You could also address this by
>> creating a range index on the id and using cts:element-values() or
>> cts:element-attribute-values() to find the max.
>> By switching to random ids, you get better parallelism since our
>> indexes can quickly determine if the id is already in use and will
>> lock at most one document (or 0 if your existing id search is
>> unfiltered). There is still a vanishingly small probability that
>> two competing threads would allocate the same random id at the same
>> moment in time, but that is improbable enough to be ignored.
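To put a rough number on "vanishingly small", here is a minimal Python sketch, not MarkLogic code. It assumes ids are drawn uniformly from a 64-bit space (the thread does not state an id width), uses the standard birthday-bound approximation, and uses a plain set as a stand-in for the index lookup. The names collision_probability and new_random_id are hypothetical.

```python
import math
import secrets


def collision_probability(n_ids, bits=64):
    """Birthday-bound approximation of the chance that n_ids uniform
    draws from a space of 2**bits values contain at least one duplicate."""
    space = 2.0 ** bits
    # 1 - exp(-n(n-1) / (2 * space)), computed stably for tiny values.
    return -math.expm1(-n_ids * (n_ids - 1) / (2.0 * space))


def new_random_id(in_use, bits=64):
    """Draw random ids until one is not already in `in_use`, then record it.

    The membership test stands in for the index lookup. Note that the
    check and the add are not atomic here, so two concurrent callers
    could still race; that is the small residual risk mentioned above.
    """
    while True:
        candidate = secrets.randbits(bits)
        if candidate not in in_use:
            in_use.add(candidate)
            return candidate


existing = set()
first = new_random_id(existing)
second = new_random_id(existing)
p_million = collision_probability(10**6)
```

Under these assumptions, even a million ids give a duplicate probability on the order of 1e-8 before the existence check is applied at all.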
>> On Tue, 2008-10-14 at 13:07 -0400, Eric Palmitesta wrote:
>>> Wow, thanks for the reply, Michael. I'll probably be using some
>>> variation of one of your examples.
>>> Michael Blakeley wrote:
>>> > Many people ask about sequential ids. It is possible to model an
>>> > id sequence as a database document. But as with RDBMS sequences,
>>> > there are serialization penalties. I don't see the advantage of
>>> > sequential ids, so I rarely, if ever, use this approach.
>>> Assuming the recursive check isn't feasible (it doesn't scale
>>> well), the advantage of sequential ids is being able to sleep at
>>> night knowing collisions are simply impossible, and are not
>>> reliant on a 'good-enough' random() function. I'm nit-picking of
>>> course, I'm sure random() is fine. :)
>>> General mailing list
>>> General at developer.marklogic.com