[MarkLogic Dev General] xpath string construction
Robert Koberg
rob at koberg.com
Wed Oct 15 07:24:47 PDT 2008
On Oct 15, 2008, at 10:09 AM, Eric Palmitesta wrote:
> Rob,
>
> I think so far we're talking about insertion, not editing.
OK, that isn't what I had understood. Sorry to keep coming
back, but I want to understand what I am missing.
Assuming you don't mean inserting in an existing document (which I
understand to be editing), and you are just inserting a new document,
how would you have an ID to compare against? And, why isn't a URI good
enough?
best,
-Rob
> What you're referring to is a whole other can of worms. I've
> implemented something like a lock-less editor before (java-based
> website, nothing to do with xquery) which, upon saving an edited
> document, would check to see if the timestamp on the document has
> changed while your editing was taking place. If so, it would hold
> onto the data and say "Hey, someone edited and saved the doc you're
> editing and trying to save now. I've recovered your data though, we
> can proceed from here". This was for a relatively low-traffic app,
> though.
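
For reference, the save-time check described above can be sketched roughly as follows. This is a minimal, single-threaded illustration in Python, not the Java code Eric mentions: the Store, Doc and StaleEditError names are hypothetical, and an integer version counter stands in for the document timestamp.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    body: str
    modified: int  # assumption: a version counter standing in for the timestamp


class StaleEditError(Exception):
    """Raised when the document was saved by someone else during the edit."""

    def __init__(self, recovered, current):
        super().__init__("document changed while it was being edited")
        self.recovered = recovered  # the editor's unsaved text, kept rather than lost
        self.current = current      # what is stored now


class Store:
    """Hypothetical in-memory document store keyed by URI."""

    def __init__(self):
        self._docs = {}

    def create(self, uri, body):
        self._docs[uri] = Doc(body, 1)

    def load(self, uri):
        doc = self._docs[uri]
        return Doc(doc.body, doc.modified)  # return a copy, not the stored object

    def save(self, uri, new_body, seen_modified):
        current = self._docs[uri]
        if current.modified != seen_modified:
            # Someone saved in the meantime: refuse, but hand the text back.
            raise StaleEditError(new_body, Doc(current.body, current.modified))
        self._docs[uri] = Doc(new_body, current.modified + 1)


store = Store()
store.create("/a.xml", "v1")
first = store.load("/a.xml")
second = store.load("/a.xml")
store.save("/a.xml", "edit by first", first.modified)
try:
    store.save("/a.xml", "edit by second", second.modified)
except StaleEditError as err:
    print("conflict; recovered text:", err.recovered)
```

The second save is rejected because its stamp is stale, and the rejected text travels with the exception so the editor can offer it back to the user.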
>
> I think someone described something similar to this not too long ago
> on this mailing list, although I can't find that email now.
>
> Eric
>
> Robert Koberg wrote:
>> Hi again,
>> To me, this is the same as locking the file, except that you are
>> possibly letting someone spend wasted time editing a doc only to
>> lose their changes if not up-to-date. As you say it is rare, but
>> just wait till you hear from someone who spends 10 minutes editing
>> a file only to see all the work lost.
>> best,
>> -Rob
>> On Oct 15, 2008, at 9:41 AM, Eric Palmitesta wrote:
>>> Good morning all! Sorry to cause such a stir. Upon reading your
>>> responses, I feel you've gotten the wrong idea, which is probably
>>> due to communication failure on my part.
>>>
>>> My idea of sequential ids is one 'special' document, for example
>>> /id.xml, which contains nothing but <id>42</id>, and an id()
>>> function which exclusive-locks the file, yanks 42 out, increments
>>> it, replaces the text node with 43, and unlocks the file. My
>>> environment is read-heavy, write-light, so although write
>>> operations which require a unique id would touch this file, I
>>> don't think it would be an awful bottleneck. This guarantees
>>> unique ids without having to ever worry about collisions.
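
The lock, read, increment, write, unlock cycle described above can be sketched as follows. This is an in-process Python analogy only: the IdCounter class is hypothetical, a threading.Lock stands in for the exclusive document lock, and returning the value read (42) rather than the new value (43) is an assumption.

```python
import threading


class IdCounter:
    """Stand-in for a single counter document such as /id.xml."""

    def __init__(self, start=42):
        self._value = start
        self._lock = threading.Lock()  # plays the role of the exclusive lock

    def next_id(self):
        with self._lock:               # lock
            current = self._value      # read the stored value
            self._value = current + 1  # write back the incremented value
            return current             # lock is released on leaving the block


def allocate(counter, count, out):
    """Take `count` ids from the counter and append them to `out`."""
    for _ in range(count):
        out.append(counter.next_id())


counter = IdCounter()
ids = []
workers = [threading.Thread(target=allocate, args=(counter, 1000, ids))
           for _ in range(8)]
for w in workers:
    w.start()
for w in workers:
    w.join()
print(len(ids), len(set(ids)))  # both counts match when no id was handed out twice
```

Every caller serializes on the one lock, which is the bottleneck trade-off discussed in this thread: uniqueness is guaranteed, at the cost of one writer at a time.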
>>>
>>> Of course, the counter-argument is that since it's a write-light
>>> environment, the chances of using random() and lightning striking
>>> twice, as Michael put it, are infinitesimally small. I don't
>>> truly have a problem with using random ids, I'm just saying it's
>>> worth noting that it is *impossible* for lightning to strike twice
>>> with sequential ids.
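
To put a rough number on "infinitesimally small", the usual birthday approximation can be evaluated directly. The sketch below is in Python and assumes ids are drawn uniformly from a 64-bit space; the thread does not say what range random() covers, so both the bit width and the document count are illustrative.

```python
import math


def collision_probability(n_ids, bits):
    """Birthday approximation: p ~= 1 - exp(-n(n-1) / (2 * 2**bits))."""
    space = 2.0 ** bits
    exponent = n_ids * (n_ids - 1) / (2.0 * space)
    return -math.expm1(-exponent)  # 1 - exp(-x), accurate for very small x


# Chance that at least two of one million random 64-bit ids coincide.
print(collision_probability(1_000_000, 64))
```

The probability grows roughly with the square of the number of ids, so the size of the id space matters far more than the write rate.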
>>>
>>> Eric
>>>
>>> Wayne Feick wrote:
>>>> Hi Eric,
>>>> A disadvantage of sequential ids is that you can end up read
>>>> locking all of your documents in order to find the current max
>>>> id. You can address this partially by moving the next id into a
>>>> separate document, but that document can still become a
>>>> bottleneck if you have a high insertion rate. You could also
>>>> address this by creating a range index on the id and using
>>>> cts:element-values() or cts:element-attribute-values() to find
>>>> the max.
>>>> By switching to random ids, you get better parallelism since our
>>>> indexes can quickly determine if the id is already in use and
>>>> will lock at most one document (or 0 if your existing id search
>>>> is unfiltered). There is still a vanishingly small probability
>>>> that two competing threads would allocate the same random id at
>>>> the same moment in time, but that is improbable enough to be
>>>> ignored.
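
The check-then-allocate pattern Wayne describes can be sketched like this. It is a Python illustration, not MarkLogic code: new_random_id is a hypothetical helper, a plain set stands in for the indexed "is this id already in use" lookup, and the 64-bit width and retry limit are assumptions.

```python
import secrets


def new_random_id(in_use, bits=64, max_attempts=10, draw=secrets.randbits):
    """Draw random ids until one is not already in `in_use`, then claim it."""
    for _ in range(max_attempts):
        candidate = draw(bits)
        if candidate not in in_use:  # stand-in for the index lookup
            in_use.add(candidate)
            return candidate
    raise RuntimeError("no unused id found after %d attempts" % max_attempts)


existing = set()
first = new_random_id(existing)
second = new_random_id(existing)
print(first != second, len(existing))
```

The check and the claim are two separate steps here, so two concurrent callers could in principle both pass the check with the same candidate. That is the small residual window Wayne mentions.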
>>>> Wayne.
>>>> On Tue, 2008-10-14 at 13:07 -0400, Eric Palmitesta wrote:
>>>>> Wow, thanks for the reply, Michael. I'll probably be using some
>>>>> variation of one of your examples.
>>>>>
>>>>> Michael Blakeley wrote:
>>>>> > Many people ask about sequential ids. It is possible to model an id
>>>>> > sequence as a database document. But as with RDBMS sequences, there are
>>>>> > serialization penalties. I don't see the advantage of sequential ids, so
>>>>> > I rarely, if ever, use this approach.
>>>>>
>>>>> Assuming the recursive check isn't feasible (it doesn't scale
>>>>> well), the advantage of sequential ids is being able to sleep at
>>>>> night knowing collisions are simply impossible, and are not
>>>>> reliant on a 'good-enough' random() function. I'm nit-picking
>>>>> of course, I'm sure random() is fine. :)
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Eric
>>>>> _______________________________________________
>>>>> General mailing list
>>>>> General at developer.marklogic.com
>>>>> http://xqzone.com/mailman/listinfo/general