[MarkLogic Dev General] xpath string construction

Robert Koberg rob at koberg.com
Wed Oct 15 07:24:47 PDT 2008


On Oct 15, 2008, at 10:09 AM, Eric Palmitesta wrote:

> Rob,
>
> I think so far we're talking about insertion, not editing.


OK, that isn't what I had understood. Sorry to keep coming back to
this, but I want to understand what I am missing.

Assuming you don't mean inserting into an existing document (which I
would call editing) and you are just inserting a new document, how
would you have an ID to compare against? And why isn't a URI good
enough?

best,
-Rob


> What you're referring to is a whole other can of worms.  I've
> implemented something like a lock-less editor before (Java-based
> website, nothing to do with XQuery) which, upon saving an edited
> document, would check whether the timestamp on the document had
> changed while your editing was taking place.  If so, it would hold
> onto the data and say "Hey, someone edited and saved the doc you're
> editing and trying to save now.  I've recovered your data though, we
> can proceed from here".  This was for a relatively low-traffic app,
> though.
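
The timestamp check described above is a form of optimistic concurrency control. A minimal sketch in Python, assuming a hypothetical in-memory `Document` with an integer `modified` stamp (these names are illustrative and are not taken from Eric's Java application or from any MarkLogic API):

```python
from dataclasses import dataclass


@dataclass
class Document:
    body: str
    modified: int  # version stamp, bumped on every successful save


class StaleEditError(Exception):
    """Raised when the document changed while it was being edited."""

    def __init__(self, recovered_body: str):
        super().__init__("document was modified by someone else")
        self.recovered_body = recovered_body  # the editor's unsaved text


def save(doc: Document, new_body: str, stamp_at_open: int) -> None:
    # Compare the stamp seen when editing began with the current one.
    if doc.modified != stamp_at_open:
        # Someone else saved first: hand the text back instead of dropping it.
        raise StaleEditError(new_body)
    doc.body = new_body
    doc.modified += 1
```

A caller would remember `doc.modified` when the edit form is opened, pass it back on save, and catch `StaleEditError` to show the "I've recovered your data" message.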
>
> I think someone described something similar to this not too long ago  
> on this mailing list, although I can't find that email now.
>
> Eric
>
> Robert Koberg wrote:
>> Hi again,
>> To me, this is the same as locking the file, except that you are  
>> possibly letting someone spend wasted time editing a doc only to  
>> lose their changes if not up-to-date. As you say it is rare, but  
>> just wait till you hear from someone who spends 10 minutes editing  
>> a file only to see all the work lost.
>> best,
>> -Rob
>> On Oct 15, 2008, at 9:41 AM, Eric Palmitesta wrote:
>>> Good morning all!  Sorry to cause such a stir.  Upon reading your  
>>> responses, I feel you've gotten the wrong idea, which is probably  
>>> due to communication failure on my part.
>>>
>>> My idea of sequential ids is one 'special' document, for example
>>> /id.xml, which contains nothing but <id>42</id>, and an id()
>>> function which exclusive-locks the file, yanks 42 out, increments
>>> it, replaces the text node with 43, and unlocks the file.  My
>>> environment is read-heavy, write-light, so although write
>>> operations which require a unique id would touch this file, I
>>> don't think it would be an awful bottleneck.  This guarantees
>>> unique ids without ever having to worry about collisions.
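
The lock, read, increment, write, unlock cycle described above can be sketched as follows. This is a Python illustration only: the `SequentialIds` class is hypothetical, an in-memory integer stands in for the /id.xml document, and a `threading.Lock` stands in for the exclusive document lock. It is not MarkLogic code.

```python
import threading


class SequentialIds:
    """In-memory stand-in for the single /id.xml counter document."""

    def __init__(self, start: int = 42):
        self._next = start
        self._lock = threading.Lock()  # plays the role of the exclusive lock

    def next_id(self) -> int:
        with self._lock:            # acquire the exclusive lock
            value = self._next      # read the current value
            self._next = value + 1  # write back the incremented value
        return value                # lock was released on leaving the block
```

Every caller serializes on the one lock, which is the bottleneck Wayne mentions below; in exchange, no two callers can ever receive the same id.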
>>>
>>> Of course, the counter-argument is that since it's a write-light
>>> environment, the chances of using random() and lightning striking
>>> twice, as Michael put it, are infinitesimally small.  I don't
>>> truly have a problem with using random ids, I'm just saying it's
>>> worth noting that it is *impossible* for lightning to strike twice
>>> with sequential ids.
>>>
>>> Eric
>>>
>>> Wayne Feick wrote:
>>>> Hi Eric,
>>>> A disadvantage of sequential ids is that you can end up read  
>>>> locking all of your documents in order to find the current max  
>>>> id. You can address this partially by moving the next id into a  
>>>> separate document, but that document can still become a  
>>>> bottleneck if you have a high insertion rate. You could also  
>>>> address this by creating a range index on the id and using  
>>>> cts:element-values() or cts:element-attribute-values() to find  
>>>> the max.
>>>> By switching to random ids, you get better parallelism since our  
>>>> indexes can quickly determine if the id is already in use and  
>>>> will lock at most one document (or 0 if your existing id search  
>>>> is unfiltered). There is still a vanishingly small probability  
>>>> that two competing threads would allocate the same random id at  
>>>> the same moment in time, but that is improbable enough to be  
>>>> ignored.
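
A minimal Python sketch of the random-id approach described above, assuming a plain `set` as a stand-in for the index lookup and 64-bit ids (both are assumptions; the thread does not specify an id width). The function names are hypothetical, and the sketch is single-threaded, so it does not model the race between two competing threads.

```python
import secrets


def new_random_id(in_use: set, bits: int = 64) -> int:
    """Pick a random id not already present in `in_use` and record it."""
    while True:
        candidate = secrets.randbits(bits)
        if candidate not in in_use:  # stands in for the index lookup
            in_use.add(candidate)
            return candidate


def collision_probability(n: int, bits: int = 64) -> float:
    """Birthday-bound approximation of any collision among n random ids."""
    return n * (n - 1) / 2 / 2**bits
```

Under these assumptions, `collision_probability(1_000_000)` is on the order of 1e-8, which gives a rough sense of why the check is expected to retry almost never.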
>>>> Wayne.
>>>> On Tue, 2008-10-14 at 13:07 -0400, Eric Palmitesta wrote:
>>>>> Wow, thanks for the reply, Michael.  I'll probably be using some  
>>>>> variation of one of your examples.
>>>>>
>>>>> Michael Blakeley wrote:
>>>>> > Many people ask about sequential ids. It is possible to model an id
>>>>> > sequence as a database document. But as with RDBMS sequences, there are
>>>>> > serialization penalties. I don't see the advantage of sequential ids, so
>>>>> > I rarely, if ever, use this approach.
>>>>>
>>>>> Assuming the recursive check isn't feasible (it doesn't scale  
>>>>> well), the advantage of sequential ids is being able to sleep at  
>>>>> night knowing collisions are simply impossible, and are not  
>>>>> reliant on a 'good-enough' random() function.  I'm nit-picking  
>>>>> of course, I'm sure random() is fine.  :)
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Eric
>>>>> _______________________________________________
>>>>> General mailing list
>>>>> General at developer.marklogic.com
>>>>> http://xqzone.com/mailman/listinfo/general


