[MarkLogic Dev General] xpath string construction

Eric Palmitesta eric.palmitesta at utoronto.ca
Wed Oct 15 07:34:19 PDT 2008


Oh, sorry that wasn't clear to begin with...

The problem shows up when you let a user delete nodes within a 
document.  You'd like to show the user a list of what's currently there, 
with something to click next to each item to delete that node.  Something like:

Eric - <a href="...">delete</a>
Rob - <a href="...">delete</a>
Mike - <a href="...">delete</a>

Using the xdmp:path of the node you wish to delete won't work, because 
the path identifies a node by its position.  If the user has two browsers 
open and deletes /path/to/node[14] in the 1st browser, every later 
sibling moves up one position.  Deleting /path/to/node[20] from the 2nd 
(now stale) browser then blows away what the user sees there as the 21st 
item, not the 20th.
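
To make the shift concrete, here is a minimal sketch in Python (not 
XQuery).  It assumes a plain list stands in for the sibling nodes; 
delete_at, snapshot and live are hypothetical names for illustration only:

```python
def delete_at(items, position):
    """Delete by 1-based position, like deleting /path/to/node[position]."""
    del items[position - 1]

# What both browsers rendered when their pages were loaded.
snapshot = [f"item{n}" for n in range(1, 26)]
# The live document, shared by both browsers.
live = list(snapshot)

delete_at(live, 14)          # browser 1 deletes node[14]

intended = snapshot[20 - 1]  # what browser 2's stale page shows at position 20
removed = live[20 - 1]       # what node[20] addresses in the live document now
delete_at(live, 20)          # browser 2 deletes node[20]

print("user meant:", intended, "/ actually removed:", removed)
```

The item the user meant to delete survives, and its neighbour is gone.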

How to mitigate?  Tack an id="blah" attribute on user-deletable nodes. 
Something like:

<people>
   <person id="5gb4t" name="Eric" />
   <person id="by54e" name="Rob" />
   <person id="vj942" name="Mike" />
</people>
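
A minimal sketch of deleting by id, again in Python rather than XQuery 
and using only the standard library's ElementTree.  The delete_by_id 
helper is a hypothetical name, not a MarkLogic function:

```python
import xml.etree.ElementTree as ET

DOC = """<people>
   <person id="5gb4t" name="Eric" />
   <person id="by54e" name="Rob" />
   <person id="vj942" name="Mike" />
</people>"""

def delete_by_id(root, node_id):
    """Remove the <person> whose id matches; return True if one was removed."""
    for person in root.findall("person"):
        if person.get("id") == node_id:
            root.remove(person)
            return True
    return False  # nothing matched, e.g. already deleted from another browser

root = ET.fromstring(DOC)
first = delete_by_id(root, "by54e")    # deletes Rob
second = delete_by_id(root, "by54e")   # stale click: nothing left to delete
remaining = [p.get("name") for p in root.findall("person")]
print(first, second, remaining)
```

A stale second click simply finds nothing to delete, instead of removing 
the wrong person.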

Now the delete link can carry the id, and one doesn't have to rely on 
xdmp:path (a bad idea to begin with).  The ensuing barrage of email had 
only to do with how one would generate this unique id, as MarkLogic 
provides no facility for this.
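
For the random-id approach discussed in the thread, a minimal sketch in 
Python (not XQuery).  It assumes the ids already in use are available as 
a set; new_id, ALPHABET and the 5-character length are hypothetical 
choices, and the sketch does nothing about two writers picking the same 
id at the same moment:

```python
import secrets
import string

# Hypothetical alphabet, chosen to resemble the ids in the example above.
ALPHABET = string.ascii_lowercase + string.digits

def new_id(existing_ids, length=5):
    """Return a random id that is not in existing_ids, retrying on a collision."""
    while True:
        candidate = "".join(secrets.choice(ALPHABET) for _ in range(length))
        if candidate not in existing_ids:
            return candidate

used = {"5gb4t", "by54e", "vj942"}
fresh = new_id(used)
used.add(fresh)  # record the id so later calls cannot hand it out again
print(fresh)
```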

Hope that helps,

Eric

Robert Koberg wrote:
> 
> On Oct 15, 2008, at 10:09 AM, Eric Palmitesta wrote:
> 
>> Rob,
>>
>> I think so far we're talking about insertion, not editing.
> 
> 
> OK, that wasn't what I was understanding. And sorry to keep coming back, 
> but I want to understand what I am missing.
> 
> Assuming you don't mean inserting in an existing document (which I 
> understand to be editing), and you are just inserting a new document, 
> how would you have an ID to compare against? And, why isn't a URI good 
> enough?
> 
> best,
> -Rob
> 
> 
>> What you're referring to is a whole other can of worms.  I've 
>> implemented something like a lock-less editor before (java-based 
>> website, nothing to do with xquery) which, upon saving an edited 
>> document, would check to see if the timestamp on the document has 
>> changed while your editing was taking place.  If so, it would hold 
>> onto the data and say "Hey, someone edited and saved the doc you're 
>> editing and trying to save now.  I've recovered your data though, we 
>> can proceed from here".  This was for a relatively low-traffic app, 
>> though.
>>
>> I think someone described something similar to this not too long ago 
>> on this mailing list, although I can't find that email now.
>>
>> Eric
>>
>> Robert Koberg wrote:
>>> Hi again,
>>> To me, this is the same as locking the file, except that you are 
>>> possibly letting someone spend wasted time editing a doc only to lose 
>>> their changes if not up-to-date. As you say it is rare, but just wait 
>>> till you hear from someone who spends 10 minutes editing a file only 
>>> to see all the work lost.
>>> best,
>>> -Rob
>>> On Oct 15, 2008, at 9:41 AM, Eric Palmitesta wrote:
>>>> Good morning all!  Sorry to cause such a stir.  Upon reading your 
>>>> responses, I feel you've gotten the wrong idea, which is probably 
>>>> due to communication failure on my part.
>>>>
>>>> My idea of sequential ids is one 'special' document, for example 
>>>> /id.xml, which contains nothing but <id>42</id>, and an id() 
>>>> function which exclusive-locks the file, yanks 42 out, increments 
>>>> it, replaces the text node with 43, and unlocks the file.  My 
>>>> environment is read-heavy, write-light, so although write operations 
>>>> which require a unique id would touch this file, I don't think it 
>>>> would be an awful bottleneck.  This guarantees unique ids without 
>>>> having to ever worry about collisions.
>>>>
>>>> Of course, the counter-argument is that since it's a write-light 
>>>> environment, the chances of using random() and lightning striking 
>>>> twice, as Michael put it, are infinitesimally small.  I don't truly 
>>>> have a problem with using random ids, I'm just saying it's worth 
>>>> noting that it is *impossible* for lightning to strike twice with 
>>>> sequential ids.
>>>>
>>>> Eric
>>>>
>>>> Wayne Feick wrote:
>>>>> Hi Eric,
>>>>> A disadvantage of sequential ids is that you can end up read 
>>>>> locking all of your documents in order to find the current max id. 
>>>>> You can address this partially by moving the next id into a 
>>>>> separate document, but that document can still become a bottleneck 
>>>>> if you have a high insertion rate. You could also address this by 
>>>>> creating a range index on the id and using cts:element-values() or 
>>>>> cts:element-attribute-values() to find the max.
>>>>> By switching to random ids, you get better parallelism since our 
>>>>> indexes can quickly determine if the id is already in use and will 
>>>>> lock at most one document (or 0 if your existing id search is 
>>>>> unfiltered). There is still a vanishingly small probability that 
>>>>> two competing threads would allocate the same random id at the same 
>>>>> moment in time, but that is improbable enough to be ignored.
>>>>> Wayne.
>>>>> On Tue, 2008-10-14 at 13:07 -0400, Eric Palmitesta wrote:
>>>>>> Wow, thanks for the reply, Michael.  I'll probably be using some 
>>>>>> variation of one of your examples.
>>>>>>
>>>>>> Michael Blakeley wrote:
>>>>>> > Many people ask about sequential ids. It is possible to model an
>>>>>> > id sequence as a database document. But as with RDBMS sequences,
>>>>>> > there are serialization penalties. I don't see the advantage of
>>>>>> > sequential ids, so I rarely, if ever, use this approach.
>>>>>>
>>>>>> Assuming the recursive check isn't feasible (it doesn't scale 
>>>>>> well), the advantage of sequential ids is being able to sleep at 
>>>>>> night knowing collisions are simply impossible, and are not 
>>>>>> reliant on a 'good-enough' random() function.  I'm nit-picking of 
>>>>>> course, I'm sure random() is fine.  :)
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Eric
>>>>>> _______________________________________________
>>>>>> General mailing list
>>>>>> General at developer.marklogic.com 
>>>>>> <mailto:General at developer.marklogic.com>
>>>>>> http://xqzone.com/mailman/listinfo/general