[MarkLogic Dev General] Performance of fn:exists(fn:doc($uri))
mike at blakeley.com
Thu Aug 30 13:03:45 PDT 2012
Right, and you could combine two 64-bit numbers if that makes the math look better. It's a belt-and-suspenders approach: rely on probabilities to make collisions very unlikely, but take the read-lock anyway to ensure that you don't accidentally overwrite an existing document. You are going to write-lock a URI anyway, so a read-lock on the same URI is not a heavy burden.
I can imagine applications that don't really care about a very small chance of overwriting an existing document, and simply minimize the probability of a collision. But I have been working with banks lately, and they don't much like that idea.
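A rough sense of how much combining two 64-bit numbers improves the math can be had from the standard birthday approximation. The sketch below uses Python rather than XQuery, purely for the arithmetic. It assumes IDs are drawn uniformly at random, and the figure of one billion documents is a made-up illustration, not a number from this thread.

```python
import math

def collision_probability(num_docs: int, space_size: int) -> float:
    """Approximate chance that at least two of num_docs uniformly random
    IDs collide, using the birthday approximation 1 - exp(-n(n-1)/(2N))."""
    exponent = num_docs * (num_docs - 1) / (2 * space_size)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny.
    return -math.expm1(-exponent)

# Hypothetical database size, chosen only for illustration.
NUM_DOCS = 10**9

p_64 = collision_probability(NUM_DOCS, 2**64)    # one 64-bit random number
p_128 = collision_probability(NUM_DOCS, 2**128)  # two 64-bit numbers combined

print(f"64-bit IDs:  {p_64:.3g}")
print(f"128-bit IDs: {p_128:.3g}")
```

Under these assumptions a single 64-bit ID gives a collision chance of a few percent across a billion documents, while 128 bits makes it negligible. Neither is zero, which is the argument for taking the read-lock anyway.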
On 30 Aug 2012, at 11:06 , Geert Josten wrote:
> That read-locks are URI locks, not fragment locks, is something I didn't
> know. Sounds excellent; I should have known that earlier.
> And now the internal code MarkLogic uses to generate IDs for all its
> internal objects makes much more sense too.
> Mike wrote:
>> To put it more simply: how are you going to guarantee the uniqueness of
>> the URI, if not by checking to see if it exists?
> I can only think of one other way: taking a write lock on a fixed URI
> (or several fixed URIs), such as always doing an
> xdmp:document-insert('/assets/lock', <x/>) before deriving a new URI. But
> that slows down creation processes, likely more than the read-lock
> approach does. :-/
> That leaves perhaps only one thing that needs attention. If you already
> have many documents, the likelihood that random comes up with an ID that
> already exists increases. The average number of attempts needed to find
> an unused number increases over time too. Luckily the range of random is
> very large (20 digits), so you would need a very large number of
> documents to even get close to 1/100000 of the space.
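To put numbers on the point about retries: if IDs are drawn uniformly from a range of size N and n of them are already taken, the number of draws needed to hit a free one follows a geometric distribution with mean N / (N - n). The Python sketch below assumes the range is the full unsigned 64-bit space (whose maximum has 20 digits); that size is an assumption for illustration, and the function name is hypothetical.

```python
SPACE = 2**64  # assumed size of the random ID range (maximum has 20 digits)

def expected_attempts(num_docs: int, space_size: int = SPACE) -> float:
    """Mean number of random draws needed to hit an unused ID when
    num_docs IDs are already taken (mean of a geometric distribution)."""
    if not 0 <= num_docs < space_size:
        raise ValueError("num_docs must be in [0, space_size)")
    # Integer arithmetic in the denominator avoids rounding to zero.
    return space_size / (space_size - num_docs)

# Documents needed to occupy 1/100000 of the ID space.
docs_at_threshold = SPACE // 100_000

print(docs_at_threshold)
print(expected_attempts(docs_at_threshold))
```

Under that assumption, roughly 1.8 x 10^14 documents would be needed to fill 1/100000 of the range, and even then the average is barely more than one attempt per new ID.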