[MarkLogic Dev General] New Module for Memory Operations on XML
Whitby, Rob, Springer Healthcare UK
rob.whitby at springer.com
Tue Apr 17 13:11:42 PDT 2012
This is really interesting, thanks for sharing it.
I recently encountered really poor performance using the in-mem-update module, and modified it slightly to use fn:generate-id().
In my simple test of deleting nodes, the in-mem-update module takes 13.8s; modifying it to use fn:generate-id() improves this to 0.25s. I just tried your module and got 0.04s! Obviously this is just one use case, but it's really impressive nonetheless. Do you have unit tests you could share on GitHub? Or perhaps there are existing tests for in-mem-update that could be applied?
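A rough illustration of why keying on a node id can help, written in Python rather than XQuery and not taken from the in-mem-update code. It assumes the slowdown comes from comparing every visited node against the whole list of nodes to delete; that is a guess, not something measured here. Python's id() stands in for fn:generate-id(), and Node, delete_by_scan, delete_by_id and names are hypothetical names made up for the sketch.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # eq=False: nodes compare by identity, like XML nodes
class Node:
    name: str
    children: list = field(default_factory=list)


def delete_by_scan(root, targets):
    """Copy the tree without the targets; every visit scans the target list."""
    if any(root is t for t in targets):  # linear identity scan per node
        return None
    copy = Node(root.name)
    for child in root.children:
        kept = delete_by_scan(child, targets)
        if kept is not None:
            copy.children.append(kept)
    return copy


def delete_by_id(root, targets):
    """Same copy, but targets are looked up in a set of ids built once."""
    target_ids = {id(t) for t in targets}  # assumption: id() plays the role of generate-id

    def walk(node):
        if id(node) in target_ids:  # constant-time lookup per node
            return None
        copy = Node(node.name)
        for child in node.children:
            kept = walk(child)
            if kept is not None:
                copy.children.append(kept)
        return copy

    return walk(root)


def names(node):
    """Flatten a tree to a list of names in document order, for comparison."""
    result = [node.name]
    for child in node.children:
        result.extend(names(child))
    return result


# Small example tree: root -> a, b (-> d), c; delete b and its subtree.
d = Node("d")
a, b, c = Node("a"), Node("b", [d]), Node("c")
root = Node("root", [a, b, c])
scan_result = delete_by_scan(root, [b])
id_result = delete_by_id(root, [b])
```

Both functions return the same copied tree; they differ only in how each visited node is matched against the nodes to delete, which is where the cost would grow with many targets.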
From: general-bounces at developer.marklogic.com on behalf of Ryan Dew
Sent: Tue 4/17/2012 18:17
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] New Module for Memory Operations on XML
The function mapping idea is good. I'm not quite sure how I would use
cts:highlight; I'll have to think on that one. I wanted to make it easy for
the module to be fully XQuery 1.0 compatible. Currently I have some
commented-out code to replace the functionality of fn:generate-id (an XQuery
3.0 function) to generate a unique id for a node (mine is a little slower, but
the module still provides overall better performance). I might consider
forking it so one version is fully XQuery 1.0 compliant and another is
tailored to MarkLogic.
Thanks for the suggestions!
On Tue, Apr 17, 2012 at 11:03 AM, Michael Blakeley <mike at blakeley.com> wrote:
> Geert, I expect that the xdmp update functions also operate by walking the
> input tree and copying it to an output tree. Otherwise how would you have
> multi-version concurrency?
> But the xdmp functions are implemented in C++, which makes a difference.
> You might be able to quantify that difference by comparing
> xdmp:node-replace with the equivalent in-memory operations plus
> xdmp:document-insert. That kind of evidence could help persuade someone at
> MarkLogic that the feature would be worthwhile.
> Ryan, I think you could improve performance even more with judicious use
> of function mapping. It is often faster than FLWOR expressions are. You
> might also see if there is a way to use cts:highlight for some operations,
> since that is a C++ function.
> -- Mike
> On 17 Apr 2012, at 07:02 , Geert Josten wrote:
> > Where can we find the code itself?
> > And how much does it resemble the kind of updates allowed in XQUF?
> > By the way, I was kind of hoping MarkLogic would allow applying the xdmp
> > node update functions (or copies of those) to in-memory structures as well.
> > Direct manipulation of the tree, without copying it recursively, would be
> > way faster.
> > Kind regards,
> > Geert
> > From: general-bounces at developer.marklogic.com [mailto:
> > general-bounces at developer.marklogic.com] On Behalf Of Ryan Dew
> > Sent: Tuesday, 17 April 2012 15:47
> > To: MarkLogic Developer Discussion
> > Subject: [MarkLogic Dev General] New Module for Memory Operations on
> > I've been working on my own module for updating XML in memory. It has
> > greater functionality than the module shipped with MarkLogic, such as
> > performing multiple operations at one time, and better performance from
> > what I have been able to measure. You can see my post on it at
> > I would love to get some input from the MarkLogic community on this.
> > -Ryan Dew
> > _______________________________________________
> > General mailing list
> > General at developer.marklogic.com
> > http://developer.marklogic.com/mailman/listinfo/general