[MarkLogic Dev General] my link resolver is slow

Mike Sokolov sokolov at ifactory.com
Wed Aug 8 10:40:31 PDT 2012

I've written some code to resolve links in a batch process; a link 
can point to any of several kinds of element, identified by @id, in any 
document. We are trying to record the destination document uri with the 
link so it can be rendered quickly at run-time, and so that links with no 
destination won't be rendered at all.

Basically the process is: for each of some batch of documents, for each 
of its links, search for the matching document, and replace the link 
with an element having a uri attribute pointing to that document.
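
To make the process above concrete, here is a minimal in-memory sketch in 
Python. It is not the actual MarkLogic code: the element names (link, 
resolved-link), the target attribute, and both function names are 
assumptions made for illustration, and a dictionary lookup stands in for 
the per-link search. Links are assumed to contain only text.

```python
import xml.etree.ElementTree as ET


def build_id_index(docs):
    """Map every @id value to the URI of the document that contains it.

    `docs` is a hypothetical {uri: root_element} mapping standing in
    for the database.
    """
    index = {}
    for uri, root in docs.items():
        for elem in root.iter():
            elem_id = elem.get("id")
            if elem_id is not None:
                index[elem_id] = uri
    return index


def resolve_links(root, index):
    """Replace each resolvable <link> with <resolved-link uri="...">.

    Unresolved links are left untouched, so a renderer can skip them.
    Returns a (resolved, missing) pair of counts.
    """
    resolved = missing = 0
    # Snapshot the elements first so the tree is not mutated mid-iteration.
    for parent in list(root.iter()):
        for pos, child in enumerate(list(parent)):
            if child.tag != "link":
                continue
            target = child.get("target")
            uri = index.get(target)
            if uri is None:
                missing += 1
                continue
            new = ET.Element("resolved-link", {"uri": uri, "target": target})
            new.text = child.text   # assumption: links hold text only
            new.tail = child.tail   # keep the text that follows the link
            parent[pos] = new       # one replacement per link
            resolved += 1
    return resolved, missing
```

Each resolved link costs one replacement, so the number of node updates 
in a batch grows with the total number of links, not the number of 
documents.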

Overall, this process is running much slower than I had expected.  I've 
been examining the query using the profiler, and after doing some 
optimization of the searches, I find something a bit strange.  The 
breakdown reported by the profiler doesn't seem to account for the 
total time.  It looks to me as if all the searches are completing fairly 
quickly, based on logging statements that indicate all the documents in 
the batch have been "processed", and then the query just seems to hang 
for a while before returning.  It seems to spend about 90% of the total 
time in this second stage.  My assumption is this time is spent 
performing the updates, committing, indexing, writing a journal file, or 
something like that.

My question is: should I expect this to be reflected in the profiler?  
And is there some way I can figure out why it is taking so long, and 
what I can do about it?  Maybe inserting a node would be faster than 
replacing?  I've tried a tree-walk rather than lots of node-replaces, 
but that actually seemed quite a bit slower.
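
For reference, the tree-walk variant mentioned above could look roughly 
like this in-memory Python sketch. It rebuilds the whole document in one 
recursive pass instead of replacing link nodes one at a time. The 
rebuild function, the link/resolved-link element names, the target 
attribute, and the id-to-uri dictionary are all hypothetical, and the 
sketch says nothing about how either approach performs inside MarkLogic.

```python
import xml.etree.ElementTree as ET


def rebuild(elem, index):
    """Return a transformed copy of `elem`, rewriting links on the way down.

    `index` is a hypothetical {id: uri} mapping. A <link> whose target
    is found becomes <resolved-link uri="...">; everything else is
    copied unchanged.
    """
    target = elem.get("target")
    if elem.tag == "link" and target in index:
        out = ET.Element("resolved-link", {"uri": index[target]})
    else:
        out = ET.Element(elem.tag, dict(elem.attrib))
    out.text, out.tail = elem.text, elem.tail
    for child in elem:
        out.append(rebuild(child, index))
    return out
```

The source tree is left untouched, so the result would be written back as 
a single whole-document update rather than many small node replacements.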

Thanks for any suggestions!

Michael Sokolov
Engineering Director
