[MarkLogic Dev General] "warming" indexes

Alan Darnell alan.darnell at utoronto.ca
Fri Mar 28 06:54:23 PST 2008


We recently moved from a single-host Mark Logic server to a cluster
with 4 data nodes and 2 evaluator nodes.  We also increased the number
of documents in our primary database from 1 million to 13.5 million.
When we search this cluster (either via CQ or an XQuery application
we've built), we notice the following behaviour.  If the cluster has
been sitting idle for a few minutes, the first search takes up to 20
seconds to respond.  Subsequent searches, on the same term or a
different one, take a second or less.  Leave the system alone for a
few minutes and run the same searches again, and the pattern repeats:
the first search takes about 20 seconds and subsequent searches are
fast.

I'm not too worried about this behaviour, because in production the
system shouldn't be idle very often.  But it does make me wonder why
this happens on an idle system.  I realize that the subsequent
searches are faster because data from the indexes has been moved from
disk to memory.  But why doesn't this data stay in memory?  What
flushes it out, and is there any way to keep it in memory?  Do other
sites see the same behaviour, and if so, how do they deal with it?  Do
we need to "warm" the indexes periodically by running searches against
them?
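
To make the "warming" idea concrete, here is a minimal sketch in
Python of what periodic warm-up searches might look like.  Everything
in it is an assumption for illustration: run_search is a hypothetical
callable that submits one query to the cluster (for example through
the XQuery application), and the terms, interval, and number of rounds
are placeholders.  Whether this actually keeps the index data in
memory is exactly the open question.

```python
import time
from typing import Callable, Dict, Iterable, List


def warm_once(run_search: Callable[[str], object],
              terms: Iterable[str]) -> Dict[str, float]:
    """Run each term once and return elapsed seconds per term."""
    timings = {}
    for term in terms:
        start = time.perf_counter()
        run_search(term)  # hypothetical search call; result is discarded
        timings[term] = time.perf_counter() - start
    return timings


def keep_warm(run_search: Callable[[str], object],
              terms: Iterable[str],
              interval_seconds: float,
              rounds: int,
              sleep: Callable[[float], None] = time.sleep) -> List[Dict[str, float]]:
    """Repeat warm_once `rounds` times, sleeping between rounds."""
    terms = list(terms)  # allow generators to be reused across rounds
    history = []
    for i in range(rounds):
        history.append(warm_once(run_search, terms))
        if i < rounds - 1:
            sleep(interval_seconds)  # no sleep after the final round
    return history
```

The per-term timings would show whether the first search in a round is
still the slow one, which could help tell a cold cache apart from some
other cause of the delay.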


Alan Darnell
University of Toronto




