[MarkLogic Dev General] Uneven load distribution on 3-node cluster

Danny Sinang d.sinang at gmail.com
Tue Aug 21 04:36:04 PDT 2012


We have a 3 node ML 4.2-6 cluster.

Since last week, we've seen CPU usage on nodes 2 and 3 skyrocket to around
90% each from 6 to 10 pm, while node 1 would hit only 30% at peak.

We've recently seen an influx of new customers, which could explain the
sudden load during that period. It also looks like we need to rewrite some
of our code to reduce CPU usage.

However, what confounds me is why node 1 isn't taking on as much load as
the other nodes. I'm thinking one of the following events / situations may
have caused it. I hope somebody here can confirm or point me in the right
direction.
1. Expanded Tree Cache Increase / Restart / Seg Fault / Forest Failover

The night before the CPU usage spike, I had to increase the Expanded Tree
Cache for the cluster from 8 GB to 12 GB (i.e. 12288 MB). This of course
caused the ML cluster to automatically restart. After the restart,
everything looked ok from the application perspective. However, two hours
later, node 1 suddenly encountered multiple "XDMP-OLDSTAMP : Timestamp too
old for forest" and "Segmentation fault" errors and caused multiple
restarts. Eventually, the forests on node 1 did a local-disk failover to
node 2. The following day, we decided to "unfailover" the forest on node 2
back to node 1. The database status shows everything's back to normal after
that, except of course the uneven load between the nodes.

Questions :

a. Could the forest "unfailover" have failed to tell the cluster that node
1 is back in business, thus causing the uneven load?

b. Could it be that, despite the database status showing that the
"unfailover" was successful, node 2 is still serving the content that
failed over to it?

c. Could the 12 GB (12288 MB) expanded tree cache be an "uneven" number,
causing the multiple restarts and the old-timestamp and segmentation fault
errors?

d. Could the 12 GB expanded tree cache be causing the uneven load across
the 3 nodes? The expanded tree cache partitions setting is 4.
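As a sanity check on question c, here is the arithmetic on the numbers above (nothing MarkLogic-specific, just confirming the size divides evenly across the configured partitions):

```python
# 12 GB expressed in MB, split across the 4 configured
# expanded tree cache partitions.
cache_mb = 12 * 1024            # 12288 MB, as set in the group config
partitions = 4
per_partition_mb = cache_mb // partitions
print(cache_mb, per_partition_mb)   # 12288 and 3072: it divides evenly
```

So 12288 MB is not "uneven" in the arithmetic sense, at least.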

2. Ingestion of content done only on nodes 2 and 3.

By design, we validate incoming content on node 1 and ingest it (i.e.
document-insert) on nodes 2 and 3. Could this be causing the content to be
saved only on nodes 2 and 3? I was informed months ago that ML
automatically saves documents evenly across all the forests making up a
database. In our case, our production database is made up of 3 forests, one
on each node.
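My understanding of that claim is that document-to-forest assignment is deterministic from the document URI rather than from the node that evaluates the insert. The sketch below is a toy model of that idea only; the forest names are ours, but the hash and modulo scheme are purely illustrative, not MarkLogic's actual assignment algorithm:

```python
import hashlib

# Toy model of deterministic URI-based forest assignment. The real
# MarkLogic hash and bucket scheme differ; this only illustrates that
# placement depends on the URI, not on which node runs the insert.
FORESTS = ["text-content-1", "text-content-2", "text-content-3"]

def assign_forest(uri: str) -> str:
    digest = hashlib.md5(uri.encode("utf-8")).hexdigest()
    return FORESTS[int(digest, 16) % len(FORESTS)]

# Any node evaluating the insert computes the same target forest:
print(assign_forest("/content/doc-42.xml"))
```

Under that model, ingesting only on nodes 2 and 3 should not by itself skew where documents land.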

Looking at the database status, the forests containing binary files are
almost the same size, but the text forests vary as follows:

text-content-1 : 51 GB
text-content-2 : 74 GB
text-content-3 : 66 GB
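To put numbers on the skew, here is each text forest's deviation from the mean size (plain arithmetic on the figures above):

```python
# Relative deviation of each text forest from the mean size (GB),
# using the figures reported by the database status page.
sizes = {"text-content-1": 51, "text-content-2": 74, "text-content-3": 66}
mean = sum(sizes.values()) / len(sizes)
for name, gb in sizes.items():
    print(f"{name}: {gb} GB ({(gb - mean) / mean:+.0%} vs. mean)")
```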

