[XQZone General] efficiency question

Jason Hunter jhunter at marklogic.com
Thu Dec 8 00:11:10 PST 2005


To give a complete answer I'd have to see your query and your 
fragmentation policy.

But let me say that in general, the timings should be similar. 
MarkLogic sees the content it holds as fragments.  A small document 
might be a single fragment.  Larger documents may be many fragments.

If you have 100k small documents then you have 100,000 fragments.  If 
you have a large document with 100k fragment root elements under a 
single document root element, you have 100,001 fragments (the extra one 
is the root, consisting basically of 100k pointers).  The behavior of 
the system will be very similar between the cases.
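
As a rough illustration of the fragment counts above, here is a toy Python model. It is not MarkLogic code: the function name and its parameters are hypothetical, and the only rule it encodes is "one fragment per document, plus one per configured fragment root element".

```python
def fragment_count(documents, fragment_roots_per_document=0):
    """Toy model (hypothetical, not a MarkLogic API).

    Each document contributes one fragment for itself, plus one
    fragment for every fragment root element it contains.
    """
    return documents * (1 + fragment_roots_per_document)


# 100k small documents, each a single fragment.
many_small = fragment_count(documents=100_000)

# One large document with 100k fragment root elements under its root.
one_large = fragment_count(documents=1, fragment_roots_per_document=100_000)

# The two layouts differ by exactly one fragment: the large document's root.
print(many_small, one_large, one_large - many_small)
```

The one-fragment difference is the document root described above, which is why the two layouts should behave so similarly.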

The perk of using separate documents is that it gives you finer-grained 
control over things held at the document level, such as permissions, 
locks, properties, and collection status.

Of course I'm speaking in general terms.  If your large document has no 
configured fragment roots, then the behavior will be very different: not 
nearly as good as using separate documents, which in essence builds in a 
fragmentation choice.

-jh-

Charles Blair wrote:

> i'd like to know which is more efficient: querying a document with
> 100,000 elements for an element containing a string, or querying
> 100,000 documents for an element containing a string. is it a wash? i
> tried the former, and it's reasonably fast. i'd rather not experiment
> with the latter unless i have reason to believe it might be at least
> as fast (though for what i'm prototyping, it might make for a better
> implementation). thanks.