[MarkLogic Dev General] "Joins" in search:search or cts:search
Lee, David
dlee at epocrates.com
Thu Nov 17 11:41:15 PST 2011
I suspect the answer is "no", but I'm picking the brains out there anyway.
For good or bad, I use this archetype.
I have many "summary" documents, say "/logs/1.xml" and "/logs/2.xml", which belong to the collection "/summaries".
There can be many of them (100k+).
Each summary document lists references to external URLs (in this case Amazon S3) from which data can be loaded.
If I load the data, I put each group into a collection named by the URL of the summary.
So say I have 10,000 XML documents referenced by doc("/logs/1.xml"). If I choose to load them, they end up in the collection
"/logs/1.xml". The summaries themselves stay in the collection "/summaries".
The reason for this is the ability to easily bulk delete blocks of documents based on their summaries.
I can list the summaries and, by a simple
exists(collection($url))
can tell if any actual log documents have been loaded.
NOW: I want to be able to delete all records by summary, but only if the documents have been loaded.
Suppose I had 100k summary URLs. I could do
for $url in collection("/summaries")/document-uri(.)
return
  if (exists(collection($url))) then
    xdmp:collection-delete($url)
  else ()
This works and all, but suppose I want something more efficient.
Overall there may be only, say, 1% of the summary documents actually loaded. Furthermore, if there were LOTS of them loaded, the above would time out.
So I spawn a thread to delete, say, [1 to 10] of every summary collection,
but with, say, 100k collections most of the threads do nothing.
So I have to revert to the above and first check whether the collection has anything in it before spawning a thread.
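A minimal sketch of that check-then-spawn step, for illustration only. The module path "/tasks/delete-collection.xqy" is hypothetical; it is assumed to declare $url as an external variable and call xdmp:collection-delete($url). This spawns one task per loaded summary rather than batching.

```xquery
xquery version "1.0-ml";

(: Hypothetical task module, not part of the original setup. It is assumed
   to declare $url as an external variable and delete that collection. :)
declare variable $TASK-MODULE as xs:string := "/tasks/delete-collection.xqy";

for $url in fn:collection("/summaries")/fn:document-uri(.)
(: Only queue work for summaries whose log documents were actually loaded. :)
where fn:exists(fn:collection($url))
return
  xdmp:spawn($TASK-MODULE, (xs:QName("url"), $url))
```

This still iterates over every summary, so it only avoids the empty spawned tasks, not the per-summary existence check.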
Question: Is there a cts:search option which can do a collection query based on the results of the search itself?
That is (pseudo code), doing the following
in one cts:search:
for $c in collection("x")/document-uri(.)
return
  if (exists(collection($c))) then $c
  else ()
Doing this in FLWOR is very slow,
but it's what I'm resorting to.
----------------------------------------
David A. Lee
Senior Principal Software Engineer
Epocrates, Inc.
dlee at epocrates.com
812-482-5224