[MarkLogic Dev General] Termlist database error

Michael Blakeley mike at blakeley.com
Tue May 28 12:28:24 PDT 2013


https://github.com/mblakele/task-rebalancer is pretty robust now.

-- Mike

On 28 May 2013, at 12:17 , Damon Feldman <Damon.Feldman at marklogic.com> wrote:

> Gary,
>  
> Adding a forest will provide extra space but won’t offload content from the existing forest(s) in MarkLogic version 6 or below. You’ll need to run CoRB or scheduled tasks to re-ingest data or (better) move data from one forest to another by specifying the forest-id in xdmp:document-insert() and re-inserting the documents.
>  
> I’m not sure how to trace the long ID number to a term description, but someone else may know.
>  
> The rebalancing code will be something like this:
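> (: $old-forest-ids and $new-forest-ids are forest IDs you supply; cts:uris needs the URI lexicon :)
> for $u in cts:uris("", (), (), 0, $old-forest-ids)[1 to 100]
> (: carry the existing permissions, collections, and quality over to the re-inserted copy :)
> let $p := xdmp:document-get-permissions($u)
> let $c := xdmp:document-get-collections($u)
> let $q := xdmp:document-get-quality($u)
> return xdmp:document-insert($u, doc($u), $p, $c, $q, $new-forest-ids)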
>  
> and you just run it over and over until, with N forests in the database, each forest holds about 1/Nth of the content.
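>  
> To judge when the forests are roughly even, something like the following could report a per-forest document estimate. This is only a sketch: it assumes it runs against the content database itself, and the numbers are index estimates rather than exact counts.
>  
> ```xquery
> xquery version "1.0-ml";
> (: Estimated number of documents in each forest of the current database. :)
> for $f in xdmp:database-forests(xdmp:database())
> return fn:concat(
>   xdmp:forest-name($f), ": ",
>   (: restrict the search to a single forest via the forest-ids argument :)
>   xdmp:estimate(cts:search(fn:doc(), cts:and-query(()), (), (), $f)))
> ```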
>  
> Someone may have a real script for this that could be posted to this list for posterity.
>  
> Yours,
> Damon
>  
> From: general-bounces at developer.marklogic.com [mailto:general-bounces at developer.marklogic.com] On Behalf Of Gary Larsen
> Sent: Tuesday, May 28, 2013 2:53 PM
> To: 'MarkLogic Developer Discussion'
> Subject: Re: [MarkLogic Dev General] Termlist database error
>  
> Damon,
>  
> Thanks for your response.  I will add another forest to see if that helps.  About 5 minutes before that error a Java process got terminated.  I’m guessing it’s related (stack trace below)
>  
> Is there an easy way to determine the offending range index or field?
>  
>  
> Caused by: java.io.IOException: An established connection was aborted by the software in your host machine
>                 at sun.nio.ch.SocketDispatcher.write0(Native Method)
>                 at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:33)
>                 at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:69)
>                 at sun.nio.ch.IOUtil.write(IOUtil.java:26)
>                 at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:336)
>                 at com.marklogic.http.HttpChannel.writeBuffer(HttpChannel.java:371)
>                 at com.marklogic.http.HttpChannel.writeBody(HttpChannel.java:353)
>                 at com.marklogic.http.HttpChannel.flushRequest(HttpChannel.java:347)
>                 at com.marklogic.http.HttpChannel.write(HttpChannel.java:136)
>                 at com.marklogic.xcc.impl.handlers.ContentInsertController.issueRequest(ContentInsertController.java:242)
>                 at com.marklogic.xcc.impl.handlers.ContentInsertController.serverDialog(ContentInsertController.java:116)
>                 at com.marklogic.xcc.impl.handlers.AbstractRequestController.runRequest(AbstractRequestController.java:84)
>  
>  
> Gary
>  
> From: general-bounces at developer.marklogic.com [mailto:general-bounces at developer.marklogic.com] On Behalf Of Damon Feldman
> Sent: Tuesday, May 28, 2013 2:34 PM
> To: MarkLogic Developer Discussion
> Subject: Re: [MarkLogic Dev General] Termlist database error
>  
> Gary,
>  
> I believe you have a very large forest with many entries for a common word, element or similar. Breaking it up into more forests should fix the problem because each forest will have smaller termlists.
>  
> Once the termlist data is discarded, I think you’ll have to rewrite a lot of data to get the index rebuilt with the positions added back, so I suggest holding off on ingest or other updates until you address this.
>  
> For background, every element, word, word stem, etc. is a “term”, and a termlist is the list of documents that contain that term.
>  
> You have a very long termlist, which suggests you are operating outside the ideal parameters of the system. If you post the forest sizes we can confirm that.
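>  
> If it helps with collecting those sizes, here is a rough sketch that sums the on-disk size of each forest's stands. It assumes the forest status report exposes a disk-size element (in MB) per stand under the namespace shown, so check it against the output of xdmp:forest-status on your version before relying on it.
>  
> ```xquery
> xquery version "1.0-ml";
> (: assumed namespace of the xdmp:forest-status report :)
> declare namespace fs = "http://marklogic.com/xdmp/status/forest";
> 
> for $f in xdmp:database-forests(xdmp:database())
> let $status := xdmp:forest-status($f)
> return fn:concat(
>   xdmp:forest-name($f), ": ",
>   (: add up the disk size of every stand in this forest :)
>   fn:sum($status/fs:stands/fs:stand/fs:disk-size), " MB on disk")
> ```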
>  
> Yours,
> Damon
>  
> --
> Damon Feldman
> Sr. Principal Consultant, MarkLogic
>  
>  
> From: general-bounces at developer.marklogic.com [mailto:general-bounces at developer.marklogic.com] On Behalf Of Gary Larsen
> Sent: Tuesday, May 28, 2013 2:32 PM
> To: General MarkLogic Developer Discussion
> Subject: [MarkLogic Dev General] Termlist database error
>  
> Hi,
>  
> Can someone help me understand what this error means?  Is it serious, or something I can fix with a configuration change?
>  
> 2013-05-26 14:14:46.884 Warning: Termlist for 4697283252598410410 in C:\Program Files\MarkLogic\Data\Forests\NetVisn_SB\000003d3 is 248 MB; will discard positions at 256 MB
>  
> Thanks,
> Gary
>  
> _______________________________________________
> General mailing list
> General at developer.marklogic.com
> http://developer.marklogic.com/mailman/listinfo/general


