[MarkLogic Dev General] RE: [MarkLogic Dev General] ServerConnectionException - consistently after about 20,000 files
Lee, David
dlee at epocrates.com
Mon Mar 15 09:30:40 PST 2010
Thanks Mike, great idea about the GC.
I have JProfiler, so I can take detailed measurements. It just takes a while because the errors don't start showing up for 2+ hours, but it's a good path to investigate!
Thank you
-David
-----Original Message-----
From: Michael Blakeley [mailto:michael.blakeley at marklogic.com]
Sent: Monday, March 15, 2010 1:18 PM
To: General Mark Logic Developer Discussion
Cc: Lee, David
Subject: Re: [MarkLogic Dev General] ServerConnectionException - consistently after about 20,000 files
Lee, note that that's TIME_WAIT *not* TIMED_WAIT, and there's no need to
check the server, just the client. Any TIME_WAIT sockets will disappear
fairly quickly: the best way to check is during the test itself.
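
One way to watch for TIME_WAIT sockets on the client while the load is running is to sample netstat from a small helper. The following is only a sketch: the class and method names (TimeWaitCounter, countTimeWait, runNetstat) are hypothetical, and it assumes a netstat binary that accepts "-an" is on the client's PATH.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: counts client sockets in the TIME_WAIT state. */
final class TimeWaitCounter {

    /** Counts the netstat output lines that mention the TIME_WAIT state. */
    static int countTimeWait(List<String> netstatLines) {
        int count = 0;
        for (String line : netstatLines) {
            if (line.contains("TIME_WAIT")) {
                count++;
            }
        }
        return count;
    }

    /** Runs "netstat -an" (assumed to be available) and returns its output lines. */
    static List<String> runNetstat() throws IOException, InterruptedException {
        Process p = new ProcessBuilder("netstat", "-an")
                .redirectErrorStream(true)
                .start();
        List<String> lines = new ArrayList<String>();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = r.readLine()) != null) {
                lines.add(line);
            }
        }
        p.waitFor();
        return lines;
    }
}
```

Calling TimeWaitCounter.countTimeWait(TimeWaitCounter.runNetstat()) every few hundred inserts and logging the result would show whether the count climbs as the run progresses.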
Have you thought about garbage collection? The fact that the error
occurs after a given number of inserts is suggestive. If the GC thread
locks everything else off the CPU for a long enough period of time, the
server will time out connections. This is especially likely to happen if
the program working set is a large percentage of the java heap size
(which may in turn indicate either leaks that you could fix, or a need
for a larger heap).
You might try instrumenting your code to report insert times in Java,
and also report the elapsed time when you see exceptions. Then monitor
the java process size as your program runs. You may be able to correlate
longer elapsed times for inserts with GC events, which would tend to
confirm this hypothesis.
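
The instrumentation described above might look something like the sketch below. It times one unit of work and records how much collector time accumulated while it ran, using the standard java.lang.management beans. The names InsertTiming, Sample, time and describe are hypothetical; in the loader, the Callable would wrap the existing session.insertContent(...) call.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.concurrent.Callable;

/** Hypothetical helper: times one call and reports GC time spent during it. */
final class InsertTiming {

    /** Wall-clock and collector time for one call, both in milliseconds. */
    static final class Sample {
        final long elapsedMillis;
        final long gcMillis;
        final Exception failure; // null when the work completed normally

        Sample(long elapsedMillis, long gcMillis, Exception failure) {
            this.elapsedMillis = elapsedMillis;
            this.gcMillis = gcMillis;
            this.failure = failure;
        }
    }

    /** Accumulated collection time across all collectors, in milliseconds. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 if the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** Runs the work, capturing elapsed time, GC time, and any exception. */
    static Sample time(Callable<?> work) {
        long gcBefore = totalGcMillis();
        long start = System.nanoTime();
        Exception failure = null;
        try {
            work.call();
        } catch (Exception e) {
            failure = e;
        }
        long elapsed = (System.nanoTime() - start) / 1000000L;
        return new Sample(elapsed, totalGcMillis() - gcBefore, failure);
    }

    /** Formats one log line, including the heap currently in use. */
    static String describe(int batch, Sample s) {
        Runtime rt = Runtime.getRuntime();
        long usedMb = (rt.totalMemory() - rt.freeMemory()) >> 20;
        return "batch " + batch + ": " + s.elapsedMillis + " ms elapsed, "
                + s.gcMillis + " ms in GC, heap used " + usedMb + " MB"
                + (s.failure == null ? "" : ", FAILED: " + s.failure);
    }
}
```

If the slow or failed batches are also the ones with a large GC figure, that would tend to support the GC hypothesis; if they are not, the cause is probably elsewhere.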
When using RecordLoader, XQSync, and Corb with large content sets, I
generally use the -XX:+UseConcMarkSweepGC VM option. Sometimes I also
raise the max heap size, but some care is required because too large of
a heap seems to slow things down.
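
To confirm that a VM option and heap setting actually reached the loader's JVM, the program can print its own settings at startup. This is a minimal sketch using standard java.lang.management calls; the class name JvmSettings is hypothetical. Note that the CMS collector was removed in JDK 14, so the flag above applies only to older JVMs.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: reports the JVM options and collectors in effect. */
final class JvmSettings {

    /** Options passed to the JVM, for example -XX:... and -Xmx... flags. */
    static List<String> inputArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    /** Names of the garbage collectors this JVM is running with. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<String>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    /** Maximum heap the JVM will attempt to use, in megabytes. */
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() >> 20;
    }

    /** One-line summary suitable for logging at startup. */
    static String summary() {
        return "max heap " + maxHeapMegabytes() + " MB, collectors "
                + collectorNames() + ", JVM args " + inputArguments();
    }

    public static void main(String[] args) {
        System.out.println(summary());
    }
}
```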
If GC and memory turn out to be involved, I also recommend looking at
the whole Java program carefully, with an eye toward minimizing memory
utilization and especially toward removing any object leaks. If you are
leaking objects, then memory will fill up sooner or later no matter what
GC does. There are some good Java profiling tools available to help with
this.
-- Mike
On 2010-03-15 07:08, Lee, David wrote:
> Thanks Ron, I'm doing all the things you suggest already
> 1) Reusing a Session
> 2) bundling 20 files in 1 insertContent()
> 3) Checked netstat and there are no TIMED_WAIT connections on either
> client or server
>
> I'm trying something different this time, which is to use a thread pool
> to try to increase the efficiency
> of sending the files. Maybe this will be worse on the system; I don't
> know.
> Maybe there is some kind of maximum session open time?
> The errors occur about 2 hours into the transfer, typically.
> I could try closing and reopening the session every hour ...
>
> -David
> Server: 4.1-4 on Fedora 11 (fc11)
> Client: XP/Pro SP3 and Windows 7
> XCC: Latest from download
>
>
> -----Original Message-----
> From: general-bounces at developer.marklogic.com [mailto:general-bounces at developer.marklogic.com] On Behalf Of Ron Hitchens
> Sent: Monday, March 15, 2010 4:49 AM
> To: General Mark Logic Developer Discussion
> Subject: Re: [MarkLogic Dev General] ServerConnectionException - consistently after about 20,000 files
>
>
> You may be filling up your OS's file table. When a socket
> is closed, the OS holds onto it for a while (default usually
> about two minutes) to reliably detect any straggler packets.
>
> When you cycle a lot of connections quickly, this can max out
> internal data structures in the OS. If you do a netstat and
> see zillions of connections in TIME_WAIT state, that's probably
> what's happening.
>
> If you're connecting across a LAN, this delay is not really
> needed, because it's hard for packets to get rerouted anywhere else.
> You can tune the socket wait time down to 5 seconds or so and that
> will allow file table slots to be re-used more quickly.
>
> You can also insert multiple documents per request, all of
> which will be transferred together and result in fewer low-level
> sockets being opened.
>
> On Mar 14, 2010, at 9:28 PM, Lee, David wrote:
>
>> FYI, here's a stack trace from the same program, but in this case it's the query component under load.
>> This is very consistent as well, after about 10-20k requests
>>
>>
>> com.marklogic.xcc.exceptions.ServerConnectionException: Error parsing HTTP headers: Premature EOF, partial header line read: ''
>> [Session: user=DLEE, cb={default} [ContentSource: user=DLEE, cb={none} [provider: address=home/192.168.1.10:8011, pool=0/64]]]
>>     at com.marklogic.xcc.impl.handlers.AbstractRequestController.runRequest(AbstractRequestController.java:99)
>>     at com.marklogic.xcc.impl.SessionImpl.submitRequest(SessionImpl.java:280)
>>     at org.xmlsh.marklogic.put.setChecksum(put.java:341)
>>     at org.xmlsh.marklogic.put.flushContent(put.java:315)
>>     at org.xmlsh.marklogic.put.putContent(put.java:288)
>>     at org.xmlsh.marklogic.put.load(put.java:272)
>>     at org.xmlsh.marklogic.put.load(put.java:266)
>>     at org.xmlsh.marklogic.put.run(put.java:126)
>>     at org.xmlsh.core.XCommand.run(XCommand.java:86)
>>     at org.xmlsh.core.XCommand.run(XCommand.java:63)
>>     at org.xmlsh.sh.core.SimpleCommand.exec(SimpleCommand.java:121)
>>     at org.xmlsh.sh.shell.Shell.exec(Shell.java:560)
>>     at org.xmlsh.sh.core.Pipeline.exec(Pipeline.java:124)
>>     at org.xmlsh.sh.shell.Shell.exec(Shell.java:560)
>>     at org.xmlsh.sh.shell.Shell.runScript(Shell.java:362)
>>     at org.xmlsh.core.ScriptCommand.run(ScriptCommand.java:75)
>>     at org.xmlsh.sh.core.SimpleCommand.exec(SimpleCommand.java:121)
>>     at org.xmlsh.sh.shell.Shell.exec(Shell.java:560)
>>     at org.xmlsh.sh.core.Pipeline.exec(Pipeline.java:124)
>>     at org.xmlsh.sh.shell.Shell.exec(Shell.java:560)
>>     at org.xmlsh.sh.shell.Shell.runScript(Shell.java:362)
>>     at org.xmlsh.core.ScriptCommand.run(ScriptCommand.java:75)
>>     at org.xmlsh.sh.core.SimpleCommand.exec(SimpleCommand.java:121)
>>     at org.xmlsh.sh.shell.Shell.exec(Shell.java:560)
>>     at org.xmlsh.sh.core.Pipeline.exec(Pipeline.java:124)
>>     at org.xmlsh.sh.shell.Shell.exec(Shell.java:560)
>>     at org.xmlsh.sh.shell.Shell.interactive(Shell.java:461)
>>     at org.xmlsh.commands.builtin.xmlsh.run(xmlsh.java:82)
>>     at org.xmlsh.core.BuiltinCommand.run(BuiltinCommand.java:54)
>>     at org.xmlsh.sh.shell.Shell.main(Shell.java:690)
>> Caused by: java.io.IOException: Error parsing HTTP headers: Premature EOF, partial header line read: ''
>>     at com.marklogic.http.HttpHeaders.nextHeaderLine(HttpHeaders.java:326)
>>     at com.marklogic.http.HttpHeaders.parseResponseHeaders(HttpHeaders.java:287)
>>     at com.marklogic.http.HttpChannel.parseHeaders(HttpChannel.java:323)
>>     at com.marklogic.http.HttpChannel.receiveMode(HttpChannel.java:293)
>>     at com.marklogic.http.HttpChannel.getResponseCode(HttpChannel.java:187)
>>     at com.marklogic.xcc.impl.handlers.EvalRequestController.issueRequest(EvalRequestController.java:111)
>>     at com.marklogic.xcc.impl.handlers.EvalRequestController.serverDialog(EvalRequestController.java:62)
>>     at com.marklogic.xcc.impl.handlers.AbstractRequestController.runRequest(AbstractRequestController.java:72)
>>     ... 29 more
>>
>>
>>
>> From: general-bounces at developer.marklogic.com [mailto:general-bounces at developer.marklogic.com] On Behalf Of Lee, David
>> Sent: Saturday, March 13, 2010 7:42 PM
>> To: General Mark Logic Developer Discussion
>> Subject: RE: [MarkLogic Dev General] ServerConnectionException - consistently after about 20,000 files
>>
>> Here's a full stack trace, including my code in the stack.
>> By "opening connections" I mean calling
>>
>> URI serverUri = new URI(connect);
>> ContentSource cs = ContentSourceFactory.newContentSource(serverUri);
>>
>> for every file instead of reusing the ContentSource for all files.
>> That may be a red herring, though: when I do it that way (a new ContentSource for each file) I'm not aborting the push operation if one file fails, so I may be missing these errors in that case.
>>
>> --------- Stack Trace
>>
>>
>>
>> 2010-03-13 16:17:13,748 12310138 ERROR [main] core.SimpleCommand - Exception running command: ml:put
>> com.marklogic.xcc.exceptions.ServerConnectionException: An established connection was aborted by the software in your host machine
>> [Session: user=DLEE, cb={default} [ContentSource: user=DLEE, cb={none} [provider: address=home/192.168.1.10:8011, pool=0/64]]]
>>     at com.marklogic.xcc.impl.handlers.AbstractRequestController.runRequest(AbstractRequestController.java:99)
>>     at com.marklogic.xcc.impl.SessionImpl.insertContent(SessionImpl.java:204)
>>     at org.xmlsh.marklogic.put.load(put.java:180)
>>     at org.xmlsh.marklogic.put.load(put.java:171)
>>     at org.xmlsh.marklogic.put.run(put.java:99)
>>     at org.xmlsh.core.XCommand.run(XCommand.java:86)
>>     at org.xmlsh.core.XCommand.run(XCommand.java:63)
>>     at org.xmlsh.sh.core.SimpleCommand.exec(SimpleCommand.java:121)
>>     at org.xmlsh.sh.shell.Shell.exec(Shell.java:560)
>>     at org.xmlsh.sh.core.Pipeline.exec(Pipeline.java:124)
>>     at org.xmlsh.sh.shell.Shell.exec(Shell.java:560)
>>     at org.xmlsh.sh.shell.Shell.runScript(Shell.java:362)
>>     at org.xmlsh.core.ScriptCommand.run(ScriptCommand.java:75)
>>     at org.xmlsh.sh.core.SimpleCommand.exec(SimpleCommand.java:121)
>>     at org.xmlsh.sh.shell.Shell.exec(Shell.java:560)
>>     at org.xmlsh.sh.core.Pipeline.exec(Pipeline.java:124)
>>     at org.xmlsh.sh.shell.Shell.exec(Shell.java:560)
>>     at org.xmlsh.sh.shell.Shell.interactive(Shell.java:461)
>>     at org.xmlsh.commands.builtin.xmlsh.run(xmlsh.java:82)
>>     at org.xmlsh.core.BuiltinCommand.run(BuiltinCommand.java:54)
>>     at org.xmlsh.sh.shell.Shell.main(Shell.java:690)
>> Caused by: java.io.IOException: An established connection was aborted by the software in your host machine
>>     at sun.nio.ch.SocketDispatcher.write0(Native Method)
>>     at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:33)
>>     at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:104)
>>     at sun.nio.ch.IOUtil.write(IOUtil.java:60)
>>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
>>     at com.marklogic.http.HttpChannel.writeBuffer(HttpChannel.java:373)
>>     at com.marklogic.http.HttpChannel.writeBody(HttpChannel.java:353)
>>     at com.marklogic.http.HttpChannel.flushRequest(HttpChannel.java:346)
>>     at com.marklogic.http.HttpChannel.write(HttpChannel.java:134)
>>     at com.marklogic.xcc.impl.handlers.ContentInsertController.writeChunkHeader(ContentInsertController.java:299)
>>     at com.marklogic.xcc.impl.handlers.ContentInsertController.issueRequest(ContentInsertController.java:210)
>>     at com.marklogic.xcc.impl.handlers.ContentInsertController.serverDialog(ContentInsertController.java:112)
>>     at com.marklogic.xcc.impl.handlers.AbstractRequestController.runRequest(AbstractRequestController.java:72)
>>     ... 20 more
>>
>>
>>
>>
>>
>> From: general-bounces at developer.marklogic.com [mailto:general-bounces at developer.marklogic.com] On Behalf Of Sam Neth
>> Sent: Saturday, March 13, 2010 6:08 PM
>> To: General Mark Logic Developer Discussion
>> Subject: Re: [MarkLogic Dev General] ServerConnectionException - consistently after about 20,000 files
>>
>> Could you post a stack trace?
>>
>> What version of XCC are you using?
>>
>> What specifically are you referring to when you talk about "opening connections"?
>>
>> On Mar 13, 2010, at 2:33 PM, Lee, David wrote:
>>
>>
>> If I use XCC to iteratively insert a large set of documents, I consistently get this error:
>>
>> com.marklogic.xcc.exceptions.ServerConnectionException: An established connection was aborted by the software in your host machine [Session: user=DLEE, cb={default} [ContentSource: user=DLEE, cb={none} [provider: address=home/192.168.1.10:8011, pool=0/64]]]
>>
>>
>> This occurs after about 20,000 files and aborts the program.
>> I'm thinking of implementing an exception handler to retry, but I don't want to be retrying after more serious errors.
>> The server log doesn't show any problems, and this is on a dedicated 1GB wired LAN, so I don't think it's internet problems.
>>
>> If instead of using the same connection I open the connection for each file, it often gets around this problem, but not always.
>> I think it's getting around it because I'm not aborting on error in that case (just going to the next file).
>>
>> I'm using this code snippet to create the content in batches of 1-20 (files in a directory)
>>
>> Content content = ContentFactory.newContent(uri, file, mCreateOptions);
>> contents.add(content);
>> ...
>>
>> if (!contents.isEmpty())
>>     session.insertContent(contents.toArray(new Content[contents.size()]));
>>
>>
>>
>> Any suggestions ?
>>
>>
>>
>> ----------------------------------------
>> David A. Lee
>> Senior Principal Software Engineer
>> Epocrates, Inc.
>> dlee at epocrates.com
>> 812-482-5224
>>
>> _______________________________________________
>> General mailing list
>> General at developer.marklogic.com
>> http://xqzone.com/mailman/listinfo/general
>
> ---
> Ron Hitchens {mailto:ron at ronsoft.com} Ronsoft Technologies
> (650) 766-2355 (Home Office) http://www.ronsoft.com
> (707) 924-3878 (fax) Bit Twiddling At Its Finest
> "No amount of belief establishes any fact." -Unknown
>