CN -> BN: Socket Closed #24916

@timfn-hg

Description

At approximately 2026-04-09 19:17 UTC, Previewnet consensus nodes had trouble communicating with the block node. From the perspective of CN Node 0:

Immediately before the issue, the consensus node was successfully sending block items to the block node:
Image
Note: The above graph uses a 20-second delta instead of a 1-minute delta, to show a smaller window of time. At no point before the incident did the number of items drop to zero. The minimum was still hundreds of items in a 20-second span.

At 2026-04-09 19:17:11.761 UTC, the consensus node attempted to send a request to the block node, but the request failed due to a socket closed exception:

1775762231797	2026-04-09T19:17:11.797Z	2026-04-09 19:17:11.761 WARN  1356 BlockNodeStreamingConnection - [bn-conn-worker-STR.040843] [STR.040843/rfh01.previewnet.blocknode.hashgraph-devops.com:40840/ACTIVE] Exception caught in connection worker thread (block=1441115, request=1)
java.lang.RuntimeException: Error executing pipeline.onNext()
	at com.hedera.node.app.blocks.impl.streaming.BlockNodeStreamingConnection.sendRequest(BlockNodeStreamingConnection.java:701)
	at com.hedera.node.app.blocks.impl.streaming.BlockNodeStreamingConnection$ConnectionWorkerLoopTask.trySendPendingRequest(BlockNodeStreamingConnection.java:1323)
	at com.hedera.node.app.blocks.impl.streaming.BlockNodeStreamingConnection$ConnectionWorkerLoopTask.maybeSendPendingRequest(BlockNodeStreamingConnection.java:1180)
	at com.hedera.node.app.blocks.impl.streaming.BlockNodeStreamingConnection$ConnectionWorkerLoopTask.doWork(BlockNodeStreamingConnection.java:1140)
	at com.hedera.node.app.blocks.impl.streaming.BlockNodeStreamingConnection$ConnectionWorkerLoopTask.run(BlockNodeStreamingConnection.java:1022)
	at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.io.UncheckedIOException: java.net.SocketException: Socket closed
	at io.helidon.common.buffers.FixedBufferData.writeTo(FixedBufferData.java:74)
	at io.helidon.common.buffers.CompositeArrayBufferData.writeTo(CompositeArrayBufferData.java:48)
	at io.helidon.common.socket.PlainSocket.write(PlainSocket.java:136)
	at io.helidon.webclient.api.TcpClientConnection$BufferedDataWriter.writeNow(TcpClientConnection.java:368)
	at io.helidon.http.http2.Http2ConnectionWriter.noLockWrite(Http2ConnectionWriter.java:200)
	at io.helidon.http.http2.Http2ConnectionWriter.lockedWrite(Http2ConnectionWriter.java:173)
	at io.helidon.http.http2.Http2ConnectionWriter.splitAndWrite(Http2ConnectionWriter.java:210)
	at io.helidon.http.http2.Http2ConnectionWriter.writeData(Http2ConnectionWriter.java:65)
	at io.helidon.webclient.http2.Http2ClientStream.write(Http2ClientStream.java:507)
	at io.helidon.webclient.http2.Http2ClientStream.splitAndWrite(Http2ClientStream.java:497)
	at io.helidon.webclient.http2.Http2ClientStream.writeData(Http2ClientStream.java:343)
	at com.hedera.pbj.grpc.client.helidon.PbjGrpcCall.sendRequest(PbjGrpcCall.java:136)
	at org.hiero.block.api.BlockStreamPublishServiceInterface$BlockStreamPublishServiceClient$2.onNext(BlockStreamPublishServiceInterface.java:186)
	at org.hiero.block.api.BlockStreamPublishServiceInterface$BlockStreamPublishServiceClient$2.onNext(BlockStreamPublishServiceInterface.java:179)
	at com.hedera.node.app.blocks.impl.streaming.BlockNodeStreamingConnection.lambda$sendRequest$2(BlockNodeStreamingConnection.java:683)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
	at java.base/java.lang.VirtualThread.run(VirtualThread.java:309)
Caused by: java.net.SocketException: Socket closed
	at java.base/sun.nio.ch.NioSocketImpl.ensureOpenAndConnected(NioSocketImpl.java:163)
	at java.base/sun.nio.ch.NioSocketImpl.beginWrite(NioSocketImpl.java:362)
	at java.base/sun.nio.ch.NioSocketImpl.implWrite(NioSocketImpl.java:407)
	at java.base/sun.nio.ch.NioSocketImpl.write(NioSocketImpl.java:440)
	at java.base/sun.nio.ch.NioSocketImpl$2.write(NioSocketImpl.java:819)
	at java.base/java.net.Socket$SocketOutputStream.write(Socket.java:1195)
	at io.helidon.common.buffers.FixedBufferData.writeTo(FixedBufferData.java:71)
	... 17 more

This caused the connection to rfh01 to be closed and a new connection to lfh01 to take its place. The new connection appears to have been successful and did not encounter any errors. However, 30 seconds later, when the rescheduled connection to rfh01 was triggered, we attempted to switch back to rfh01. That connection failed again, this time because the block node was too far behind.

The "behind" reason is odd. Right before the initial connection error, we were attempting to stream block 1441115. Thirty seconds later, we tried to stream to the same node again, this time starting at block 1441128, which is 13 blocks ahead of the last block we tried to send. This failed because the consensus node received a BehindPublisher response from the block node indicating that its last block was 1441098. That is 17 blocks before 1441115, much further back than expected. Regardless, after receiving the behind response, the consensus node switched to streaming block 1441099. Streaming continued for another minute, seemingly successfully, until we tried to stream block 1441156, which failed with another socket closed exception.
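
To make the block-number gaps in the timeline above easier to check, here is a minimal sketch that recomputes them. The class and method names (`BlockGapCheck`, `gap`, `resumeBlock`) are hypothetical and are not part of the consensus node code. Only the four block numbers come from the logs described in this issue. The "resume at last block + 1" rule is inferred from the observed switch from 1441098 to 1441099, not from the implementation.

```java
public class BlockGapCheck {
    // Block numbers copied from the description above.
    static final long BLOCK_AT_FIRST_ERROR = 1_441_115L;      // streaming when the first socket closed error hit
    static final long BLOCK_AT_RECONNECT = 1_441_128L;        // first block tried on the rescheduled rfh01 connection
    static final long LAST_BLOCK_REPORTED_BY_BN = 1_441_098L; // value carried in the BehindPublisher response
    static final long BLOCK_AT_SECOND_ERROR = 1_441_156L;     // streaming when the second socket closed error hit

    // Number of blocks between two block numbers (hypothetical helper).
    static long gap(long later, long earlier) {
        return later - earlier;
    }

    // Assumption: after a "behind" response the CN resumes at the block
    // following the one the BN reported as its last block.
    static long resumeBlock(long lastBlockReportedByBn) {
        return lastBlockReportedByBn + 1;
    }

    public static void main(String[] args) {
        System.out.println("CN progress between attempts: "
                + gap(BLOCK_AT_RECONNECT, BLOCK_AT_FIRST_ERROR) + " blocks");
        System.out.println("BN last block vs. block at first error: "
                + gap(BLOCK_AT_FIRST_ERROR, LAST_BLOCK_REPORTED_BY_BN) + " blocks behind");
        System.out.println("BN last block vs. block at reconnect: "
                + gap(BLOCK_AT_RECONNECT, LAST_BLOCK_REPORTED_BY_BN) + " blocks behind");
        long resume = resumeBlock(LAST_BLOCK_REPORTED_BY_BN);
        System.out.println("Resume block: " + resume);
        System.out.println("Blocks from resume to second error: "
                + gap(BLOCK_AT_SECOND_ERROR, resume));
    }
}
```

The block node reported a last block 17 blocks before the one being streamed when the socket closed, and 30 blocks before the block the consensus node tried to send on reconnect.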
