Alive nodes keep trying to establish a TCP connection with down nodes instead of removing them
Scala version: 2.10
Akka version: 2.2-SNAPSHOT
Hi,
I have been trying to run the sample akka-cluster code found at http://doc.akka.io/docs/akka/snapshot/cluster/cluster-usage-scala.html#A_Simple_Cluster_Example, modified to have a single seed node.
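For reference, a single-seed-node setup like the one described would look roughly like the application.conf below. This is a sketch based on the sample in the linked documentation, not a copy of the exact file used. It assumes that `auto-down = on` and `port = 0` (random port) are unchanged from the sample, and that only the seed-node list was reduced to the node on port 2551.

```
akka {
  actor {
    provider = "akka.cluster.ClusterActorRefProvider"
  }
  remote {
    netty.tcp {
      hostname = "127.0.0.1"
      # 0 picks a random free port; the seed node is assumed to be
      # started with this overridden to 2551
      port = 0
    }
  }
  cluster {
    # single seed node instead of the two listed in the sample
    seed-nodes = ["akka.tcp://ClusterSystem@127.0.0.1:2551"]

    # assumed to be kept from the sample: unreachable nodes are
    # marked as down automatically
    auto-down = on
  }
}
```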
Problem scenario:
1. I start a single seed node on port 2551.
2. I start another node on a random port. Both nodes connect and are up.
3. I then kill the second node.
4. Instead of detecting the failure of the second node and moving it to the down state, the seed node logs the following error:
[ERROR] [02/27/2013 13:08:06.487] [ClusterSystem-akka.remote.writer-dispatcher-11] [akka://ClusterSystem/system/endpointManager/endpointWriter-akka.tcp%3A%2F%2FClusterSystem%40127.0.0.1%3A49864-0] Disassociated
akka.remote.EndpointException: Disassociated
and then repeatedly:
[ERROR] [02/27/2013 13:08:07.563] [ClusterSystem-akka.remote.writer-dispatcher-11] [akka://ClusterSystem/system/endpointManager/endpointWriter-akka.tcp%3A%2F%2FClusterSystem%40127.0.0.1%3A49864-0] Association failed with [akka.tcp://ClusterSystem@127.0.0.1:49864]
akka.remote.EndpointException: Association failed with [akka.tcp://ClusterSystem@127.0.0.1:49864]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: no further information
I have left it running for several minutes, hoping that it would eventually move that node to the removed state, but this error just keeps being logged.
on 2013-03-04 14:39 *
By Patrik Nordwall
Is this possibly the same problem as the one raised on the mailing list: https://groups.google.com/d/topic/akka-user/SCxhiYcaAGU/discussion ?
It will be solved by #2824 and #2826
Great, then I'll close this ticket, since it will be covered by the other tickets. Thanks for reporting.