Background threads are not cleaned up after ActorSystem initialization failure due to remote provider initialization failure
In a simple app, I create a basic ActorSystem with the remote provider configured. Suppose I start a second instance of the app while the first is still running, so that the second run fails because the remote connection listener port is already in use. The related exception is thrown as expected; however, the background services (e.g. fork-join pool, I/O channel threads) started earlier are not cleaned up after the failure.
In the simple example below, the main thread dies due to the exception, but the JVM keeps running because the background services are left stuck in an invalid state. You can see a thread dump in the attachment.
Example code:
import akka.actor.ActorSystem

object RemoteFail {
  def start() = {
    val system = ActorSystem("system")
    println("foo")
  }

  def main(args: Array[String]) = {
    start()
    println("bar")
  }
}
Example config:
akka {
  actor {
    provider = "akka.remote.RemoteActorRefProvider"
  }
  remote {
    transport = "akka.remote.netty.NettyRemoteTransport"
    netty.tcp {
      hostname = "127.0.0.1"
      port = 2552
    }
  }
}
Used environment:
- Windows 8
- JDK 1.7 64-bit
- Scala 2.10
- Akka 2.3.0
on 2014-03-12 16:06 *
By Csongor Somogyi
Description changed from In a simple app, I create a... to In a simple app, I create a...
Milestone changed from 2.3.1 to Current
I can't reproduce this. I see the correct behavior, namely the following exception chain:
akka.remote.RemoteTransportException: Startup failed
Caused by: org.jboss.netty.channel.ChannelException: Failed to bind to: /0.0.0.0:2552
Caused by: java.net.BindException: Address already in use
I would guess that you don't see the "foo" and "bar" println because an exception is thrown. You should be able to try-catch that exception to verify.
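The innermost cause in the trace above, java.net.BindException, can be reproduced and caught without Akka. The following is only a minimal sketch using plain java.net.ServerSocket rather than Akka or Netty; the object name PortConflictCheck and the helper tryBind are hypothetical names introduced for illustration.

```scala
import java.net.{BindException, InetAddress, ServerSocket}

object PortConflictCheck {
  /** Tries to bind `port` on the loopback address; Left holds the failure message. */
  def tryBind(port: Int): Either[String, ServerSocket] =
    try Right(new ServerSocket(port, 50, InetAddress.getLoopbackAddress))
    catch { case e: BindException => Left(String.valueOf(e.getMessage)) }

  def main(args: Array[String]): Unit = {
    // Port 0 lets the OS pick a free port, so the sketch does not depend on 2552.
    val first = new ServerSocket(0, 50, InetAddress.getLoopbackAddress)
    try {
      tryBind(first.getLocalPort) match {
        case Left(msg) =>
          println(s"second bind failed: $msg")
        case Right(socket) =>
          socket.close()
          println("second bind unexpectedly succeeded")
      }
    } finally first.close()
    // Still reached, because the exception was caught instead of killing main.
    println("bar")
  }
}
```

Wrapping the ActorSystem creation in the same kind of try-catch should likewise let "bar" be printed, although that alone says nothing about whether the JVM then exits.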
on 2014-03-14 13:50 *
By Patrik Nordwall
By the way, binding to 0.0.0.0 is not supported, but the port conflict is reported independently of that.
Thread dump of invalid state after ActorSystem initialization failure
Hi Patrik, yes, I'm very absent-minded and I gave you a very silly description. :)
The real problem is that, although the ActorSystem fails to initialize, it stays in an intermediate, invalid state in which e.g. the fork-join pool and the background threads of the I/O channels keep running and are not forced to terminate after the initialization failure. As a result, the Java process does not terminate either, as one would expect it to. (At least that is what I would expect.)
I attached the thread dump. If you find this issue valid then I'll rephrase the bug description as well. :)
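For context: a JVM exits only once every non-daemon thread has finished, so a single idle pool worker is enough to keep the process alive after the main thread has died. As an alternative to reading a thread dump, the lingering threads can be listed in-process. This is only a sketch; LingeringThreads and nonDaemonThreadNames are hypothetical names, and the fixed thread pool is a stand-in for whatever the ActorSystem started, not Akka's own dispatcher.

```scala
import java.util.concurrent.{ExecutorService, Executors, TimeUnit}

object LingeringThreads {
  /** Names of live, non-daemon threads other than the calling one. */
  def nonDaemonThreadNames(): Seq[String] =
    Thread.getAllStackTraces().keySet().toArray(new Array[Thread](0)).toSeq
      .filter(t => t.isAlive && !t.isDaemon && (t ne Thread.currentThread()))
      .map(_.getName)

  def main(args: Array[String]): Unit = {
    // Hypothetical stand-in for a pool started early during initialization.
    val pool: ExecutorService = Executors.newFixedThreadPool(1)
    pool.submit(new Runnable { def run(): Unit = () }) // starts the worker thread
    println(s"non-daemon threads: ${nonDaemonThreadNames().mkString(", ")}")
    // Without shutdown() the idle worker would keep the JVM alive after main returns.
    pool.shutdown()
    pool.awaitTermination(5, TimeUnit.SECONDS)
  }
}
```

The exact list depends on the JVM and on how the program is launched, so other non-daemon threads may show up next to the pool worker.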
OK, that is another thing.
In general it's difficult to clean up everything when initialization fails, but it might be possible to cover this (rather common) case.
Please change the ticket title and description, and re-open the ticket, or create a new one.
on 2014-03-17 14:47 *
By Csongor Somogyi
Summary changed from failure of initializing remote subsystem hangs the initialization of the ActorSystem to Background threads are not cleaned up after ActorSystem initialization failure due to remote provider initialization failure
Description changed from In a simple app, I create a... to In a simple app, I create a...
Status changed from Invalid to New
on 2014-03-17 14:50 *
By Csongor Somogyi
Thanks, Patrik.
I'm right in the middle of something (with Akka, of course), but as soon as I have a little time, I promise to look into the initialization code and provide some suggestions about how to fix it.
on 2014-03-19 16:51 *
By Csongor Somogyi
Dear ticket assignee!
I analyzed the code a little, namely the RemoteActorRefProvider.scala file around line 184 (as of release 2.3.0). I think the simplest and most elegant solution is to surround the remote.start() call with a try-catch and call local.stop() in the catch clause. This should also terminate all actors created for remote administration.
Do you think this is a good idea, or am I missing something?
Regards
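The general shape of the proposed fix (run the failing start step, undo what was already started if it throws, then rethrow) can be sketched independently of Akka. Everything below is an assumption for illustration: startOrCleanup is a hypothetical helper, the thread pool stands in for whatever the local part started, and remote.start() and local.stop() are the names used in the comment above, not verified against the Akka source.

```scala
import java.util.concurrent.{ExecutorService, Executors}

object InitCleanup {
  /** Runs `start`; if it throws, runs `cleanup` and rethrows the original failure. */
  def startOrCleanup[A](start: => A)(cleanup: => Unit): A =
    try start
    catch {
      case e: Throwable =>
        // A failing cleanup must not hide the original cause.
        try cleanup
        catch { case c: Throwable => e.addSuppressed(c) }
        throw e
    }

  def main(args: Array[String]): Unit = {
    // Hypothetical stand-in for resources the local part has already started.
    val pool: ExecutorService = Executors.newFixedThreadPool(1)
    pool.submit(new Runnable { def run(): Unit = () })
    try startOrCleanup[Unit](throw new IllegalStateException("Startup failed"))(pool.shutdown())
    catch {
      case e: IllegalStateException =>
        println(s"${e.getMessage}; pool shut down: ${pool.isShutdown}")
    }
  }
}
```

Rethrowing the original exception keeps the caller-visible behavior unchanged: startup still fails with the same error, but no worker threads are left behind to keep the JVM alive.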
on 2014-03-31 19:39 *
By Patrik Nordwall
Thank you for the ticket. I think you are in the right place, but I don't know exactly what is needed to clean up as much as possible. Please try it and open a pull request if you like.
First, I tried to build Akka on Windows. I have since managed it, though I ran into some issues, for which I created a separate ticket. This is my first time with GitHub and everything, so let's go step by step. I'd like to first provide the fix for the Windows build issue (#3983). If that goes well and I'm comfortable with delivering fixes to Akka, I'll start working on the real fix for this issue.