Race Problem in Actor.actorOf
The createActor method does a check-then-act sequence that is not atomic.
private[akka] def createActor(address: String, actorFactory: () ⇒ ActorRef): ActorRef = {
  Address.validate(address)
  registry.actorFor(address) match { // check if the actor for the address is already in the registry
    case Some(actorRef) ⇒ actorRef // it is -> return it
    case None ⇒ // it is not -> create it
      try {
        Deployer.deploymentFor(address) match {
          case Deploy(_, router, _, Local) ⇒ actorFactory() // create a local actor
          case deploy ⇒ newClusterActorRef(actorFactory, address, deploy)
        }
      } catch {
        case e: DeploymentException ⇒
          EventHandler.error(e, this, "Look up deployment for address [%s] falling back to local actor." format address)
          actorFactory() // if deployment fails, fall back to local actors
      }
  }
}
So it checks whether an actor for the address already exists and, if not, creates one. Because this check-then-act sequence is not atomic (and no 'repair' is done afterwards), there is a data race.
Important: the scope of this task is only to make the local path race free. See
https://www.assembla.com/spaces/akka/tickets/1029-race-problem-in-actor-actorof--distributed-fix
for getting the distributed path right.
The task was split up so the two parts can be solved independently.
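To make the race concrete, here is a minimal stand-in (hypothetical names, with actor refs modelled as plain Strings — this is not the real Akka registry) showing how two threads can both pass the check and both run the factory:

```scala
import java.util.concurrent.ConcurrentHashMap

// Simplified registry: the map plays the role of ActorRegistry.
val registry = new ConcurrentHashMap[String, String]()

def createActorRacy(address: String)(factory: () => String): String =
  registry.get(address) match {
    case null =>                 // check: not registered yet...
      val ref = factory()        // ...but another thread may be here too
      registry.put(address, ref) // act: last writer wins, the other ref leaks
      ref
    case ref => ref
  }
```

If two threads call `createActorRacy("a")` concurrently, both can observe `null`, both invoke the factory, and two distinct instances exist for one address — exactly the bug this ticket describes.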
Solving this problem locally is relatively simple: you can use a lock (e.g. a striped lock, to avoid unwanted contention). The harder problem is when a new clustered instance is needed. To fix it in a cluster you either need some kind of clustered lock in place, or you need to repair afterwards by discarding the instance that was created but never should have been.
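For the local path, the "repair and discard" idea can also be done lock-free with `ConcurrentHashMap.putIfAbsent`, which makes the check and the act a single atomic step. A sketch (hypothetical `Registry` class, not the actual Akka fix):

```scala
import java.util.concurrent.ConcurrentHashMap

// If two threads race, putIfAbsent lets exactly one instance "win";
// the loser's instance is discarded before it escapes, so all callers
// observe the same ref for a given address.
final class Registry[A <: AnyRef] {
  private val actors = new ConcurrentHashMap[String, A]()

  def actorOf(address: String)(factory: () => A): A = {
    val existing = actors.get(address)
    if (existing != null) existing
    else {
      val created = factory()
      val raced = actors.putIfAbsent(address, created)
      if (raced != null) raced // lost the race: drop `created`, return winner
      else created
    }
  }
}
```

The trade-off versus a striped lock is that the factory may run more than once under contention (the extra instance is thrown away), which is only acceptable if creating and discarding an unstarted actor has no side effects.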
Defined in #1099