Pattern: Singleton actor instance in cluster
Requested by several users on the mailing list.
A service which gives a single point of responsibility for certain cluster-wide consistent decisions. With the leader logic we have the basic building block on which this can be built. The short-term plan is to whip up a small pattern which can be used for this. An idea could be:
- user makes sure that each node has an instance of the Singleton actor running at a defined path, say “/user/Singleton”
- Singleton on the leader is in charge, e.g. creates a child actor of some user type
- upon leader change, the new leader requests hand-off from the old
- the old leader will relinquish responsibility (e.g. kill that child actor) and reply to the new leader
- new leader assumes responsibility, e.g. creates that child again
This boils down to adding communication and acknowledgement logic to turn the eventually consistent cluster events into globally consistent single-master management. Then users can do with that whatever they want, e.g. that child could broadcast its location or remote-deploy other things or whatever.
The full solution involves the complete implementation of all our ideas with virtual actor paths and the cluster partition mapping, naming service etc. That is obviously a major release on its own.
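A very rough sketch of what such a per-node Singleton actor could look like (all names here, e.g. HandOverToMe and HandOverDone, are made up for illustration, and handling of the initial cluster state snapshot, crashed nodes and retries is elided):

import akka.actor.{ Actor, ActorRef, Address, Props, RootActorPath }
import akka.cluster.Cluster
import akka.cluster.ClusterEvent.LeaderChanged

// hypothetical hand-over protocol messages
case object HandOverToMe  // sent by the new leader to the old one
case object HandOverDone  // the old leader's acknowledgement

class Singleton(childProps: Props, childName: String) extends Actor {
  val cluster = Cluster(context.system)
  var child: Option[ActorRef] = None
  var oldLeader: Option[Address] = None

  override def preStart(): Unit = cluster.subscribe(self, classOf[LeaderChanged])
  override def postStop(): Unit = cluster.unsubscribe(self)

  // every node runs this actor at the same defined path
  def peer(address: Address): ActorRef =
    context.actorFor(RootActorPath(address) / "user" / "Singleton")

  def receive = {
    case LeaderChanged(Some(leader)) if leader == cluster.selfAddress =>
      oldLeader match {
        case Some(old) => peer(old) ! HandOverToMe // request hand-off, await ack
        case None      => child = Some(context.actorOf(childProps, childName)) // first leader
      }
      oldLeader = Some(leader)
    case LeaderChanged(leader) =>
      oldLeader = leader
    case HandOverToMe =>
      // relinquish responsibility (kill the child) and ack to the new leader
      child foreach context.stop
      child = None
      sender ! HandOverDone
    case HandOverDone =>
      // the old leader has released the singleton, safe to assume responsibility
      child = Some(context.actorOf(childProps, childName))
  }
}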
on 2013-01-11 20:15
By Patrik Nordwall
yes, it will
on 2013-01-14 21:31
By Patrik Nordwall
Component changed from None to cluster
Status changed from New to Accepted
on 2013-01-19 01:52
By Patrik Nordwall
first cut: https://github.com/akka/akka/pull/1040
I took a look at the docs and some of the code last night. Looks great! I would like to confirm something: as long as the singleton actor is running on the former leader, the new leader won't start its singleton actor? For example, if the former leader's singleton actor is waiting for another actor to finish its task, the new leader will start up its singleton actor only once the task is finished and the singleton actor stops itself?
So there will be, in essence, a concept of an application leader that is separate from the cluster leader. The application leadership won't be transferred until the former cluster leader's singleton actor is terminated. The singleton actor should refuse to accept new work while a handover is in progress (in a well-behaved application).
on 2013-01-24 17:42
By Patrik Nordwall
That is correct. The new singleton actor will not be started until the previous one is terminated. In the simple case you use a PoisonPill as terminationMessage, and thereby the actor will not process anything after that. For more advanced shutdown scenarios you can use an application-specific terminationMessage, as I illustrated in the documentation sample. Then you can choose how to process, or not, the messages received between the terminationMessage and the actual stop.
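For illustration, starting the manager on each node might then look roughly like this (the constructor signature is approximated from the 2.1 contrib documentation and may differ; Consumer and End are application-specific):

import akka.actor.{ PoisonPill, Props }
import akka.contrib.pattern.ClusterSingletonManager

system.actorOf(Props(new ClusterSingletonManager(
  singletonProps = handOverData => Props[Consumer], // the application's singleton actor
  singletonName = "consumer",
  terminationMessage = PoisonPill, // simple case; or an application-specific message such as End
  role = None)),
  name = "singleton")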
on 2013-01-25 19:25
By ngocdaothanh
When will the Akka version that includes this feature be released?
on 2013-01-25 19:29
By Patrik Nordwall
It will be backported to the upcoming 2.1.1, which I think will go out pretty soon. I have no dates. It should be copy-paste portable to 2.1.0 if you need it earlier.
on 2013-01-25 20:45
By Patrik Nordwall
thanks for trying,
cluster.isTerminated can be changed to !cluster.isRunning
cluster.registerOnMemberUp line can be changed to self ! StartLeaderChangedBuffer
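That is, in ClusterSingletonManager.scala (shown diff-style for clarity, surrounding code elided):

- cluster.isTerminated
+ !cluster.isRunning

- cluster.registerOnMemberUp(self ! StartLeaderChangedBuffer)
+ self ! StartLeaderChangedBuffer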
I made those changes and also copied over STMultiNodeSpec, but when I tried to run the ClusterSingletonManagerSpec (using sbt multi-jvm:test) against 2.1.0 I got a number of test failures.
The test output is here https://gist.github.com/4645505, and the Build.scala is at https://gist.github.com/4645523
I'm perfectly happy to wait for 2.1.1 but I wanted to let you know.
on 2013-01-27 14:10
By Patrik Nordwall
Thanks for trying it. I don't want to spend time on it for 2.1.0, since it's backported to the upcoming 2.1.1. Anyway, from the log it's clear what is going on, so here is a patch you could try if you like. The failures are because the registerOnMemberUp feature is missing in 2.1.0.
From 77d997abb1f28897f927b45cec90a780a86917e9 Mon Sep 17 00:00:00 2001
From: Patrik Nordwall <patrik.nordwall@gmail.com>
Date: Sun, 27 Jan 2013 10:02:29 +0100
Subject: [PATCH] 2.1.0 patch
---
.../contrib/pattern/ClusterSingletonManager.scala | 9 ++++++++-
1 files changed, 8 insertions(+), 1 deletions(-)
diff --git a/akka-contrib/src/main/scala/akka/contrib/pattern/ClusterSingletonManager.scala b/akka-contrib/src/main/scala/akka/contrib/pattern/ClusterSingletonManager.scala
index cc4b989..6a038e4 100644
--- a/akka-contrib/src/main/scala/akka/contrib/pattern/ClusterSingletonManager.scala
+++ b/akka-contrib/src/main/scala/akka/contrib/pattern/ClusterSingletonManager.scala
@@ -338,12 +338,13 @@ class ClusterSingletonManager(
// subscribe to cluster changes, re-subscribe when restart
cluster.subscribe(self, classOf[MemberDowned])
cluster.subscribe(self, classOf[MemberRemoved])
+ cluster.subscribe(self, classOf[MemberUp])
setTimer(CleanupTimer, Cleanup, 1.minute, repeat = true)
// defer subscription to LeaderChanged to avoid some jitter when
// starting/joining several nodes at the same time
- cluster.registerOnMemberUp(self ! StartLeaderChangedBuffer)
+ // cluster.registerOnMemberUp(self ! StartLeaderChangedBuffer)
}
override def postStop(): Unit = {
@@ -363,6 +364,10 @@ class ClusterSingletonManager(
startWith(Start, Uninitialized)
when(Start) {
+ case Event(MemberUp(m), _) if m.address == cluster.selfAddress ⇒
+ self ! StartLeaderChangedBuffer
+ stay
+
case Event(StartLeaderChangedBuffer, _) ⇒
leaderChangedBuffer = context.actorOf(Props[LeaderChangedBuffer].withDispatcher(context.props.dispatcher))
getNextLeaderChanged()
@@ -568,6 +573,8 @@ class ClusterSingletonManager(
case Event(Cleanup, _) ⇒
cleanupOverdueNotMemberAnyMore()
stay
+ case Event(MemberUp(_), _) ⇒
+ stay
}
onTransition {
--
1.7.6
on 2013-01-27 14:21
By Patrik Nordwall
My suggested patch doesn't cover Up in the initial CurrentClusterState. Better to wait for 2.1.1.
on 2013-01-28 00:19
By Patrik Nordwall
Here is a working port to 2.1.0: https://gist.github.com/4649875
Works! Thanks!
[JVM-Node2] Run completed in 53 seconds, 28 milliseconds.
[JVM-Node2] Total number of tests run: 8
[JVM-Node2] Suites: completed 1, aborted 0
[JVM-Node2] Tests: succeeded 8, failed 0, ignored 0, pending 0
[JVM-Node2] All tests passed.
[JVM-Node6] Run completed in 52 seconds, 850 milliseconds.
[JVM-Node6] Total number of tests run: 8
[JVM-Node6] Suites: completed 1, aborted 0
[JVM-Node6] Tests: succeeded 8, failed 0, ignored 0, pending 0
[JVM-Node6] All tests passed.
[JVM-Node1] Run completed in 53 seconds, 57 milliseconds.
[JVM-Node1] Total number of tests run: 8
[JVM-Node1] Suites: completed 1, aborted 0
[JVM-Node1] Tests: succeeded 8, failed 0, ignored 0, pending 0
[JVM-Node1] All tests passed.
In the documentation there is this bit of code for getting the singleton actor:
case LeaderChanged(Some(leaderAddress)) ⇒
val path = RootActorPath(leaderAddress) / "user" / "singleton" / "consumer"
val consumer = system.actorFor(path)
But what if there is a delay in the handover? There could be a fair amount of time between the moment the cluster leadership changes and the moment the ClusterSingleton leadership changes. It might be useful to have another EventBus message that the ClusterSingletonManager publishes when the handover is complete.
on 2013-01-28 11:54
By Patrik Nordwall
I think that is out of scope for the ClusterSingletonManager, since it can be done better by the singleton actor itself. I think I wrote in the docs that it can publish its existence. Note that publishing to the event bus would only reach the actors subscribed to the event bus of that local actor system.
I can clarify in the docs what LeaderChanged followed by actorFor means. It could actually be better to illustrate that the leaderAddress is stored away on LeaderChanged, for use in a later actorFor.
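For example, something along these lines (ConsumerProxy is a hypothetical name, not from the docs):

import akka.actor.{ Actor, Address, RootActorPath }
import akka.cluster.Cluster
import akka.cluster.ClusterEvent.{ CurrentClusterState, LeaderChanged }

class ConsumerProxy extends Actor {
  val cluster = Cluster(context.system)
  var leaderAddress: Option[Address] = None

  override def preStart(): Unit = cluster.subscribe(self, classOf[LeaderChanged])
  override def postStop(): Unit = cluster.unsubscribe(self)

  def consumer = leaderAddress map { a =>
    context.actorFor(RootActorPath(a) / "user" / "singleton" / "consumer")
  }

  def receive = {
    case state: CurrentClusterState =>
      leaderAddress = state.leader // initial snapshot delivered on subscribe
    case LeaderChanged(leader) =>
      leaderAddress = leader // stored away for use in a later actorFor
    case msg =>
      consumer foreach { _ forward msg } // route to wherever the singleton currently lives
  }
}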
Ok. With regards to the EventBus being local, I was thinking that either a) the ClusterSingletonManager would send a message to all of the other ClusterSingletonManagers that the handover was complete and then they all would publish to their local event buses, or b) it would somehow interact with the clustering system to have the message published cluster-wide. But it's easy enough to implement (a) with my own actor running on each node.
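For reference, a relay for option (a) can be as small as this (HandoverComplete is a made-up message, and the sending side is elided):

import akka.actor.Actor

case object HandoverComplete

// runs at a well-known path on every node; the node that completes the
// handover sends HandoverComplete to each relay, which republishes it
// on that node's local event stream
class HandoverRelay extends Actor {
  def receive = {
    case HandoverComplete =>
      context.system.eventStream.publish(HandoverComplete)
  }
}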
on 2013-01-28 17:52
By Patrik Nordwall
good, thanks for the suggestion anyway
you can see the doc improvements here https://github.com/akka/akka/pull/1074