Pattern: Singleton actor instance in cluster
Requested by several users on the mailing list.
A service which gives a single point of responsibility for certain cluster-wide consistent decisions. With the leader logic we have the basic building block on which this can be built. The short-term plan is to whip up a small pattern which can be used for this. An idea could be:
- user makes sure that each node has an instance of the Singleton actor running at a defined path, say “/user/Singleton”
- Singleton on the leader is in charge, e.g. creates a child actor of some user type
- upon leader change, the new leader requests hand-off from the old
- the old leader will relinquish responsibility (e.g. kill that child actor) and reply to the new leader
- new leader assumes responsibility, e.g. creates that child again
This boils down to adding communication and acknowledgement logic to turn the eventually consistent cluster events into globally consistent single-master management. Then users can do with that whatever they want, e.g. that child could broadcast its location or remote-deploy other things or whatever.
The full solution involves the complete implementation of all our ideas with virtual actor paths and the cluster partition mapping, naming service etc. That is obviously a major release on its own.
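A very rough sketch of what such a per-node Singleton actor could look like (all names here, e.g. HandOverToMe and HandOverDone, are made up for illustration, and handling of the initial cluster state snapshot, crashed nodes and retries is elided):

import akka.actor.{ Actor, ActorRef, Address, Props, RootActorPath }
import akka.cluster.Cluster
import akka.cluster.ClusterEvent.LeaderChanged

// hypothetical hand-over protocol messages
case object HandOverToMe  // sent by the new leader to the old one
case object HandOverDone  // the old leader's acknowledgement

class Singleton(childProps: Props, childName: String) extends Actor {
  val cluster = Cluster(context.system)
  var child: Option[ActorRef] = None
  var oldLeader: Option[Address] = None

  override def preStart(): Unit = cluster.subscribe(self, classOf[LeaderChanged])
  override def postStop(): Unit = cluster.unsubscribe(self)

  // every node runs this actor at the same defined path
  def peer(address: Address): ActorRef =
    context.actorFor(RootActorPath(address) / "user" / "Singleton")

  def receive = {
    case LeaderChanged(Some(leader)) if leader == cluster.selfAddress =>
      oldLeader match {
        case Some(old) => peer(old) ! HandOverToMe // request hand-off, await ack
        case None      => child = Some(context.actorOf(childProps, childName)) // first leader
      }
      oldLeader = Some(leader)
    case LeaderChanged(leader) =>
      oldLeader = leader
    case HandOverToMe =>
      // relinquish responsibility (kill the child) and ack to the new leader
      child foreach context.stop
      child = None
      sender ! HandOverDone
    case HandOverDone =>
      // the old leader has released the singleton, safe to assume responsibility
      child = Some(context.actorOf(childProps, childName))
  }
}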
on 2013-01-11 20:15
By Patrik Nordwall
yes, it will
on 2013-01-14 21:31
By Patrik Nordwall
Component changed from None to cluster
Status changed from New to Accepted
on 2013-01-19 01:52
By Patrik Nordwall
first cut: https://github.com/akka/akka/pull/1040
I took a look at the docs and some of the code last night. Looks great! I would like to confirm something: as long as the singleton actor is running on the former leader, the new leader won't start its singleton actor? For example, if the former leader's singleton actor is waiting for another actor to finish its task, the new leader will start up its singleton actor only once the task is finished and the singleton actor stops itself?
So there will be, in essence, a concept of an application leader that is separate from the cluster leader. The application leadership won't be transferred until the former cluster leader's singleton actor is terminated. The singleton actor should refuse to accept new work while a handover is in progress (in a well-behaved application).
on 2013-01-24 17:42
By Patrik Nordwall
That is correct. The new singleton actor will not be started until the previous one is terminated. In the simple case you use a PoisonPill as terminationMessage, and thereby the actor will not process anything after that. For more advanced shutdown scenarios you can use an application-specific terminationMessage, as I illustrated in the documentation sample. Then you can choose how to process, or not, the messages received between the terminationMessage and the actual stop.
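For illustration, starting the manager on each node might then look roughly like this (the constructor signature is approximated from the 2.1 contrib documentation and may differ; Consumer and End are application-specific):

import akka.actor.{ PoisonPill, Props }
import akka.contrib.pattern.ClusterSingletonManager

system.actorOf(Props(new ClusterSingletonManager(
  singletonProps = handOverData => Props[Consumer], // the application's singleton actor
  singletonName = "consumer",
  terminationMessage = PoisonPill, // simple case; or an application-specific message such as End
  role = None)),
  name = "singleton")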
on 2013-01-25 19:25
By ngocdaothanh
When will the Akka version that includes this feature be released?
on 2013-01-25 19:29
By Patrik Nordwall
It will be backported to the upcoming 2.1.1, which I think will go out pretty soon. I have no dates. It should be copy-paste portable to 2.1.0 if you need it earlier.
on 2013-01-25 20:45
By Patrik Nordwall
thanks for trying,
cluster.isTerminated can be changed to !cluster.isRunning
cluster.registerOnMemberUp line can be changed to self ! StartLeaderChangedBuffer
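That is, in ClusterSingletonManager.scala (shown diff-style for clarity, surrounding code elided):

- cluster.isTerminated
+ !cluster.isRunning

- cluster.registerOnMemberUp(self ! StartLeaderChangedBuffer)
+ self ! StartLeaderChangedBuffer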
I made those changes and also copied over STMultiNodeSpec, but when I tried to run the ClusterSingletonManagerSpec (using sbt multi-jvm:test) against 2.1.0 I got a number of test failures.
The test output is here https://gist.github.com/4645505, and the Build.scala is at https://gist.github.com/4645523
I'm perfectly happy to wait for 2.1.1 but I wanted to let you know.
on 2013-01-27 14:10
By Patrik Nordwall
Thanks for trying it. I don't want to spend time on it for 2.1.0, since it's backported to the upcoming 2.1.1. Anyway, from the log it's clear what is going on, so here is a patch you could try if you like. The failures are because the registerOnMemberUp feature is missing in 2.1.0.
From 77d997abb1f28897f927b45cec90a780a86917e9 Mon Sep 17 00:00:00 2001
From: Patrik Nordwall <patrik.nordwall@gmail.com>
Date: Sun, 27 Jan 2013 10:02:29 +0100
Subject: [PATCH] 2.1.0 patch
---
.../contrib/pattern/ClusterSingletonManager.scala | 9 ++++++++-
1 files changed, 8 insertions(+), 1 deletions(-)
diff --git a/akka-contrib/src/main/scala/akka/contrib/pattern/ClusterSingletonManager.scala b/akka-contrib/src/main/scala/akka/contrib/pattern/ClusterSingletonManager.scala
index cc4b989..6a038e4 100644
--- a/akka-contrib/src/main/scala/akka/contrib/pattern/ClusterSingletonManager.scala
+++ b/akka-contrib/src/main/scala/akka/contrib/pattern/ClusterSingletonManager.scala
@@ -338,12 +338,13 @@ class ClusterSingletonManager(
// subscribe to cluster changes, re-subscribe when restart
cluster.subscribe(self, classOf[MemberDowned])
cluster.subscribe(self, classOf[MemberRemoved])
+ cluster.subscribe(self, classOf[MemberUp])
setTimer(CleanupTimer, Cleanup, 1.minute, repeat = true)
// defer subscription to LeaderChanged to avoid some jitter when
// starting/joining several nodes at the same time
- cluster.registerOnMemberUp(self ! StartLeaderChangedBuffer)
+ // cluster.registerOnMemberUp(self ! StartLeaderChangedBuffer)
}
override def postStop(): Unit = {
@@ -363,6 +364,10 @@ class ClusterSingletonManager(
startWith(Start, Uninitialized)
when(Start) {
+ case Event(MemberUp(m), _) if m.address == cluster.selfAddress ⇒
+ self ! StartLeaderChangedBuffer
+ stay
+
case Event(StartLeaderChangedBuffer, _) ⇒
leaderChangedBuffer = context.actorOf(Props[LeaderChangedBuffer].withDispatcher(context.props.dispatcher))
getNextLeaderChanged()
@@ -568,6 +573,8 @@ class ClusterSingletonManager(
case Event(Cleanup, _) ⇒
cleanupOverdueNotMemberAnyMore()
stay
+ case Event(MemberUp(_), _) ⇒
+ stay
}
onTransition {
--
1.7.6
on 2013-01-27 14:21
By Patrik Nordwall
My suggested patch doesn't cover Up in the initial CurrentClusterState. Better to wait for 2.1.1.
on 2013-01-28 00:19
By Patrik Nordwall
Here is a working port to 2.1.0: https://gist.github.com/4649875
Works! Thanks!
[JVM-Node2] Run completed in 53 seconds, 28 milliseconds.
[JVM-Node2] Total number of tests run: 8
[JVM-Node2] Suites: completed 1, aborted 0
[JVM-Node2] Tests: succeeded 8, failed 0, ignored 0, pending 0
[JVM-Node2] All tests passed.
[JVM-Node6] Run completed in 52 seconds, 850 milliseconds.
[JVM-Node6] Total number of tests run: 8
[JVM-Node6] Suites: completed 1, aborted 0
[JVM-Node6] Tests: succeeded 8, failed 0, ignored 0, pending 0
[JVM-Node6] All tests passed.
[JVM-Node1] Run completed in 53 seconds, 57 milliseconds.
[JVM-Node1] Total number of tests run: 8
[JVM-Node1] Suites: completed 1, aborted 0
[JVM-Node1] Tests: succeeded 8, failed 0, ignored 0, pending 0
[JVM-Node1] All tests passed.
In the documentation there is this bit of code for getting the singleton actor:
case LeaderChanged(Some(leaderAddress)) ⇒
val path = RootActorPath(leaderAddress) / "user" / "singleton" / "consumer"
val consumer = system.actorFor(path)
But what if there is a delay in the handover? There could be a fair amount of time between the moment the cluster leadership changes and the moment the ClusterSingleton leadership changes. It might be useful to have another EventBus message that the ClusterSingletonManager publishes when the handover is complete.
on 2013-01-28 11:54
By Patrik Nordwall
I think that is out of scope for the ClusterSingletonManager, since it can be done better by the singleton actor itself. I think I wrote in the docs that it can publish its existence. Note that publishing to the event bus would only reach the actors subscribed to the event bus of that local actor system.
I can clarify in the docs what LeaderChanged followed by actorFor means. It could actually be better to illustrate that the leaderAddress is stored away on LeaderChanged, for use in a later actorFor.
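For example, something along these lines (ConsumerProxy is a hypothetical name, not from the docs):

import akka.actor.{ Actor, Address, RootActorPath }
import akka.cluster.Cluster
import akka.cluster.ClusterEvent.{ CurrentClusterState, LeaderChanged }

class ConsumerProxy extends Actor {
  val cluster = Cluster(context.system)
  var leaderAddress: Option[Address] = None

  override def preStart(): Unit = cluster.subscribe(self, classOf[LeaderChanged])
  override def postStop(): Unit = cluster.unsubscribe(self)

  def consumer = leaderAddress map { a =>
    context.actorFor(RootActorPath(a) / "user" / "singleton" / "consumer")
  }

  def receive = {
    case state: CurrentClusterState =>
      leaderAddress = state.leader // initial snapshot delivered on subscribe
    case LeaderChanged(leader) =>
      leaderAddress = leader // stored away for use in a later actorFor
    case msg =>
      consumer foreach { _ forward msg } // route to wherever the singleton currently lives
  }
}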
Ok. With regards to the EventBus being local, I was thinking that either a) the ClusterSingletonManager would send a message to all of the other ClusterSingletonManagers that the handover was complete and then they all would publish to their local event buses, or b) it would somehow interact with the clustering system to have the message published cluster-wide. But it's easy enough to implement (a) with my own actor running on each node.
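For reference, a relay for option (a) can be as small as this (HandoverComplete is a made-up message, and the sending side is elided):

import akka.actor.Actor

case object HandoverComplete

// runs at a well-known path on every node; the node that completes the
// handover sends HandoverComplete to each relay, which republishes it
// on that node's local event stream
class HandoverRelay extends Actor {
  def receive = {
    case HandoverComplete =>
      context.system.eventStream.publish(HandoverComplete)
  }
}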
on 2013-01-28 17:52
By Patrik Nordwall
good, thanks for the suggestion anyway
you can see the doc improvements here https://github.com/akka/akka/pull/1074