Investigate biased gossip
To reduce the number of network connections in a large cluster, we should investigate whether we can use biased gossip.
http://en.wikipedia.org/wiki/Gossip_protocol#Biased_gossip
I see no problem with it and it should be fairly simple to implement.
Heartbeats should follow the same paths. Heartbeating is slightly more difficult, because when rebalancing (changing buddies) you need to tell the monitor that you will stop heartbeating. Changing buddies for heartbeating shouldn't be done too often, because it resets the heartbeat history.
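A minimal sketch of what biased peer selection could look like, assuming each node already knows which peers count as "near" and which as "far". The function name, the near/far lists, and the `p_near` probability are hypothetical illustrations, not Akka API.

```python
import random

def pick_gossip_target(self_node, near, far, p_near=0.9, rng=random):
    """Pick one peer to gossip with, preferring nearby nodes.

    near / far are lists of peer identifiers. p_near is the (assumed)
    probability of choosing from the near list when both are non-empty.
    """
    # Never gossip with ourselves.
    near = [n for n in near if n != self_node]
    far = [n for n in far if n != self_node]
    if not near and not far:
        return None  # no peers available
    # Fall back to near peers when there are no far peers; otherwise bias.
    if not far or (near and rng.random() < p_near):
        return rng.choice(near)
    return rng.choice(far)
```

With `p_near` close to 1, most gossip stays local, while the occasional far pick still lets state spread across the whole cluster.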
on 2012-06-28 10:35 *
By Jonas Bonér
Ok. I'll read up on it.
on 2012-06-28 10:59 *
By Jonas Bonér
How would we automatically detect who is close and who is far away? Natural address ordering of the members in the node ring makes sure that nodes on the same machine are close, but that's about it. The next node in the ring can potentially be in another data center.
The only way (without some crazy attempt to base it dynamically on ping response time or so) is to allow the user to create node groups to tell us what his topology looks like.
This is something I have been thinking about before (think there is a ticket on that already).
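As a rough sketch of the node-group idea, assuming the user supplies a mapping from node address to group name: the mapping format, the group names, and the function below are all hypothetical, not an existing Akka setting.

```python
def split_by_group(self_node, members, groups):
    """Split members into (near, far) relative to self_node.

    groups maps node address -> user-assigned group name (hypothetical
    configuration). Nodes missing from the mapping are treated as far.
    """
    own = groups.get(self_node)
    near, far = [], []
    for m in members:
        if m == self_node:
            continue  # skip ourselves
        if own is not None and groups.get(m) == own:
            near.append(m)
        else:
            far.append(m)
    return near, far

# Example topology declared by the user (made-up addresses).
groups = {
    "10.0.0.1:2552": "dc-a",
    "10.0.0.2:2552": "dc-a",
    "10.1.0.1:2552": "dc-b",
}
members = ["10.0.0.1:2552", "10.0.0.2:2552", "10.1.0.1:2552", "10.2.0.1:2552"]
near, far = split_by_group("10.0.0.1:2552", members, groups)
```

Treating unlisted nodes as far is one possible default; it keeps an incomplete topology description from producing wrongly "near" peers.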
on 2012-06-28 11:00 *
By Jonas Bonér
on 2012-06-28 11:07 *
By Patrik Nordwall
We also have the configured seed nodes (deputy nodes) that should be in different datacenters.
on 2012-06-28 11:14 *
By Jonas Bonér
Regardless, I think this has very low prio, since 99% of all users who run Akka cluster will run it:
1. with few nodes, < 10
2. in a single fast data center, probably even the same rack
We should let the usage and customers drive stuff like this.
on 2012-06-28 11:16 *
By Patrik Nordwall
I totally agree, and that's why I used low prio.
on 2012-06-28 11:17 *
By Jonas Bonér
Good