Create topology-aware routers
Similar to Voldemort's Zones: https://github.com/voldemort/voldemort/wiki/Topology-awareness-capability
Would allow spreading the risk by routing to different instances on different racks or datacenters.
Also see Cassandra's RackAwareStrategy: http://wiki.apache.org/cassandra/Operations#Replication
Cassandra's config is like this, with host:port mapped to 'Data Center:Rack':
#Cassandra Node IP:Port=Data Center:Rack
192.168.1.200\:7000=dc1:r1
192.168.2.300\:7000=dc2:rA
Parsed by: http://svn.apache.org/repos/asf/cassandra/tags/cassandra-0.6.1/contrib/property_snitch/src/java/org/apache/cassandra/locator/PropertyFileEndPointSnitch.java
But I think I prefer Voldemort's scheme. Let's think about what is most useful in our environment and come up with our own.
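To make the idea concrete, here is a minimal sketch of zone-aware routee selection in the Voldemort style. All names (Routee, ZoneAwareSelector) are hypothetical, not any real Akka or Voldemort API: routees carry a zone tag, and the selector prefers routees in the caller's own zone, falling back to remote ones only when no local routee exists.

```scala
// Hypothetical sketch of zone-aware routing; Routee and ZoneAwareSelector
// are illustrative names, not part of any real API.
case class Routee(id: String, zone: String)

class ZoneAwareSelector(localZone: String, routees: Seq[Routee]) {
  // Split routees into same-zone (rack/datacenter) and remote ones.
  private val (local, remote) = routees.partition(_.zone == localZone)
  private var i = -1

  // Round-robin over the local zone; fall back to remote zones only
  // when no local routee is available.
  def select(): Routee = {
    val pool = if (local.nonEmpty) local else remote
    i = (i + 1) % pool.size
    pool(i)
  }
}
```

With routees in dc1 and dc2 and a local zone of dc1, every selection lands on a dc1 routee until dc1 has none left, which is the "optimize for the closest zones" behavior the Voldemort wiki describes.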
on 2012-06-16 11:17 *
By Jonas Bonér
Description changed from Similar to Voldermort's Zon... to Similar to Voldermort's Zon...
on 2012-06-16 12:50 *
By Jonas Bonér
Description changed from Similar to Voldermort's Zon... to Similar to Voldermort's Zon...
on 2012-06-16 15:34 *
By Helena Edelson
I'd like to do this feature, if I can steal it from Patrick and work with him.
Our use case relates to global network topology and availability zones (AZs). I've reviewed the linked resources above.
on 2013-07-14 18:20 *
By Helena Edelson
Started a Google Doc to outline an initial proposal - will share the link with Jonas/Patrik/Roland/Viktor on Monday, once general ideas from the linked resources and some real-world use cases are incorporated.
on 2013-07-14 19:26 *
By Helena Edelson
Jonas - here's the most recent parser version of the above snitch link:
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/locator/PropertyFileSnitch.java
I recommend considering these carefully at the conceptual level:
Cassandra's Ec2MultiRegionSnitch, which gossips the EC2 region:
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/locator/Ec2MultiRegionSnitch.java
Voldemort's zone routing strategy:
https://github.com/voldemort/voldemort/blob/master/src/java/voldemort/routing/ZoneRoutingStrategy.java
Cassandra's network topology snitch, which routes requests more efficiently by taking data center and rack into consideration:
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/locator/AbstractNetworkTopologySnitch.java
on 2013-07-15 07:29 *
By Patrik Nordwall
Sounds good
on 2013-07-15 14:34 *
By Helena Edelson
Assigned to set to Helena Edelson
Status changed from New to Accepted
on 2013-07-16 20:27 *
By Helena Edelson
I can start this in 2 weeks, will that work?
on 2013-08-10 17:37 *
By Helena Edelson
I finally have a moment free to start this. Here is an initial framework concept stubbed out, with no topology logic yet, just the general collection / gossip / delivery / router structures and selectors: https://github.com/helena/akka/commit/f9fc258da3c221f0f161f6a79eff70aba2ba280d
Let me know if this is the right strategy for the routers WRT holding the logic necessary.
- Helena
on 2013-08-10 17:42 *
By Patrik Nordwall
Great, I will look at it next week
on 2013-08-13 09:22 *
By Patrik Nordwall
Wrote some comments/questions in https://github.com/helena/akka/commit/f9fc258da3c221f0f161f6a79eff70aba2ba280d
on 2013-08-13 11:45 *
By Helena Edelson
(Comment removed)
on 2013-08-29 19:27 *
By Helena Edelson
(Comment removed)
on 2013-08-30 16:03 *
By Helena Edelson
(Comment removed)
on 2013-09-06 00:33 *
By Helena Edelson
This is the initial concept describing the Network Topology API, which will be used by the TopologyAwareRouter to select nearest neighbors.
https://github.com/helena/akka/commit/45956d1d4737cc75ced6b9301d71a4ee4e5adc9d
on 2013-09-09 10:44 *
By Patrik Nordwall
The two typical use cases for the router, stolen from Voldemort (https://github.com/voldemort/voldemort/wiki/Topology-awareness-capability):
1) "Now due to possible high latency between these zones we want the routing strategy in the client to optimize for the closest zones."
2) "Now since these zones may be geographically distributed we can expect network partitions to be very common. Hence we would like every store to have some amount of redundancy in every zone."
I looked at the new suggestion, and talked a bit with the other guys at the office.
I'm not sure what the best way is to define these things in the configuration.
I think we have seen two alternatives so far:
1) flat structure of zones with proximity-list
2) hierarchical structure, like EC2
I'm not sure which one is best. Perhaps the hierarchical is more convenient after all, but then it should not be fixed to 2 levels (regions, availability zones).
In addition to the host name (wildcard) matching we could later add matching of cluster node role, which makes it possible at startup of a node to "place" it in a zone without depending on a specific hostname schema.
This feature should be born in the contrib area, since we want to be able to evolve it more easily based on usage feedback.
http://doc.akka.io/docs/akka/2.2.1/contrib/index.html
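To make the two alternatives concrete, here are hedged config sketches. The key names are hypothetical, not an agreed format:

# 1) Flat structure of zones with proximity-list
zones {
  dc1 { proximal-to = [dc2, dc3] }
  dc2 { proximal-to = [dc1, dc3] }
  dc3 { proximal-to = [dc2, dc1] }
}

# 2) Hierarchical structure, like EC2, but not fixed to 2 levels
regions {
  eu-west {
    zones {
      eu-west-1a { }
      eu-west-1b { }
    }
  }
}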
on 2013-09-09 14:07 *
By Helena Edelson
Yes:
1) high latency between zones - awareness of nearest-neighbor (data center, availability zones..)
2) redundancy in every zone - this part is done in terms of failover: instances per data center, data centers per geographical region
What it defines now is, per geographical partition: Region > DataCenters and their proximal DCs > for each DC, its deployed instances (so that we can map an instance (cluster node) to its DataCenter). I added this yesterday but it is not ready to push.
This also allows mapping only those instances per DataCenter that are up, applying the failure detector strategy to know when a data center goes down, removing it from the proximal DCs for a given DC, and periodically checking for when it comes back online.
That all said, this has to cover cloud, rack, and neither of the two user types - i.e., the non-cloud, single-location deployment in one data center.
The latter case means I could add a configuration to describe a flat topology, but even so that could fall into what we have thus far:
topology {
  # One region (geographical partition) - all local to a NOC
  paris-dc-1 {
    zone-id = 0
    # List these in order of proximity as this implicitly becomes their weight.
    # Defaults to proximal-to = []
    proximal-to = [1]
    instances = [....]
  }
  # Same location, multiple DCs
  paris-dc-2 {
    zone-id = 1
    proximal-to = [0]
    instances = [....]
  }
}
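A minimal sketch of how the proximal-to ordering could drive failover, assuming the config above has been parsed into values. DataCenter and selectTarget are illustrative names, not an agreed API:

```scala
// Hypothetical failover selection over a proximal-to ordering like the
// config above; the ordering is treated as the implicit weight.
case class DataCenter(zoneId: Int, proximalTo: List[Int], up: Boolean)

object Failover {
  // Prefer the local DC if it is up; otherwise walk its proximal-to list
  // in order and take the first DC that is up.
  def selectTarget(local: DataCenter, all: Map[Int, DataCenter]): Option[DataCenter] =
    if (local.up) Some(local)
    else local.proximalTo.flatMap(all.get).find(_.up)
}
```

When a DC is marked down by the failure detector, callers simply see it skipped in favor of the next proximal DC; when it comes back online, flipping its up flag restores the original preference without any reconfiguration.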
If just one datacenter, failovers are to deployed instances which we can map to in the cluster for those that have joined.
Distilled, this ultimately needs to be flexible but hierarchical IMO.
Optimally, it would be nice to offer a configurable strategy type - by data center or by rack, but i think the light partition wrapper around these of by region is important and will become more so over time.
This could simply be:
config match {
  case "rack"       => RackStrategy
  case "dataCenter" => DataCenterStrategy
}
on 2013-09-09 14:08 *
By Helena Edelson
I see that contrib has the dep for cluster, I will move it there.
on 2013-09-09 14:35 *
By Helena Edelson
I see common logic for node awareness between the new ClusterNetworkTopology and the existing ClusterMetricsCollector actors.
I'm going to push to GitHub an idea that describes a new ClusterMemberAware base trait for actors to mix in, where the watching of Up/Downed etc. node logic would reside (what is currently in the ClusterMetricsCollector), so that any actor can leverage it without duplication.
The ClusterNetworkTopology would use this to track nodes that are up in its DC and pass them to the topology-aware router.
This would provide a flexible topology vs a static one.
Mapping to node roles is actually almost done in my branch.
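The mixin idea can be sketched without the real Akka cluster event types. The trait and method names here are hypothetical, and members are reduced to plain address strings:

```scala
// Hypothetical sketch of a ClusterMemberAware mixin: it only tracks which
// member addresses are up, so topology or metrics logic can reuse it.
// This deliberately avoids the real Akka cluster API.
trait ClusterMemberAware {
  private var upMembers = Set.empty[String]

  def memberUp(address: String): Unit = {
    upMembers += address
    onMembersChanged(upMembers)
  }

  def memberDowned(address: String): Unit = {
    upMembers -= address
    onMembersChanged(upMembers)
  }

  // Hook for mixers such as a ClusterNetworkTopology to react to changes.
  def onMembersChanged(up: Set[String]): Unit = ()
}

// Example mixer: collects the up nodes to hand to a topology-aware router.
class NetworkTopologyTracker extends ClusterMemberAware {
  var current = Set.empty[String]
  override def onMembersChanged(up: Set[String]): Unit = current = up
}
```

The point of the trait is that the up/down bookkeeping lives in one place, and both the metrics collector and the topology actor only override the change hook.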
- Helena
on 2013-09-10 15:50 *
By Patrik Nordwall
Sounds like you have a good plan.
on 2013-12-10 16:40 *
By Patrik Nordwall
Moving back to middle column due to inactivity.