Mitigate blocking by name lookups when lookup fails
When remoting is bombarded with failing name lookups, messages may be dropped or delayed because each lookup blocks. Quoting Patrik:
"I started with writing a test to ensure that things are not exhausted when using broken connections. Unfortunately we have some more work to do. Failing test here: https://github.com/akka/akka/commit/cc1f85b3ba5c8523265dda7c13775fdaad463fbd"
A possible workaround is to gate addresses whose lookup failed. This disables associations to the failed address until the configured time elapses. It does not help, however, when each failing lookup is for a different address (I don't know how realistic that case is).
The long-term solution is to resolve names asynchronously; see #2591.
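
To make the gating workaround concrete, here is a minimal sketch. It is not Akka's implementation: `LookupGate`, `lookupFailed` and `isGated` are hypothetical names, and the gate duration is assumed to be a plain millisecond value supplied by the caller. The idea is only that a host whose lookup just failed is skipped, without another blocking lookup, until its gate expires.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical helper (not part of Akka): remembers hosts whose name lookup failed. */
public class LookupGate {
    // host -> System.nanoTime()-style deadline until which the host stays gated
    private final ConcurrentMap<String, Long> gatedUntilNanos = new ConcurrentHashMap<>();
    private final long gateNanos;

    public LookupGate(long gateMillis) {
        this.gateNanos = gateMillis * 1_000_000L;
    }

    /** Record a failed lookup; the host is gated for the configured duration from nowNanos. */
    public void lookupFailed(String host, long nowNanos) {
        gatedUntilNanos.put(host, nowNanos + gateNanos);
    }

    /** Returns true while the host is gated; expired entries are removed on access. */
    public boolean isGated(String host, long nowNanos) {
        Long until = gatedUntilNanos.get(host);
        if (until == null) {
            return false;
        }
        if (nowNanos - until >= 0) { // subtraction keeps the comparison safe if nanoTime wraps
            gatedUntilNanos.remove(host, until);
            return false;
        }
        return true;
    }
}
```

A caller would check `isGated(host, System.nanoTime())` before attempting the blocking lookup and call `lookupFailed` when the lookup throws. As noted above, this only helps for repeated failures on the same address: every distinct failing address still pays for one blocking lookup.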
on 2013-01-25 15:10 *
By Patrik Nordwall
I think the most important scenario to handle is this: when many nodes in the cluster become unreachable, it should not disturb communication with live nodes, even while we continue sending to the dead nodes. Name lookups for reconnects could be a problem, but those are perhaps cached.