investigate load-balancing routers based on latency or backlog size

from a blog post comment on http://letitcrash.com/post/56958418119/2-2-spotlight-adaptive-load-balancing-based-on-cluster:

Would latency or backlog size be better default metrics for adaptive routing? If the heap is large or the CPU usage is high, you might assume that the node is slow, but surely it's better to just measure whether the node is slow?

on 2013-08-05 10:17 *

By rkuhn

One option would be to base it on the latency distribution of the heartbeat (or the reliability of its reception), but that only measures aggregate values.

Another option would be to use actual ACKing to measure the latency or backlog of each routee and then favor the faster ones. Using the backlog would also automatically disfavor the unreliable routees: if a request or reply is lost, the ACK never arrives and that routee's backlog stays elevated, so it only gets a new message when the other routees are under load.
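
To make the backlog idea concrete, here is a minimal sketch in Python. It is not Akka code and does not use any Akka API; `BacklogRouter`, `route`, `ack` and the routee names are hypothetical, chosen only for illustration. It assumes that each routee ACKs every message it has processed and that the router counts unacknowledged messages per routee.

```python
class BacklogRouter:
    """Hypothetical router that favors the routee with the smallest backlog."""

    def __init__(self, routees):
        self.routees = list(routees)
        # Backlog = messages sent to a routee that have not been ACKed yet.
        self.outstanding = {r: 0 for r in self.routees}

    def route(self):
        # Pick the routee with the fewest unacknowledged messages;
        # ties go to the routee listed first.
        target = min(self.routees, key=lambda r: self.outstanding[r])
        self.outstanding[target] += 1
        return target

    def ack(self, routee):
        # Called when a routee confirms it has processed a message.
        if self.outstanding[routee] > 0:
            self.outstanding[routee] -= 1


# Simulation: "flaky" loses every message (no ACK ever comes back),
# "healthy" ACKs each message before the next one is routed.
router = BacklogRouter(["flaky", "healthy"])
sent = {"flaky": 0, "healthy": 0}
for _ in range(10):
    target = router.route()
    sent[target] += 1
    if target == "healthy":
        router.ack(target)

print(sent)  # {'flaky': 1, 'healthy': 9}
```

In this run the flaky routee receives one message, never ACKs it, and is then skipped because its backlog stays at 1 while the healthy routee keeps draining to 0. It would only be picked again once the healthy routee's backlog grew past 1, which matches the "only when the other ones are under load" behavior described above. A real implementation would also need timeouts to expire lost messages, which this sketch leaves out.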