system messages shall “eventually” be delivered
With remote supervision, something will get stuck if a Terminate() or ChildTerminated() message is lost. These messages probably need to be retransmitted in that case.
Also: remove publication of dead system messages to deadLetters once we have that guarantee in place, since they do not belong in user event handlers.
[I didn’t find it in my heart to delete this gem, so I’ll just move it down a bit]
possible scheme:
- supervisor sends Terminate()
- supervisor does not get ChildTerminated() after X time => reparent the child to DavyJones
- DavyJones periodically sends Terminate() and subscribes to DeathWatch for all its “children” and releases them once he gets any sort of reply
- supervisor forwards all ChildTerminated() he gets for “unknown” children to DavyJones
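
The scheme above could be sketched roughly as follows. This is a toy, single-threaded model in Python, not Akka code: the class names `Supervisor` and `DavyJones`, the `send_terminate` callback, the explicit `now` argument and the `timeout` value (the "X time" above) are all assumptions made for illustration, and DeathWatch is reduced to a plain `on_reply` call.

```python
class DavyJones:
    """Hypothetical holder for children whose ChildTerminated() never arrived."""

    def __init__(self, send_terminate):
        self._send_terminate = send_terminate  # assumed callback: child -> None
        self._pending = set()

    def adopt(self, child):
        # The supervisor gave up waiting and reparented the child to us.
        self._pending.add(child)

    def tick(self):
        # Periodic retry: re-send Terminate() to every child still pending.
        for child in sorted(self._pending):
            self._send_terminate(child)

    def on_reply(self, child):
        # Any sort of reply (ChildTerminated or a DeathWatch notification)
        # releases the child.
        self._pending.discard(child)

    @property
    def pending(self):
        return frozenset(self._pending)


class Supervisor:
    def __init__(self, davy_jones, send_terminate, timeout):
        self._davy = davy_jones
        self._send_terminate = send_terminate
        self._timeout = timeout
        self._stopping = {}  # child -> deadline for ChildTerminated()

    def stop_child(self, child, now):
        self._send_terminate(child)
        self._stopping[child] = now + self._timeout

    def check_timeouts(self, now):
        # No ChildTerminated() within the timeout: reparent to DavyJones.
        for child, deadline in list(self._stopping.items()):
            if now >= deadline:
                del self._stopping[child]
                self._davy.adopt(child)

    def on_child_terminated(self, child):
        if child in self._stopping:
            del self._stopping[child]
        else:
            # "Unknown" child: forward the notification to DavyJones.
            self._davy.on_reply(child)
```

In this model the supervisor never blocks on a lost reply; it only hands the child over, and the retrying is concentrated in one place.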
A solution that looks promising, but needs the new remoting:
- in the writer actor for each connection, wrap SystemMessage/Failed/Terminated inside sequenced envelopes which are retried/deduplicated
- unless I’m currently missing something, this should not observably violate ordering guarantees
- no holding of traffic necessary at any point, i.e. only the retry timer
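
A minimal sketch of the sequenced-envelope idea, assuming a cumulative acknowledgement scheme and a receiver that simply drops anything out of order. The names `SequencedSender` and `DedupReceiver` are hypothetical and the real writer actor would look different; the point is only that retry plus deduplication keeps delivery in order and exactly once.

```python
class SequencedSender:
    """Wraps payloads in (seq, payload) envelopes and keeps them until acked."""

    def __init__(self):
        self._next_seq = 0
        self._unacked = {}  # seq -> payload

    def send(self, payload):
        seq = self._next_seq
        self._next_seq += 1
        self._unacked[seq] = payload
        return (seq, payload)

    def on_ack(self, cumulative_seq):
        # Everything up to and including cumulative_seq has been delivered.
        for seq in [s for s in self._unacked if s <= cumulative_seq]:
            del self._unacked[seq]

    def retransmit(self):
        # Called by the retry timer: resend whatever is still unacknowledged.
        return sorted(self._unacked.items())


class DedupReceiver:
    """Delivers each envelope at most once, in sequence order."""

    def __init__(self):
        self._expected = 0
        self.delivered = []

    def receive(self, envelope):
        seq, payload = envelope
        if seq == self._expected:
            self.delivered.append(payload)
            self._expected += 1
        # seq < expected is a duplicate; seq > expected arrived too early.
        # Both are dropped here; the early one comes back via retransmission.
        return self._expected - 1  # cumulative ack, -1 if nothing delivered yet
```

For example, if the envelope carrying Terminate() is lost while the following one arrives, the receiver delivers nothing and acks -1; the next `retransmit()` resends both, and they are delivered in their original order.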
I have now implemented a retransmission scheme (a pair of smart buffers) with a sliding window and selective NACKs for efficiency. I stress-tested it, too.
These are only the building blocks; they are not hooked into the remoting yet. The code is also somewhat ugly at the moment. Next steps: cleanup, then integration with remoting.
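
To illustrate what a receive-side buffer with a sliding window and selective NACKs might look like, here is a small Python sketch. It is not the implementation described above: the class name `NackingReceiveBuffer`, the `window` parameter and the return shape are assumptions chosen for illustration.

```python
class NackingReceiveBuffer:
    """Holds out-of-order messages inside a window and reports the gaps."""

    def __init__(self, window):
        self._window = window
        self._next = 0   # next sequence number to deliver
        self._held = {}  # seq -> payload, received but not yet deliverable

    def receive(self, seq, payload):
        """Return (delivered_payloads, cumulative_ack, nacks)."""
        # Accept only sequence numbers inside the current window; older ones
        # are duplicates, newer ones are dropped and must be resent later.
        if self._next <= seq < self._next + self._window:
            self._held.setdefault(seq, payload)
        delivered = []
        while self._next in self._held:
            delivered.append(self._held.pop(self._next))
            self._next += 1
        # Selective NACKs: every missing seq below the highest one we hold.
        highest = max(self._held, default=self._next - 1)
        nacks = [s for s in range(self._next, highest) if s not in self._held]
        return delivered, self._next - 1, nacks
```

With selective NACKs the sender only has to resend the specific gaps, rather than everything after the last cumulative ack.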