Q-Routing Revisited (Part 2)

In Part 1 we recalled how Bellman-Ford, a distance-vector algorithm, finds shortest paths in a directed, weighted network. Now we want to understand in which respects Q-Routing [1] modifies Bellman-Ford.

Difference 1: path relaxation steps are performed asynchronously and online.

Difference 2: the metric used to describe path “quality”.
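Both differences can be seen in the Q-Routing update rule from Boyan and Littman [1]: instead of relaxing all edges in synchronized rounds against static link weights, a node updates a single estimate whenever it forwards a packet, using observed delays as the quality metric. Below is a minimal sketch of that update; the variable names and the table layout (`Q_x[d][y]` as node x's estimated delivery time to destination d via neighbor y) are my own illustrative choices, not taken from the paper verbatim.

```python
def q_routing_update(Q_x, d, y, q, s, t, alpha=0.5):
    """One asynchronous, online relaxation step (illustrative sketch).

    Q_x   -- node x's table: Q_x[d][y] estimates total delivery time
             to destination d when forwarding via neighbor y
    q     -- time the packet spent queued at x (observed)
    s     -- transmission time from x to y (observed)
    t     -- y's own best estimate to d, i.e. min over z of Q_y[d][z],
             reported back by y
    alpha -- learning rate

    Moves the estimate toward the sampled delay q + s + t.
    """
    old = Q_x[d][y]
    Q_x[d][y] = old + alpha * (q + s + t - old)
    return Q_x[d][y]

# Example: x currently estimates 10.0 time units to D via B; a packet
# just waited 2.0 in x's queue, took 1.0 to transmit, and B reports a
# best remaining estimate of 5.0. The estimate relaxes toward 8.0:
Q = {"D": {"B": 10.0}}
q_routing_update(Q, "D", "B", q=2.0, s=1.0, t=5.0, alpha=0.5)  # → 9.0
```

Note how the measured quantities q and s replace Bellman-Ford's fixed link weights: the "cost" of a hop is the delay the packet actually experienced, so the estimates track congestion as it changes.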

References

[1] Boyan, J. and Littman, M. (1994). Packet Routing in Dynamically Changing Networks: A Reinforcement Learning Approach.