Link-state routing convergence and stability: is there a trade off?
Cengiz Alaettinoglu
New service-sensitive applications require ever higher levels of network
availability. Current IGP restoration times are measured in seconds, much
better than the tens of seconds of a few years ago. However, this is still
not acceptable for many service-sensitive applications such as VoIP or
online gaming.
In theory, link-state routing restoration times can be as fast as a single
SPF computation (hundreds of microseconds to a few milliseconds) plus some
scheduling delay. However, such an implementation may not be
practical. Instead, implementations that achieve restoration within
propagation-delay time frames (tens to a few hundred milliseconds) are
within reach today.
Why, then, can current IGP deployments not achieve such convergence
times? Because, for very good reasons, there is a misconception of a
trade off between IGP convergence times and stability. To ensure
stability, implementations use timers that limit the effect of external
instability on the system. These timers definitely stand in the way of
fast convergence. However, when several ISPs tried to tune these timers
down to achieve fast convergence in the past, they experienced
network-wide meltdowns.
If so, why is this trade off a misconception? Because it is not a trade
off between convergence and stability in general; it exists only in
current IGP implementations. It is possible to avoid instability by
slowing down convergence only during link recovery, while still reacting
immediately to failures. Further protection can also be provided by
damping the SPF process.
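As a rough illustration of slowing convergence only during link recovery, the sketch below reacts to a link failure at once but holds a recovering link out of the topology until it has stayed up for a while. The class name, method names, and the 5-second hold value are hypothetical and chosen only for this example; they do not describe any particular router implementation.

```python
from dataclasses import dataclass, field


@dataclass
class LinkRecoveryDamper:
    """Illustrative only: fast on failure, slow on recovery."""
    up_hold_ms: int = 5000                           # assumed hold-down for recovering links
    in_use: set = field(default_factory=set)         # links currently used by SPF
    pending_up: dict = field(default_factory=dict)   # link -> time it may be readmitted

    def link_down(self, now_ms: int, link: str) -> bool:
        """Return True if SPF should run immediately."""
        self.pending_up.pop(link, None)   # cancel any pending readmission
        if link in self.in_use:
            self.in_use.remove(link)
            return True                   # forwarding topology changed: recompute now
        return False                      # link was never readmitted: nothing to do

    def link_up(self, now_ms: int, link: str) -> bool:
        """Recovery never triggers SPF directly; it only starts the hold-down."""
        if link not in self.in_use:
            self.pending_up[link] = now_ms + self.up_hold_ms
        return False

    def tick(self, now_ms: int) -> bool:
        """Readmit links whose hold-down expired; return True if SPF should run."""
        ready = [l for l, t in self.pending_up.items() if t <= now_ms]
        for l in ready:
            del self.pending_up[l]
            self.in_use.add(l)
        return bool(ready)
```

With this arrangement a flapping link costs one SPF run when it first fails; later up/down transitions inside the hold-down window are absorbed without further computation.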
Vendors have attempted to provide such protection by implementing adaptive
timers that limit how often the SPF process can run. However, since
these algorithms were implemented without realistic IGP measurements, in
the IGP deployments we studied they always delayed routing convergence.
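To make the idea of an adaptive SPF timer concrete, here is a minimal sketch of a generic exponential back-off throttle. It is not any vendor's algorithm; the parameter names and default values (initial_ms, hold_ms, max_ms) are assumptions made for illustration.

```python
class SpfBackoff:
    """Illustrative exponential back-off limiting how often SPF may run."""

    def __init__(self, initial_ms=50, hold_ms=200, max_ms=5000):
        self.initial_ms = initial_ms   # assumed delay before the first SPF run
        self.hold_ms = hold_ms         # assumed starting gap between SPF runs
        self.max_ms = max_ms           # assumed cap on the gap, also the quiet period
        self.current_hold = hold_ms
        self.last_run = None           # time of the most recently scheduled run

    def schedule(self, now_ms):
        """Return the time at which SPF runs for an event arriving at now_ms."""
        if self.last_run is None or now_ms - self.last_run > self.max_ms:
            # Network has been quiet: fall back to the short initial delay.
            self.current_hold = self.hold_ms
            run_at = now_ms + self.initial_ms
        else:
            # Recent activity: enforce the current gap, then double it.
            run_at = max(now_ms + self.initial_ms,
                         self.last_run + self.current_hold)
            self.current_hold = min(self.current_hold * 2, self.max_ms)
        self.last_run = run_at
        return run_at


throttle = SpfBackoff()
delays = [throttle.schedule(t) - t for t in (0, 60, 260, 10000)]
```

Note that in this sketch even an isolated event in a quiet network waits initial_ms before SPF runs, so the value chosen for that parameter directly adds to convergence time.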
Thus, what is needed to achieve fast convergence without sacrificing
stability is good damping algorithms that can separate unstable
components from stable ones and tune themselves to the conditions of the
network. This can only be done with careful measurement and analysis of
IGP routing protocols. What will be harder is winning back the trust of
ISPs once such algorithms have been implemented.