Internet Routing: What is the Problem and how do we evaluate it?
Most people seem to believe that Internet routing is to some degree
broken. Still, pinpointing what is broken seems to be rather difficult.
One key characteristic of routing is that there are many
scenarios, each with its own individual challenges. For example:
- All major ISPs have to determine their routing policies,
implement them in their operations environment, and then
realize them using the various components of their network and
management infrastructure, including routers and databases.
- Any multihomed customer has to work out what is
necessary to make multihoming actually work.
- An Internet researcher may not want to be bound by the
commercial relationships between the providers and may
therefore establish an overlay network.
- ....
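As a concrete illustration of the first scenario, many ISPs follow the widely described practice of preferring routes learned from customers over those learned from peers or providers. The following is a minimal sketch of such a preference policy; all names and values are illustrative, not any vendor's actual configuration language:

```python
# Sketch of a common ISP route-preference policy: customer-learned
# routes beat peer-learned routes, which beat provider-learned routes.
# Preference values and route fields here are made up for illustration.

PREFERENCE = {"customer": 300, "peer": 200, "provider": 100}

def best_route(routes):
    """Pick the route with the highest local preference;
    break ties by shorter AS path."""
    return max(routes,
               key=lambda r: (PREFERENCE[r["learned_from"]],
                              -len(r["as_path"])))

routes = [
    {"prefix": "192.0.2.0/24", "learned_from": "provider",
     "as_path": [64500]},
    {"prefix": "192.0.2.0/24", "learned_from": "customer",
     "as_path": [64510, 64511]},
]
print(best_route(routes)["learned_from"])  # customer wins despite longer path
```

The point of the sketch is that the policy itself is simple; the difficulty lies in realizing it consistently across thousands of routers and keeping it consistent with the policy databases.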
Internet routing is a global optimization problem within the control
plane of the Internet. So far it has been solved in a piecemeal
fashion: people derive solutions that are "optimal"/"workable"
for their specific situations. Nevertheless, the current Internet
routing architecture works rather well. It is just questionable whether
it is an optimal solution to the global optimization problem. But where
are the shortcomings of this solution, and do they actually matter?
Judging from the interest of ISPs, vendors, and researchers, routing-
related questions matter. On the other hand, most users may very well
think that the problem does not matter: most packets reach their
destination most of the time within a reasonable time. The complexity is
hidden from the user, and most of the time the users are reasonably
happy.
But what happens if packets are not properly delivered? At this point
we realize, among other problems, that the Internet control plane is
neither capable of debugging itself nor of providing humans with
sufficient details for debugging it. This can only be changed with a
measurement infrastructure embedded within the routing architecture.
Furthermore, some of the problems are due to human errors, bad
configuration practices, and a lack of consistency checks within the
control plane. These call not just for local checks but for global
controls built into the architecture.
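One form such a global consistency check could take is validating every announced prefix against a shared authorization database, in the spirit of route-origin validation. The sketch below assumes a hypothetical database mapping prefixes to authorized origin ASes; the entries are invented for illustration:

```python
# Sketch of a global consistency check: does the origin AS of an
# announcement match an authorization database? The database
# contents below are hypothetical examples.

import ipaddress

# hypothetical authorization database: prefix -> authorized origin ASes
AUTHORIZED = {
    ipaddress.ip_network("192.0.2.0/24"): {64500},
    ipaddress.ip_network("198.51.100.0/24"): {64501, 64502},
}

def check_announcement(prefix, origin_as):
    """Classify an announcement as 'valid', 'invalid', or 'unknown'."""
    net = ipaddress.ip_network(prefix)
    for auth_net, origins in AUTHORIZED.items():
        if net.subnet_of(auth_net):
            return "valid" if origin_as in origins else "invalid"
    return "unknown"

print(check_announcement("192.0.2.0/24", 64500))  # valid
print(check_announcement("192.0.2.0/24", 64999))  # invalid
```

Note that such a check is only as good as the database behind it, which is exactly why the question of who maintains and protects that data is part of the architectural problem.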
But there are also more fundamental questions regarding the Internet
routing architecture:
- Should routing be static, or dynamic based on the amount of traffic?
- Should routing be hierarchical, or should it support virtual overlays?
- Should routing be controlled by the end-user or by the ISP?
If there is more than one routing layer, what are their interactions,
and how do they interact with the user workload and the user
performance requirements?
This last question points out that one has to consider many, sometimes
conflicting, tradeoffs in order to design an alternative routing
strategy:
- local control vs. global fault analysis
(which information is provided, where is it processed, who can
access it, and who can use it?)
- fast updates vs. stability
(This question has two aspects:
- for static/dynamic routing: when is a link considered faulty?
- for dynamic routing: what timescale is appropriate given a workload?)
- how can we enable choices of algorithm and parameters while
maintaining simplicity?
- how does one find a tradeoff between performance/resilience/cost?
- how does one formulate policies, check them for validity, and
protect the various databases against intruders?
- simplicity vs. features
- end-user vs. operator control
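The fast-updates-vs-stability tradeoff has a well-known concrete instance: route-flap damping, where a route that changes too often is temporarily suppressed. The following is a simplified sketch in the style of BGP damping (RFC 2439); all parameter values are illustrative, not recommendations:

```python
# Simplified route-flap damping: each flap adds a penalty, the penalty
# decays exponentially over time, and a route whose penalty crosses a
# threshold is suppressed until the penalty decays below a reuse level.
# Thresholds and half-life below are illustrative only.

PENALTY_PER_FLAP = 1000.0
SUPPRESS_THRESHOLD = 2000.0
REUSE_THRESHOLD = 750.0
HALF_LIFE = 15.0 * 60  # seconds

class DampedRoute:
    def __init__(self):
        self.penalty = 0.0
        self.last_update = 0.0
        self.suppressed = False

    def _decay(self, now):
        dt = now - self.last_update
        self.penalty *= 0.5 ** (dt / HALF_LIFE)
        self.last_update = now

    def flap(self, now):
        """Record a withdrawal/re-announcement at time `now` (seconds)."""
        self._decay(now)
        self.penalty += PENALTY_PER_FLAP
        if self.penalty >= SUPPRESS_THRESHOLD:
            self.suppressed = True

    def usable(self, now):
        """A suppressed route becomes usable again once its penalty
        has decayed below the reuse threshold."""
        self._decay(now)
        if self.suppressed and self.penalty < REUSE_THRESHOLD:
            self.suppressed = False
        return not self.suppressed

r = DampedRoute()
r.flap(0); r.flap(60); r.flap(120)  # three flaps in two minutes
print(r.usable(121))                # False: route is suppressed
print(r.usable(60 * 60))            # True: penalty has decayed
```

The tension is visible in the parameters themselves: a short half-life reacts quickly to real failures but rewards instability, while a long one buys stability at the cost of slow recovery — precisely the choice the list above asks about.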
In summary, I believe that we neither understand the basic design space
in which to develop an Internet routing architecture, nor the
evaluation criteria with which to judge our success or failure, nor the
process of realizing the architecture in a controllable manner, nor a
process for checking its operation.