Our Routing Problems Have Not Yet Begun
Bruce M. Maggs
Computer Science Department
Carnegie Mellon University
and
Akamai Technologies
Abstract
The current routing protocols have many well-known deficiencies. Yet the Internet as a
whole has proven to be remarkably stable, and core capacity has been scaling with demand,
indeed has perhaps outpaced demand, so that end users are seeing better performance today
than ever. This paper argues that because of this spare capacity, the consequences of the
flaws in the protocols have not yet been truly experienced. It then argues that short-term
reversals in the ratio of capacity to demand are plausible, and that these reversals might
engender serious routing problems.
Successes and failures of BGP
The most commonly cited problems with the routing protocols involve BGP. (See [HM00] for a
thorough introduction.) BGP, which governs the routes taken by datagrams that travel
between different autonomous systems, provides no effective mechanisms for guaranteeing
quality of service or optimizing performance (in terms of latency and throughput). Support
for load balancing, adapting to rapid changes in traffic patterns, and filtering malicious
traffic ranges from minimal to none. Furthermore, in practice, routing policies may be
influenced by financial considerations, and the manual entry of router configuration data
is common.
Perhaps it is surprising, then, that there have been only a few isolated incidents in which
major network outages have occurred. Human configuration error has been to blame for several.
For example, in the summer of 2000, a BGP configuration error led routers at a Level3
facility in San Diego to advertise short routes to the rest of the Internet, temporarily
diverting an unsupportable traffic load to this facility. Later, in June 2001, Cable and
Wireless intentionally and abruptly refused to peer with PSINet (for financial reasons),
isolating many users on PSINet. (To those who advocate fully automatic configuration,
however, it would be wise to remember the adage, "To err is human, but to really foul
things up requires a computer." [E78])
But the success stories outweigh these incidents. Although it is too early to assess the impact
of the recent large-scale power outage in the Eastern portion of the United States, there
are few initial reports of core network outages or even web-site infrastructure outages. A
National Research Council report on the impact of the September 11, 2001, attacks in the
United States [NRC02] also showed that the routing protocols adjusted properly to the
physical destruction of network infrastructure in New York City, and the Internet as a
whole continued to perform well (although certain news-oriented web sites were unable to
satisfy demand). Addressing BGP more specifically, although certain worms such as "Code
Red" and "Slammer" (or "Sapphire") have generated enough malicious network traffic to
distract routers from their primary functions and disrupt BGP TCP connections (forcing
table retransmissions and resulting in "churn") [C03b,M03], none of these worms have caused
widespread route instability. Perhaps most interestingly, many BGP routers throughout the
world were patched and restarted one night in the spring of 2002, after the release of a
Cisco security patch, and yet network routing was not disrupted.
Operating in the dark
Before arguing that circumstances may soon arise in which the weaknesses of BGP may begin
to have more serious consequences, however, it is important to first observe that our
ability to predict the future behavior of network traffic and routing behavior is limited.
Indeed, it may be fair to say that we cannot even accurately characterize the behavior of
network traffic today, except to say that it has been known to change rapidly and
drastically. Examples include the rapid rise in http traffic after the introduction of the
Mosaic and Netscape browsers, the recent boom in "SPAM" email, the explosion of
file-sharing services, and the heavy traffic loads generated by worms and viruses and the
corresponding patches. (More than half of Akamai's top-ten customers, ranked by total
number of bytes served, use Akamai to deliver virus and worm signatures and software
updates.) This is not to say that there have not been effective approaches to
understanding network traffic. The study by Saroiu et al. [SGGL02], for example, paints a
detailed picture of the types of flows entering and exiting the University of Washington's
network, and points out recent growth in traffic attributed to file-sharing services. But
this study may not be representative of Internet traffic at large. For example, it fails
to capture VPN traffic between enterprise office facilities (and many other sorts of
traffic). BGP behavior has also been studied extensively. As a well-done representative
of this type of work, Maennel et al. [MF02] study BGP routing instabilities.
But it would be difficult to find consensus among networking experts on answers to the
following sorts of questions.
1. Where (if anywhere) is the congestion in the Internet?
2. How much capacity does the Internet have, and how fast is it growing?
3. How much traffic does the core of the Internet carry today, and what does it look like?
4. How fast is network traffic growing?
5. What will traffic patterns look like five years from now?
6. Can we scale the network to support the demands of users five years from now?
7. How much does it cost, and how much will it cost, to increase network capacity?
8. Will stub networks soon be employing sophisticated traffic engineering mechanisms on
their own, e.g., those based on multihoming and overlay routing? What impact might these
techniques have?
9. What about content delivery networks? What fraction of the traffic are they carrying?
What is the impact of the trick of using DNS to route traffic?
These questions have, of course, been studied. Regarding the first question, the
"conventional wisdom" has been that congestion occurs primarily in "last mile" connections
to homes and enterprises. Cheriton [C03] and others have argued that the abundance of
"dark fiber" in the United States will provide enough transmission capacity for some time
to come. A recent study by Akella, et al. [ASS03], however, found that up to 40% of the
paths between all pairs of a diverse set of hosts on the Internet had at most 50 Mbps of spare
capacity. These "bottlenecks" were most commonly seen on tier-two, -three, and -four
networks, but 15% appeared on tier-one networks. The study indicates that regardless of
fiber capacity, there is already congestion in the core. Perhaps router capacity is a more
limited resource.
The second question has been addressed by Danzig, who has periodically estimated network
capacity and traffic load. His estimates of cross-continental capacity are surprisingly
low.
The coming crises?
Despite the caveats about our understanding of the state of the network today, let us make
the assumptions that the core of the Internet is, in many places, running at close to
capacity, and (more easily supported) that the last-mile remains a bottleneck for many end
users. How might a routing crisis ensue? Suppose that there is a rapid increase (perhaps
two orders of magnitude) in the traffic generated by end users. Such a scenario would be
driven by end user demand and greatly improved last-mile connectivity. As we have argued,
new applications (the web, file-sharing services, etc.) have in the past periodically
created large new traffic demands. Furthermore, these demands have arisen without abrupt
technology changes. What the new applications might be is difficult to predict. There are
many possible applications that could utilize high-quality video, but we have yet to see
enough last-mile connectivity to support them. In South Korea, where the penetration of
"broadband" to the home is more widespread than in the United States, network gaming
applications have become a significant driver of traffic. Whatever the source, it seems
plausible that great increases in demand will continue to punctuate the future. On the
last-mile front, upgrading capacity is likely to prove expensive, but is certainly
technically feasible. Let us assume there is great demand from end users for improved
connectivity (two orders of magnitude), and that end users are willing to pay for this
access connectivity into their homes and businesses. Increasing end-user bursting capacity
will increase the potential for drastic changes in traffic patterns.
If such a scenario takes place, carriers will be faced with the task of scaling
their networks, requiring increases in both transmission capacity and switching capacity.
Predominant traffic patterns may also shift, requiring capacity in new places. The
carriers will presumably price data services to cover the expense of this new
infrastructure, and will make an effort to match increases in traffic demand with
increases in capacity.
So what might go wrong? As the carriers attempt to increase capacity, they will (as they
have in the past) try to avoid building-in excessive margins of spare capacity. But
predictions about where capacity is needed, and how much, may prove difficult. There are
many unknown variables, and they have the potential to swing rapidly. How quickly will
traffic demand grow? How will traffic patterns change? Will new applications behave
responsibly? How will the ratio of capacity-and-demand-at-the-edge to
capacity-required-in-the-core change? How much will it cost to increase capacity in the
core? As our scenario unfolds, let us assume that, due to the difficulties in predicting
these variables, growth in demand and growth in core capacity occasionally fall out of
kilter, so that demand bumps up against capacity, and large parts of the core of the
Internet operate for weeks or perhaps months at a time at or near capacity.
Now the routing problems set in
Imagine the problems a largely saturated core would cause. BGP provides no mechanism for
routing around congestion. Networks might find themselves effectively isolated from each
other, even if, through proper load balancing, congestion-free routes are available.
High-priority traffic would fare no better. BGP itself might have difficulty functioning.
Manual attempts to reduce congestion through BGP configuration would increase the risk
of routing outages.
Directions for future research
The above discussion suggests a number of directions for future research. To ward off
problems in the short-to-medium term, we should further improve our understanding of how
the Internet currently operates so that we can make better short-term predictions. We
should analyze the behavior of the Internet with a saturated core, and determine what can
be done using the current protocols and practices to alleviate the problems that would
arise. Longer term, we need to replace BGP and most likely the interior protocols as well,
and consider modifying the Internet architecture too (as suggested by Zhang et al. [MZ03],
and surely many others). Of course replacing a universally adopted protocol like BGP is no
easy task, but it seems risky to continue with a protocol that is not designed to perform
well in extreme situations. Performance optimizations must be integral to such a protocol.
It is difficult to design, tune, or improve protocols or build networks, however, without a
good understanding of how networks operate in practice. Hence measurability should be a
goal as well (as suggested by Varghese and others). Most importantly, we should decide how
we want the Internet to behave in the future, and build accordingly.
References
[ASS03] A. Akella, S. Seshan, and A. Shaikh, An Empirical Evaluation of Wide-Area Internet
Bottlenecks, in Proceedings of the First ACM Internet Measurement Conference, October 2003,
to appear.
[C03] D. Cheriton, The Future of the Internet: Why it Matters, Keynote Address (SIGCOMM
2003 Award Winner), SIGCOMM 2003 Conference on Applications, Technologies, Architectures
and Protocols for Computer Communication, September, 2003.
[C03b] G. Cybenko, Presentation at DARPA Dynamic Quarantine Industry Day, March 2003.
[HM00] S. Halabi and D. McPherson, Internet Routing Architectures, second edition, Cisco Press, 2000.
[M03] B. M. Maggs, Presentation at DARPA Dynamic Quarantine Industry Day, March 2003.
[NRC02] The Internet Under Crisis Conditions: Learning from September 11, National Research
Council, Washington, DC, December 2001.
[SGGL02] S. Saroiu, K. Gummadi, S. D. Gribble, and H. M. Levy, An Analysis of Internet
Content Delivery Systems, in Proceedings of the Fifth Symposium on Operating Systems Design
and Implementation, December, 2002.
[MF02] O. Maennel and A. Feldmann, Realistic BGP Traffic for Test Labs, in Proceedings of
the 2002 SIGCOMM Conference on Communications Architectures and Protocols, August, 2002.
[E78] Paul Ehrlich, Farmers' Almanac, 1978.
Does the Complexity of the Internet Routing System Matter?
(and if so, why)
----------------------------------------------------------
The advent of the MP_REACH_NLRI and MP_UNREACH_NLRI
attributes, combined with the resulting generalization to
the BGP framework (i.e., consider the use of extended
communities [EXTCOMM] to provide route distinguishers
and/or route targets [RFC2547BIS]) have created the
opportunity to use BGP to transport a wide variety of
features and their associated signaling (the combination
of a BGP feature and its associated signaling is
sometimes called an "application"). Examples include flow
specification rules [FLOW], auto-discovery mechanisms for
Layer 3 VPNs [BGPVPN], and virtual private LAN services
[VPLS]. However, the use of BGP as a generalized
feature transport infrastructure has generated a great
deal of discussion in the IETF community [IETFOL].
This debate has focused on the potential trade-offs
between the stability and scalability of the Internet
routing system, and the desire on the part of service
providers to rapidly deploy new services such as IP VPNs
[RFC2547BIS]. The debate has recently intensified due to
the emergence of a new class of services that use the BGP
infrastructure to distribute what may be considered
"non-routing information". Examples of such services
include the use of the BGP infrastructure as an
auto-discovery mechanism for Layer 3 VPNs [BGPVPN] and
the virtual private LAN services mentioned above.
The problem, then, can be framed in terms of how we think
about the deployed BGP infrastructure. In particular, the
various positions can be summarized as follows:
o BGP is a General Purpose Transport Infrastructure
The General Purpose Transport Infrastructure position
asserts that BGP is a general purpose feature and
signaling transport infrastructure, and that new
services can be thought of as applications built on
this generic transport. Proponents of this position see
the issue as not whether the attributes (features and
signaling) that need to be distributed are part of some
particular class (routing, in this case), but rather
whether the requirements for the distribution of these
attributes are similar enough to the requirements for
the distribution of inter-domain routing
information. Hence, BGP is a logical candidate for such
a transport infrastructure, not because of the
("non-routing") information distributed, but rather due
to the similarity in the transport requirements. There
are other operational considerations that make BGP a
logical candidate, including its close to ubiquitous
deployment in the Internet (as well as in intranets),
its policy capabilities, and operator comfort levels
with the technology.
o BGP is a Special Purpose Transport Infrastructure
The proponents of the other position, namely, that the
BGP infrastructure was designed specifically and
implemented to transport "routing information", are
concerned that the addition of various other
non-routing applications to BGP will destabilize the
global routing system. The argument here is two-fold:
First, there is the concern that the plethora of new
features being added to BGP will cause software quality
to degrade, thereby destabilizing the global routing
system. This position is based upon well-understood
software engineering principles, and is strengthened by
long-standing experience that there is a direct
correlation between software features and bugs
[MULLER1999]. This concern is augmented by the fact
that the mere existence of the code for these features,
even if unused, can also destabilize the routing system,
since in many cases these bugs cannot be isolated.
A second concern is based on complexity arguments,
notably that the increase in complexity of BGP and the
types of data that it carries will inherently
destabilize the global routing system. This is based on
several different lines of reasoning, including the
Simplicity Principle [RFC3439], and the concern that
the interaction of the dynamics and deployment
practices surrounding the simplest form of BGP, IPv4
BGP, is poorly understood. Finally, a related concern
is that the addition of these non-routing data types
will affect convergence and other scaling properties of
the global routing system.
The question is, then, what is the effect on the global
routing system of using the BGP distribution protocol to
transport arbitrary data types, versus the effect in
terms of the additional cost (e.g., in protocol
development, code, and operational expense) associated
with not utilizing the mechanisms already present in BGP?
More importantly, does it matter, and if so, why?
[BGPVPN] Ould-Brahim, H., E. Rosen, and Y. Rekhter, "Using
BGP as an Auto-Discovery Mechanism for
Provider-provisioned VPNs",
draft-ietf-l3vpn-bgpvpn-auto-00.txt, July,
2003. Work in Progress.
[EXTCOMM] Sangli, S., D. Tappan, and Y. Rekhter, "BGP
Extended Communities Attribute",
draft-ietf-idr-bgp-ext-communities-06.txt. Work
in Progress.
[FLOW] Marques, P., et al., "Dissemination of flow
specification rules",
draft-marques-idr-flow-spec-00.txt, June,
2003. Work in Progress.
[IETFOL] https://www1.ietf.org/mailman/listinfo/routing-discussion
[MULLER1999] Muller, R., et al., "Control System Reliability
Requires Careful Software Installation
Procedures", International Conference on
Accelerator and Large Experimental Physics
Control Systems, 1999, Trieste, Italy.
[RFC2547BIS] Rosen, E., et al., "BGP/MPLS IP VPNs",
draft-ietf-l3vpn-rfc2547bis-00.txt, May, 2003,
Work in Progress.
[RFC3439] Bush, R. and D. Meyer, "Some Internet
Architectural Guidelines and Philosophy", RFC
3439, December, 2002.
Routing Problems are Too Easy to Cause, and Too Hard to Diagnose
================================================================
IP routing protocols, such as OSPF or BGP, form a complex,
highly-configurable distributed system underlying the end-to-end
delivery of data packets. "Highly configurable" is a nice way of
saying "hard to configure" or "easy to misconfigure," and "distributed
system" is a nice way of saying "hard to understand" or "hard to
debug." As such, we have a routing system today where a single
typographical error by a human operator can easily disconnect parts of
the Internet, and diagnosing and fixing routing problems remains an
elusive black art. This is unacceptable for any technology that would
be considered a core communication infrastructure. I believe that
the networking research community should devote significant attention
to improving the state of the art in router configuration and network
troubleshooting.
Several factors conspire to make IP router configuration extremely challenging:
- Vendor configuration languages are primitive and low-level, like
assembly language (e.g., a typical router may have ten thousand lines
of configuration commands)
- Routers implement numerous complex protocols (e.g., static routes,
RIP, EIGRP, IS-IS, OSPF, BGP, MPLS, and various multicast protocols)
that have many tunable parameters (e.g., timers, link weights/areas,
and BGP routing policies)
- The routing protocols interact with each other (e.g., "hot-potato"
routing in BGP based on the underlying IGP, use of static routes to
reach the remote BGP end-point, and route injection between protocols)
- Scalability often requires even more complex configuration to limit
the scope of routing information (e.g., OSPF areas and summarization,
BGP route reflectors and confederations, and route aggregation)
- Networks are configured at the element (or router) level, rather than
as a single cohesive unit with well-defined policies and constraints
- Key network operations goals, such as traffic engineering and
security, are not directly supported, requiring operators to tweak the
router configuration in the hope of having the right (indirect) effect
on the network and its traffic
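One of the factors above, element-level configuration without a network-wide view, can be made concrete with a toy checker. The sketch below assumes a hypothetical parsed representation of configurations (router name mapped to its set of configured iBGP peers); real vendor configurations would first have to be parsed from their low-level command languages:

```python
# Toy network-wide configuration check: without route reflectors or
# confederations, iBGP requires a full mesh of peerings inside an AS.
# The parsed-config representation here is hypothetical.

def check_ibgp_full_mesh(configs):
    """Return (router, missing_peer) pairs that violate the full mesh."""
    violations = []
    routers = set(configs)
    for router, neighbors in configs.items():
        for peer in sorted(routers - {router}):
            if peer not in neighbors:
                violations.append((router, peer))
    return violations

configs = {
    "r1": {"r2", "r3"},
    "r2": {"r1", "r3"},
    "r3": {"r1"},   # forgot to peer with r2: routes may silently fail to propagate
}
print(check_ibgp_full_mesh(configs))  # -> [('r3', 'r2')]
```

A real tool would check many such constraints at once (consistent link weights, matching session parameters, policy symmetry) against the configurations of every router together, which is exactly the network-as-a-single-unit view that element-level configuration lacks.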
Addressing these complicated problems will require research work in
configuration languages, protocol modeling, and network modeling, and
would hopefully lead to a higher level of abstraction for managing the
configuration of the network as well as tools for configuration
checking and, better yet, automation of configuration from a
higher-level specification of the network goals. Extensions (or
replacements!) of the routing protocols may also be necessary to
rectify some of these problems.
Detecting, diagnosing, and fixing routing problems are also very
complicated because:
- Routing protocols are hard to configure, making configuration
mistakes very common (see above!)
- Routing protocols do not convey enough information to explain why a
route has changed (or disappeared entirely)
- No authoritative record exists that can identify which routes are
valid (e.g., whether the originating AS is entitled to advertise the
prefix, or whether one AS should be providing transit service from one
AS to another)
- Failures, configuration errors, or malicious acts in remote
locations can affect the path between two hosts
- Reachability problems can arise for other reasons, unrelated to the
routing protocols (e.g., packet filtering or firewalls, MTU mismatches,
network congestion, and overloaded or faulty end hosts)
- The end-to-end forwarding path depends on the complex interaction between
multiple routing protocols running in a large collection of networks
- Route filtering and route aggregation (often necessary for scalability) can
lead to subtle reachability problems, including persistent forwarding loops
- The network does not have much support for active measurement tools
for measuring the forwarding path (i.e., traceroute is very primitive,
and limited in its accuracy and potential uses)
- The Internet topology is not fully known, at the router or the AS
levels (or in terms of AS relationships and policies), and may be
inherently unknowable
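To illustrate what an authoritative record of valid routes could enable, the sketch below validates an advertisement's origin AS against a hypothetical registry. The prefixes and AS numbers are illustrative, and no complete registry of this kind exists today, which is precisely the problem:

```python
# Hypothetical authoritative record: prefix -> set of ASes entitled to
# originate it (documentation prefixes and private AS numbers used).
AUTHORIZED_ORIGINS = {
    "192.0.2.0/24": {64500},
    "198.51.100.0/24": {64501, 64502},  # legitimately multi-homed
}

def validate_origin(prefix, origin_as):
    """Classify an advertisement as 'valid', 'invalid', or 'unknown'."""
    if prefix not in AUTHORIZED_ORIGINS:
        return "unknown"   # no record: cannot tell a new route from a hijack
    if origin_as in AUTHORIZED_ORIGINS[prefix]:
        return "valid"
    return "invalid"       # possible misconfiguration or prefix hijack

print(validate_origin("192.0.2.0/24", 64500))    # -> valid
print(validate_origin("192.0.2.0/24", 64999))    # -> invalid
print(validate_origin("203.0.113.0/24", 64500))  # -> unknown
```

The hard part is not the lookup but populating and securing the record itself, and deciding what to do with the "unknown" majority while the record is incomplete.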
Like router configuration, network troubleshooting has received little
attention from the research community, despite its importance to
network practitioners. Research work in network support for
measurement, extensions to routing protocols to facilitate diagnosis,
and new diagnostic tools would be extremely valuable for improving the
state of the art.
Position statement for WIRED
Z. Morley Mao
UC Berkeley
zmao@eecs.berkeley.edu
How to debug the routing system?
================================
Problem:
--------
Today, network operators have very limited tools to debug routing
problems. Only primitive tools such as traceroute and ping are
commonly used to identify existing routing behavior. There is very
little visibility into the routing behavior of other ISPs' networks
from a given ISP's perspective, making it even more difficult to
identify the culprit of any routing anomalies. This also means that it
is difficult to predict the impact any routing policy change has on
the global routing behavior. Oftentimes, routing problems are noticed
only after a customer complains about reachability or severe
degradation of performance. There is a lack of proactive, automated
analysis that detects routing problems at an early stage, when they
may not yet be obvious and result only in suboptimal and unintended
routes. Diagnosing Internet
routing problems often requires analysis of data from multiple vantage
points.
Proposed solutions:
-------------------
(1) Build routing assertions, so that nothing fails silently. When a
network operator configures a network, it is important to create a set
of assertions, equivalent to integrity constraints in a database or
assertions in software programs. These capture the expected behavior
of the routing protocols in terms of which routes are allowed, the
resulting attributes of the routes, etc. These constraints can be
checked dynamically by a route monitor.
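A minimal sketch of what such dynamically checked assertions might look like, assuming a hypothetical feed of parsed route observations; the assertion set, prefixes, and AS numbers are all illustrative:

```python
# Each assertion encodes one piece of the operator's expected routing
# behavior; the monitor reports every violation, so nothing fails silently.

def assert_max_path_length(route, limit=10):
    return len(route["as_path"]) <= limit

def assert_expected_origin(route, expected=frozenset({64500})):
    return route["as_path"][-1] in expected  # origin AS is the path's last hop

def check(route, assertions):
    """Return the names of the assertions this route violates."""
    return [a.__name__ for a in assertions if not a(route)]

route = {"prefix": "192.0.2.0/24", "as_path": [64510, 64999]}
print(check(route, [assert_max_path_length, assert_expected_origin]))
# -> ['assert_expected_origin']
```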
(2) Cooperation among networks
Each network builds a measurement repository to collect data from
multiple locations. It builds a profile of the expected routing
behavior to quickly identify any deviations using statistical
techniques. Cooperation across networks is absolutely necessary to
diagnose global Internet routing problems. It is a challenge to
provide summaries of measurement data at a sufficiently detailed level
to be useful without revealing sensitive information about the
internals of ISPs' networks. A complementary approach is to allow
special distributed queries of the detailed network data from multiple
vantage points without direct access to the data.
(3) Scalable distributed measurement interpretation and measurement
calibration
Routing measurements (e.g., BGP feeds) can result in significant data
volumes, and it may be infeasible to perform real-time or online interpretation
of such measurement data by combining all the data from multiple
locations in distinct networks at a centralized location. Distributed
algorithms are useful to interpret measurement results locally and
then aggregate them intelligently to identify routing anomalies.
Interpreting measurements can be challenging, as there is a lack of
global knowledge of topologies and policies, which can arbitrarily
translate a given measurement input signal into the observed output
signals. We propose the use of calibration points to help identify
expected or normal routing behavior and correlate the output with the
input. Calibration points are well-controlled active measurement
probes with known measurement input. The BGP Beacons work is one such
example of an attempt to understand the patterns of output for a known
input routing change.
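The local-interpretation-then-aggregation idea can be sketched as follows; per-prefix update counts stand in (deliberately simplistically) for the local summary, and the churn threshold is illustrative:

```python
from collections import Counter

def local_summary(updates):
    """Run at each vantage point: collapse a raw feed to per-prefix counts."""
    return Counter(prefix for prefix, _ in updates)

def aggregate_and_flag(summaries, threshold=5):
    """Run centrally on the compact summaries, never on the raw feeds."""
    total = Counter()
    for s in summaries:
        total.update(s)
    return sorted(p for p, n in total.items() if n >= threshold)

vp1 = local_summary([("192.0.2.0/24", "announce")] * 3
                    + [("198.51.100.0/24", "announce")])
vp2 = local_summary([("192.0.2.0/24", "withdraw")] * 3)
print(aggregate_and_flag([vp1, vp2]))  # -> ['192.0.2.0/24']
```

Only the small per-prefix summaries cross network boundaries, which also limits how much internal detail each ISP must reveal.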
(4) Internet-wide emulation for network configurations
The impact of a single routing configuration change, caused for
example by a policy change, could be global; thus, it is important to
emulate the behavior in advance to study its impact. It is useful to abstract
the routing behavior in a single network at a higher level to study
the perturbation on the global routing system. Currently, the routing
configuration is done at a device level. Higher-level programming
support is needed to provide semantically more meaningful
configuration of networks. Predicting the output of a routing
configuration implicitly assumes that routing is deterministic.
However, nondeterministic routing may be more stable by preferring
routes that have been in the routing tables the longest. Such
tradeoffs are important to study.
(5) Understanding the interaction of multiple routing protocols and
implementation variants
Internet routing consists of multiple protocols, e.g., interdomain,
intradomain routing protocols, and MPLS label distribution protocol.
All these protocols interact to achieve end-to-end routing behavior
from an application's point of view. It is critical to understand
their dependency on each other. For instance, in BGP/MPLS IP VPNs,
the label distribution protocol is needed to set up label switched
paths across the network and if that is unsuccessful, BGP cannot find
a route. There is similar dependence of BGP on OSPF or IS-IS.
Implementation variants among router vendors determine routing
dynamics in ways that are poorly understood. The interaction among the
variants may result in unexpected behavior and needs to be studied.
(6) Understanding routing "politics"
When a customer complains about routing problems either in terms of
reachability or poor performance, it typically is in the context of
some applications. Network operators install route filters in the
routers to determine which routes to accept in calculating the best
path to forward traffic. Packet filters at the routers are much more
flexible in the sense that they determine which packets are accepted
for forwarding based on attributes of the packets, e.g., port numbers,
protocol types. Given a route in one's routing table received from
one's upstream provider, there is no guarantee that all application
traffic can reach the destination due to the presence of packet
filters. Some networks, for instance, perform port-based filtering to
protect against known worm traffic. When debugging routing problems,
one needs to take the application's perspective to understand which
types of application traffic are correctly forwarded.
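This application-level view of reachability can be sketched as a computation over the packet filters along a path. The per-hop filter sets and port numbers are illustrative (1434 is the UDP port Slammer exploited, and 135 one of the ports Blaster used):

```python
def reachable_ports(path_filters, ports):
    """Ports that survive every hop's packet filters along the path.

    A route in the routing table guarantees none of this: packet
    filters decide which application traffic actually gets through.
    """
    blocked = set().union(*path_filters) if path_filters else set()
    return sorted(set(ports) - blocked)

# Hypothetical per-hop blocked-port sets along the forwarding path
path_filters = [{1434}, {135, 1434}]
print(reachable_ports(path_filters, [80, 135, 443, 1434]))  # -> [80, 443]
```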
How to improve the application performance?
-------------------------------------------
Problem:
--------
Today, the Internet has no performance guarantees for real-time or
delay-sensitive applications, such as VoIP and gaming, especially if
traffic goes across multiple networks. To obtain flexible routing in
terms of control over cost and performance of network paths, end users
resort to either multihoming to multiple networks or overlay routing.
However, studies have shown that there may be potential adverse
interaction between application routing and traffic engineering at the
IP layer. Multihoming, similarly, is not a perfect solution as it does
not directly translate to paths with performance guarantees, has
little impact on how incoming traffic reaches the customers, and may
further amplify the amount of routing traffic during convergence.
Proposed solution:
------------------
The application is king: correlate routing with the forwarding plane,
and evaluate and improve in the context of application performance
metrics: delay, loss rate, and jitter.
When studying routing protocol performance, researchers often use
convergence delay as a universal metric. However, it does not
translate directly into the metrics applications care about, e.g., delay,
loss rate, and jitter. Understanding the stability of such
measurements as a function of the network topology and time provides a
way for overlay routing algorithms to intelligently route around
network problems. Application performance measurements also expose the
detailed interaction between the dynamics of the forwarding plane and
the control plane.
How to protect the routing system?
----------------------------------
Problem:
--------
There have been relatively few studies on protecting the Internet
routing infrastructure against attacks. Vulnerabilities in router
architectures are relatively unknown and have not been widely
exploited. The routing system can also be indirectly affected due to
enormous traffic volume. Recently, there have been a large number of
worms exploiting end-host OS vulnerabilities. Significant attack traffic
volume causes router sessions to time out. Session resets result in
the exchange of entire routing tables and disruption of routing. Cascaded
failures can occur if the session reset traffic subsequently causes
router overload and affects other peering sessions.
Proposed solution:
------------------
(1) Understanding vendor implementation of routing protocols
Through detailed black-box testing and support from vendors, one can
better understand obscure router behavior that is not documented in
RFCs, and its implications for router security.
(2) Understanding vulnerability points on the Internet
Network topology and policy information are becoming more widely known
through various Internet mapping efforts. Such mapping helps us
discover vulnerability points by analyzing failure scenarios.
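As a toy illustration of this kind of analysis, the sketch below takes
a mapped topology (a four-AS graph invented for the example) and flags
any AS whose failure disconnects the rest of the graph:

```python
# Hypothetical vulnerability-point analysis: remove each node in turn
# and test whether the remaining graph stays connected.
from collections import deque

def reachable(adj, start, removed):
    """BFS over adj from start, skipping the removed node."""
    seen, q = {start}, deque([start])
    while q:
        u = q.popleft()
        for v in adj.get(u, ()):
            if v not in seen and v != removed:
                seen.add(v)
                q.append(v)
    return seen

def cut_vertices(adj):
    """Nodes whose removal disconnects the remaining nodes."""
    nodes = list(adj)
    cuts = []
    for r in nodes:
        rest = [n for n in nodes if n != r]
        if len(reachable(adj, rest[0], r)) < len(rest):
            cuts.append(r)
    return cuts

# AS1 -- AS2 -- AS3, AS2 -- AS4: AS2 is a single point of failure
adj = {"AS1": ["AS2"], "AS2": ["AS1", "AS3", "AS4"],
       "AS3": ["AS2"], "AS4": ["AS2"]}
print(cut_vertices(adj))
```

Real mapping data would of course be far larger and would also need
policy information, since reachability on the Internet is constrained
by business relationships, not just by physical connectivity.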
(3) Higher priority for routing traffic
The delay and loss of routing traffic, especially keepalive HELLO
messages, can cause sessions to reset. This can occur when data
traffic is heavy. Increasing the queuing and processing priority of
routing packets in routers is one way to reduce the impact of
bandwidth attacks on the routing system.
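The idea can be sketched with a strict two-class priority queue (the
class names and line-card abstraction are invented; real routers offer
much richer queuing disciplines):

```python
# Sketch of strict priority queuing: routing-protocol packets always
# dequeue before data packets, so keepalives survive a flood.
import heapq

ROUTING, DATA = 0, 1  # lower value = higher priority

class PriorityLineCard:
    def __init__(self):
        self._q = []
        self._seq = 0  # tie-breaker preserving FIFO within a class

    def enqueue(self, pkt_class, payload):
        heapq.heappush(self._q, (pkt_class, self._seq, payload))
        self._seq += 1

    def dequeue(self):
        return heapq.heappop(self._q)[2]

card = PriorityLineCard()
card.enqueue(DATA, "attack-flood-1")
card.enqueue(ROUTING, "BGP KEEPALIVE")
card.enqueue(DATA, "attack-flood-2")
print(card.dequeue())  # the keepalive jumps the flood
```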
(4) Automated dynamic installation of packet and route filters
The attack against windowsupdate.com was averted just in time by
invalidating the relevant entry in the DNS system, which takes at
least 24 hours to propagate a change globally. Reacting to attacks in
real time requires a faster, automated mechanism. One possibility is
to dynamically install appropriate packet and route filters across a
selected set of networks to eliminate or reduce the impact of an
attack. Routers have limited memory for such filters, and the order
of the filters determines which routes or packets are actually
permitted. We need to study efficient algorithms for computing such
filters on the fly.
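The order-sensitivity of filters is easy to see in a toy first-match
model (prefix matching is reduced to string prefixes here; real
routers use ACL sequence numbers and longest-prefix semantics):

```python
# Sketch of why filter order matters: with first-match semantics, the
# same rules in a different order permit different packets.

def first_match(filters, addr):
    """filters: ordered list of (prefix, action); first match wins."""
    for prefix, action in filters:
        if addr.startswith(prefix):
            return action
    return "permit"  # assumed default action

deny_then_permit = [("10.0.", "deny"), ("10.", "permit")]
permit_then_deny = [("10.", "permit"), ("10.0.", "deny")]

print(first_match(deny_then_permit, "10.0.1.1"))  # deny
print(first_match(permit_then_deny, "10.0.1.1"))  # permit
```

Computing a filter set on the fly therefore means choosing not only
which rules fit in limited memory, but also the order in which they
are installed.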
ON BGP MUTATIONS
================
The collapse of the Internet has been predicted many times. Some
researchers and practitioners have damned BGP, and proposals for
replacing it are resounding throughout the community. But before
proposing changes to existing protocols, we should understand the
origins of today's problems. We have to grasp the design decisions,
the interactions, and the scalability limitations of the current
implementations. For BGP, this in-depth understanding is clearly not
yet present.
In the following I would like to examine three areas in which BGP may,
and should, evolve in the next few years. These areas can be viewed as
short-, mid-, and long-term goals.
1. Vendor implementation issues
(or: convergence and scalability questions)
Convergence times in the Internet are still on the order of several
minutes. Given BGP's critical importance, and compared to telephone
networks, this is no longer acceptable!
But the protocol does not have to be changed to improve convergence.
The limiting factors are vendor-specific implementation details,
settings of timers and parameters, and overloaded routers
[see Appendix A].
To pick one example, consider the propagation of updates in I-BGP
through a series of route reflectors (RRs): updates will be delayed
by approximately 10 seconds per RR by the MRAI timer. Changing this
timer setting, or changing the network design to reduce the number of
cascaded RRs an update has to pass, will speed up convergence without
protocol modifications. This example leads to the second area:
2. Human-factor issues
(or: misconfiguration questions)
Network design, like router configuration, is not a trivial task
(e.g., [Caldwell03]). Human errors in router configuration and
network design therefore happen every day [Mahajan02].
Various homegrown tools and approaches exist (e.g., see presentations
at operator forums such as [NANOG]). Still, research needs to focus
more on solutions that minimize the potential for error. Tools and
accurate databases are desperately needed here, but no changes to the
protocols are necessary to minimize human errors.
On the other hand, it is known that certain configuration mistakes can
lead to BGP oscillations (e.g., [RFC3345]). The current approach is
patchwork that fixes bugs as they occur. This is not acceptable, and
we need protocol enhancements, which leads to the third area:
3. Protocol-design issues
(or: protocol divergence, inter-domain TE, etc. questions)
One beloved feature of BGP is that it is completely configurable
through policies, but Tim Griffin has shown that today's MED
oscillations are just the tip of the iceberg and that BGP can reach
diverging states on a much larger scale (e.g., [Griffin99]).
There are further demands from the market that cannot be satisfied
with the current version of BGP, including inter-domain
equal-cost multipath, "online" inter-domain traffic engineering (a la
routeScience), and so on. None of this will be possible as long as
the best-path decision process of a router selects only one best
route. Furthermore, additional information about the causes and
origins of routing instabilities would help operators locate and
debug routing problems.
Even though the list above does not claim to be exhaustive, it is
clear that some enhancements to BGP will be unavoidable!
How will BGP evolve?
Quite logically, vendors mainly implement the features the market is
expected to buy (e.g., MPLS/VPNs). From my perspective, the three
areas mentioned above are not very attractive to vendors (a low
cost-benefit ratio), but they are important for the future of the
Internet. That is why these areas need support from research in order
to evolve.
To approach these problems, we need an in-depth understanding of
protocol details, router limitations, interactions between protocols,
and propagation patterns through the topology. Research should start
by answering questions from the following categories:
1. Protocol analysis
Identify the root causes and the location of triggering events.
Investigate interactions between routing protocols and the topology.
Example questions: How can the AS that originated an update be
identified? How many updates are due to which kinds of events?
2. Equipment scalability tests
Understand the scaling limitations of today's equipment before
judging the deployment of additional features.
Example questions: How long does an update spend inside a router
(under given load conditions)? How much additional load can
inter-domain traffic engineering or a lower MRAI value impose on a
router?
3. Simulation
Use network simulation to understand how routing updates traverse
the network. Investigate the interactions of various timers, of
policies, and between IGPs and BGP.
Example questions: How can BGP be implemented so that the number of
"dispensable" updates (caused by interconnectivity and timers) is
limited?
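As a deliberately tiny instance of such a simulation question, the
sketch below counts the updates a router emits during path exploration
after a withdrawal, with and without MRAI-style batching (the backup
paths are invented):

```python
# Toy count of "dispensable" updates: when a prefix is withdrawn, a
# router may announce each successively longer backup path before the
# final withdraw (path exploration). Batching updates over an MRAI
# interval collapses the intermediate announcements.

def updates_sent(candidate_paths, batch):
    """candidate_paths: backup paths tried in order after a withdraw.
    Without batching, every exploration step is announced; with
    batching, only the final state leaves the router."""
    if batch:
        return 1                      # one coalesced withdraw
    return len(candidate_paths) + 1   # each step plus the withdraw

paths = [["AS2", "AS5"],
         ["AS3", "AS4", "AS5"],
         ["AS6", "AS7", "AS8", "AS5"]]
print(updates_sent(paths, batch=False))  # 4 messages without batching
print(updates_sent(paths, batch=True))   # 1 message with batching
```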
BGP is a protocol that has evolved over more than 15 years. Most
importantly, network operators have full control over all settings
and their route distributions.
My conclusion regarding the future of BGP is that many of the
problems we have with today's routing are fixable within BGP and
should be fixed soon, and furthermore that enrichments (e.g.,
optional add-ons) to BGP are not only necessary but unavoidable! A
replacement protocol, on the other hand, will have a hard time in the
market.
Therefore "mutations" are possible, but a replacement will be crushed
by "natural selection". This, from my point of view, is the part of
evolutionary theory to which BGP is subject.
References
----------
[Griffin99] T. G. Griffin and G. Wilfong, "An analysis of BGP
convergence properties," in Proc. ACM SIGCOMM,
September 1999.
[RFC3345] D. McPherson, V. Gill, D. Walton, and A. Retana, "Border
Gateway Protocol (BGP) Persistent Route Oscillation
Condition," Request for Comments 3345, August 2002.
[Mahajan02] R. Mahajan, D. Wetherall, and T. Anderson, "Understanding
BGP Misconfiguration," in Proc. ACM SIGCOMM, August 2002.
[NANOG] The North American Network Operators' Group,
http://www.nanog.org/
[Caldwell03] D. Caldwell, A. Gilbert, J. Gottlieb, A. Greenberg,
G. Hjalmtysson, and J. Rexford, "The cutting EDGE of IP
router configuration," unpublished report, July 2003.
------------------------------------------------------------------------
Appendix A: Example, "the MRAI fight"
-------------------------------------
A critical factor in BGP update distribution is the Minimum Route
Advertisement Interval (MRAI) and the way it is implemented in router
software. The basic idea behind this timer is to first collect all
updates arriving from different peers and pass on only one "best"
update. The RFC suggests that after an update for a prefix has been
sent to a peer, there should be a (jittered) delay of 30 seconds
before another update for the same prefix may be sent to the same
peer. This indeed limits the number of BGP messages that need to be
exchanged. We note, however, that certain vendor-specific
implementations differ considerably from the recommendation in the
RFC and therefore produce a significantly different propagation
picture. Here are two examples:
From our current understanding of Cisco's MRAI implementation, there
are two major differences with respect to the RFC. The first is that
the timer is implemented on a per-peer basis instead of a per-prefix
basis. Scalability reasons rule out an implementation per peer and
prefix, but as a consequence almost ALL outgoing updates will be
delayed - not just two consecutive updates (close in time and
belonging to one prefix)! That means that each and every update is
queued and only propagated when the timer expires. The second
difference is that MRAI holds back withdrawals as well as
announcements. This is a major cause of the observed BGP
path-exploration phenomenon.
Our current understanding is that the MRAI counterpart on Junipers is
called "out-delay" [https://www.juniper.net/techpubs/software/junos/
junos57/swconfig57-routing/html/bgp-summary32.html] and is disabled
by default. That means Juniper does not hold back any BGP update
messages. This indeed speeds up convergence, but at the risk that
many more updates will be sent - which in turn triggers more damping.
The trade-off in this fight is between faster propagation and more
protocol messages. It is clear that in today's Internet more protocol
messages would lead to more damping, which does not improve
convergence. Even in a fictional Internet without damping, more
protocol messages would burn more CPU time. Future research therefore
has to show whether faster propagation is desirable (given today's
CPU speeds) or not (because of scalability considerations).
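The two sides of the fight can be put in back-of-the-envelope form
(the per-RR delay and exploration-step counts below are illustrative,
not measurements):

```python
# Sketch of the MRAI trade-off: with a per-peer MRAI, an update
# crossing n cascaded route reflectors accumulates roughly a full
# interval of delay per RR (~10s, as discussed above); with out-delay
# disabled it propagates immediately, but every path-exploration step
# becomes a separate message.

def added_delay(n_reflectors, per_hop_s=10.0):
    """Extra propagation delay through n cascaded route reflectors."""
    return n_reflectors * per_hop_s

def messages(n_exploration_steps, mrai_enabled):
    """MRAI coalesces exploration steps into a single update."""
    return 1 if mrai_enabled else n_exploration_steps

print(added_delay(4))                   # ~40s through 4 cascaded RRs
print(messages(5, mrai_enabled=False))  # 5 messages without MRAI
print(messages(5, mrai_enabled=True))   # 1 coalesced message with MRAI
```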
A Case for RIP (Re-architecting the Internet Protocols)
Tom Anderson
University of Washington
September 2003
This position paper starts from the premise that we are not in
control. The primary determining factors for how Internet
routing will evolve over the next decade are the long term
trends in the relative cost-performance of communication,
computation, and human brainpower. Academic research can help
optimize solutions to match these trends, but it can't buck
them. Even the tussles between competing vendors and interest
groups, issues that can have substantial impact in the short
term, are over the long term steamrollered by technology
trends.
What are these trends? Averaged over the past 30 years, wide
area communication has improved in cost-performance at roughly
60% per year. While prices are never simply a direct
reflection of costs, reflecting the ebb and flow of monopoly
positions, over the long term they track fairly closely. And
it is this long term improvement in cost-performance, rather
than any intrinsic nature of the Internet, which drives the
long term trends in Internet usage and operations. For
example, the transmission bandwidth for an hour-long
TV-quality teleconference would have cost $500 a decade ago,
while 10 years from now it will cost a nickel. Of course this
difference will result in a vast increase in the amount of
multimedia content distributed over the Internet.
While the long term improvement in WAN cost-performance seems
impressive, it pales compared to computing, local area
communication, and DRAM (each of which has improved at between
80-100% per year for the past 30 years). Moore's Law gets the
publicity (the 60% per year improvement in circuit density),
but that figure misses a key factor - volume manufacturing.
Roughly ten billion microprocessors were manufactured last
year, compared to only a handful of wide area communication
line cards; thirty years ago, the numbers were closer to
parity. High volume technologies have a significant long term
edge in cost-performance. While a gap of 20-40% may not seem
like much in any given year, over the long term it adds up to
about an order of magnitude per decade. (To the extent that
prices diverge from costs, it is accentuating this effect -
the Internet is a less efficient market than CPUs and DRAM,
and thus is scaling even less quickly in the near term.)
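The compounding arithmetic behind these claims is easy to check
(growth rates as stated above; the sketch below is just the
arithmetic, not a forecast):

```python
# Check the compounding claims: a 20-40% annual gap in cost-performance
# growth accumulates to roughly an order of magnitude per decade, and a
# ~60%/yr improvement turns $500 into about a nickel over twenty years.

def decade_gap(fast_rate, slow_rate, years=10):
    """Accumulated advantage of the faster-improving technology."""
    return ((1 + fast_rate) / (1 + slow_rate)) ** years

print(round(decade_gap(0.80, 0.60), 1))  # 80%/yr vs 60%/yr -> ~3.2x
print(round(decade_gap(1.00, 0.60), 1))  # 100%/yr vs 60%/yr -> ~9.3x
print(round(500 / 1.60 ** 20, 2))        # $500, twenty years on -> ~$0.04
```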
One consequence is that the Internet was designed for a far
different world than the one we have today or will have in ten
years. Thirty years ago, human time was cheap, and
computation and communication were expensive. Today's
Internet, and increasingly so in the future, is one where
humans are expensive, wide area communication is cheap, and
computation is virtually free. Indeed, the Internet became
possible at the point that computation became cheap enough
that we could afford to put a computer at the end of every
wide area link - that is, at the point that computation and
communication reached parity. The Internet would not have
been feasible, purely from a cost standpoint, in 1960. Even
fifteen years ago, TCP congestion control was carefully
designed to minimize the cycles needed to process each packet;
few would claim that TCP packet processing overhead is the
limiting factor for practical wide area communication today.
Recall that firewalls were considered too slow a decade ago;
today, they still are, but only for LAN traffic. These trends
will continue - activities such as routing overlays, link
compression, and traffic shaping, considered perhaps too slow
to be practical today, will eventually become commonplace.
This suggests that we should answer two questions. How will
the Internet evolve in response to these trends, and what can
we do as researchers to leverage them to make the Internet
more efficient, more reliable, and more secure? We make
several observations:
Ubiquitous optimization of backbone hardware. BGP is
explicitly designed for scalability over performance, and thus
is ill-suited for the kinds of optimizations that are likely
in the future. It is often impossible even to express optimal
policies in BGP. Similar problems occur at the intradomain
level; it is idiotic to have an architecture that requires
humans in the back room to twiddle link weights for good
performance. The research challenge will be how to adapt our
routing protocols to accommodate ubiquitous optimization.
Fortunately, networks will be run at the knee of the curve -
it makes no sense to run a network at high utilization if that
delays end users. The control theory problems of managing
traffic flows over large, heterogeneous networks become much
simpler at low to moderate utilization.
Cooperation as the common case. A widespread myth is that
Internet routing is dominated by competition - the "tussle"
between competing providers. In the short term, the tussle
seems paramount, but over the long term, delivering good
performance to end users matters, and that is only possible
when providers cooperate. Indeed, measurement studies have
shown that even today cooperation heavily influences the
selection of Internet routes. Unfortunately, BGP is
ill-designed for cooperation - even something as simple as
picking the best exit, as opposed to the earliest or latest,
is a management nightmare in BGP. How can we re-design our
protocols to make cooperation efficient, and unfriendly
behavior visible and penalized?
Accurate Internet weather. Many ISPs like to think of their
operations as proprietary, but information necessarily leaks
out about those operations along a number of channels. Recent
measurement work has shown that it is possible to infer almost
any property of interest, including latency, capacity,
workload, policy, etc. We believe an accurate hour by hour
(or even minute-by-minute) picture of the Internet can be
cost-effectively gathered from a network of vantage points.
Leveraging this information in routing and congestion control
design is a major research challenge.
Sophisticated pricing models. Pricing models will become much
more complex, both because we'll be able to measure and
monitor traffic cost-effectively at the edges of networks, and
because the character of traffic affects how efficiently we
can run a network. Smoothed traffic will be charged less than
bursty traffic, since it allows for higher overall utilization
of expensive network hardware with less impact on other users.
Internet pricing already reflects these effects at a
coarse-grained level, as off-peak bandwidth is essentially
free. The trend will be to do this at a much more
fine-grained level. Smoother traffic makes routing
optimizations easier, but perhaps the more interesting
question is how traffic shapers interoperate across domains to
deliver the best performance to end users - in essence, how do
we take the lessons we've learned from interdomain policy
management in BGP and apply them to TCP?
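A token-bucket shaper makes the smoothed-versus-bursty distinction
concrete. The sketch below (rate and bucket depth are arbitrary, and
it models no particular vendor's shaper) delays packets that exceed
the contracted rate:

```python
# Toy token-bucket shaper: a steady stream passes untouched, while a
# burst exhausts the bucket and its tail is delayed. Smoothed traffic
# therefore needs less peak capacity downstream.

def shape(arrival_times, rate=1.0, burst=2.0):
    """Return departure times for packets arriving at the given
    (sorted) times; rate is tokens/sec, burst the bucket depth."""
    tokens, now, out = burst, 0.0, []
    for t in arrival_times:
        if t > now:                                # refill while idle
            tokens = min(burst, tokens + (t - now) * rate)
            now = t
        if tokens < 1.0:                           # wait for a token
            now += (1.0 - tokens) / rate
            tokens = 1.0
        tokens -= 1.0                              # spend one token
        out.append(now)
    return out

print(shape([0, 1, 2, 3]))   # smooth stream: no packet is delayed
print(shape([0, 0, 0, 0]))   # burst of four: the tail is delayed
```

Pricing smoothed traffic below bursty traffic amounts to charging for
the burst parameter rather than only for the average rate.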
Interoperable boundary devices. Far from being "evil" and
contrary to the Internet architecture, boundary devices are a
necessary part of the evolution of the Internet, as the cost-performance
of computation scales better than that of wide area
communication. Even today, sending a byte into the Internet
costs the same as 10000 instructions (at least in the US, the
ratio for foreign networks is even higher). The challenge is
making these edge devices interoperate and self-managing - the
only way to build a highly secure, highly reliable, and high
performance network is to get humans out of the loop. The
end to end principle in particular is a catechism for a
particular technology age - instead of thinking of how a huge
number of poorly secured end devices can work together to
manage the Internet, we will instead ask how a smaller number
of edge devices can cooperate among themselves to provide
better Internet service to their end users.
High barriers to innovation. As we help evolve the Internet
to better cope with the challenges of the future, it is
important to remember that routers are a low volume product.
As is typical of any niche software system, this makes them
resistant to change, since engineering costs can dominate. As
researchers, we can help by redesigning protocols so that they
are radically easier to implement, manage, and evolve.
These observations and research challenges are animating our
work on RIP at UW.