Networking 101: Understanding BGP Routing
Border Gateway Protocol (BGP) can be
critical for successful enterprise network administration. Brush up with
our primer.
The Border Gateway Protocol (BGP) is the routing protocol of the Internet: it
carries routing information between the networks that make up the Internet.
For that reason, it's a pretty important protocol, and it can also be the
hardest one to understand.
From our overview of Internet routing, you should realize that routing in the
Internet consists of two parts: the fine-grained routing inside each
autonomous system (AS), managed by an IGP such as OSPF, and the
interconnections between those autonomous systems, managed by BGP.
Who needs to understand BGP?
BGP
is relevant to network administrators of large organizations which connect to
two or more ISPs, as well as to Internet Service Providers (ISPs) who connect
to other network providers. If you are the administrator of a small corporate
network, or an end user, then you probably don't need to know about BGP.
BGP basics
- The current version of BGP is BGP version 4, based on RFC 4271.
- BGP is a path-vector protocol that provides routing information for autonomous systems on the Internet via its AS-Path attribute.
- BGP is an application-layer protocol that runs on top of TCP (port 179). It is much simpler than OSPF, because it doesn’t have to worry about the things TCP will handle.
- Peers that have been manually configured to exchange routing information will form a TCP connection and begin speaking BGP. There is no discovery in BGP.
- Medium-sized businesses usually get into BGP for the purpose of true multi-homing for their entire network.
- An important aspect of BGP is that the AS-Path itself is an anti-loop mechanism. Routers will not import any routes that contain themselves in the AS-Path.
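The AS-Path loop check in the last point can be sketched in a few lines of Python. This is a minimal illustration, not a router implementation: the function names are hypothetical, and the AS numbers come from the range reserved for documentation.

```python
def accepts_route(local_as, as_path):
    """Return True if a route carrying this AS path may be imported.

    A router rejects any route whose AS path already contains its own
    AS number, because that route has looped back to it.
    """
    return local_as not in as_path


def advertise(local_as, as_path):
    """Prepend our own AS number before passing the route to an external peer."""
    return [local_as] + list(as_path)


# AS 64500 learns a route that has passed through AS 64501 and AS 64502.
learned = [64501, 64502]
sent_on = advertise(64500, learned)
```

If the path in `sent_on` ever comes back to AS 64500, `accepts_route(64500, sent_on)` returns False and the route is dropped.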
Why do you need to understand BGP?
When
BGP is configured incorrectly, it can cause massive availability and security
problems, as Google discovered in 2008 when its YouTube service became
unreachable to large portions of the Internet. What happened was that, in an
effort to ban YouTube in its home country, Pakistan Telecom used BGP to route
YouTube's address block into a black hole. But, in what is believed to have
been an accident, this routing information somehow got transmitted to Pakistan
Telecom's Hong Kong ISP and from there got propagated to the rest of the world.
The end result was that most of YouTube's traffic ended up in a black hole in
Pakistan.
More sinisterly, 2013 saw a number of BGP hijack attacks, where modified BGP
route information allowed unknown attackers to redirect large blocks of
traffic so that it travelled via routers in Belarus or Iceland before it was
transmitted on to its intended destination.
Clearly,
BGP is significant. Here we'll provide a short overview of how BGP works, along
with the problems it solves and causes.
Autonomous systems
First
a little terminology. In the world of BGP, each routing domain is known as an
autonomous system, or AS. What BGP does is help choose a path through the
Internet, usually by selecting a route that traverses the least number of
autonomous systems: the shortest AS path.
You might need BGP, for example, if your corporate network is connected to two
large ISPs. To use BGP you would need an AS number, which you can get from the
American Registry for Internet Numbers (ARIN).
Once BGP is enabled, your router will pull a list of Internet routes from your
BGP neighbors, who in this case will be your two ISPs. It will then scrutinize
them to find the routes with the shortest AS paths. These will be put into the
router's routing table. (If you only connect to a single ISP then you don't
need BGP. That's because there's only one path to the Internet, so there's no
need for a routing protocol to select the best path.)
Generally,
but not always, routers will choose the shortest path to an AS. BGP only knows
about these paths based on updates it receives.
Route updates
Unlike Routing Information Protocol (RIP), a distance-vector routing protocol
which employs the hop count as a routing metric, BGP does not periodically
broadcast its entire routing table. When a BGP session is first established,
your peer will hand over its entire table. After that, everything relies on
the incremental updates received.
Route
updates are stored in a Routing Information Base (RIB). A routing table will
only store one route per destination, but the RIB usually contains multiple
paths to a destination. It is up to the router to decide which routes will make
it into the routing table, and therefore which paths will actually be used. In
the event that a route is withdrawn, another route to the same place can be
taken from the RIB.
The RIB is only used to keep track of routes that could possibly be used. If a
route withdrawal is received and it only existed in the RIB, it is silently
deleted from the RIB. No update is sent to peers. RIB entries never time out.
They continue to exist until the route is withdrawn or the BGP session with
the peer that advertised it goes down.
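The relationship between the RIB and the routing table can be sketched as follows. This is a simplified model under stated assumptions: the `Rib` class and its method names are hypothetical, the only selection rule is shortest AS path (ties broken by peer name), and the prefix and AS numbers are documentation values.

```python
from collections import defaultdict


class Rib:
    """Keeps every path learned per prefix; best() picks the one to install."""

    def __init__(self):
        # prefix -> {peer name: AS path}
        self.paths = defaultdict(dict)

    def update(self, peer, prefix, as_path):
        """Store (or replace) the path that `peer` advertised for `prefix`."""
        self.paths[prefix][peer] = list(as_path)

    def withdraw(self, peer, prefix):
        """Remove the path from `peer`; other peers' paths stay available."""
        self.paths[prefix].pop(peer, None)
        if not self.paths[prefix]:
            del self.paths[prefix]

    def best(self, prefix):
        """Return (peer, as_path) with the shortest AS path, or None."""
        candidates = self.paths.get(prefix)
        if not candidates:
            return None
        peer = min(candidates, key=lambda p: (len(candidates[p]), p))
        return peer, candidates[peer]


rib = Rib()
rib.update("isp_a", "203.0.113.0/24", [64501, 64510])
rib.update("isp_b", "203.0.113.0/24", [64502, 64505, 64510])
first_choice = rib.best("203.0.113.0/24")   # path via isp_a (shorter)
rib.withdraw("isp_a", "203.0.113.0/24")
fallback = rib.best("203.0.113.0/24")       # path via isp_b takes over
```

When the path from `isp_a` is withdrawn, the longer path from `isp_b` is already in the RIB and can be installed immediately.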
BGP path attributes
In
many cases, there will be multiple routes to the same destination. BGP
therefore uses path attributes to decide how to route traffic to specific
networks.
The easiest of these to understand is Shortest AS_Path. What this means is
that the path which traverses the fewest autonomous systems "wins."
Another
important attribute is Multi_Exit_Disc (Multi-exit discriminator, or MED). This
makes it possible to tell a remote AS that if there are multiple exit points on
to your network, a specific exit point is preferred.
The
Origin attribute specifies the origin of a routing update. If BGP has multiple
routes, then origin is one of the factors in determining the preferred route.
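As a rough illustration of how these attributes combine, the Python sketch below compares candidate paths by AS path length, then origin, then MED. It is deliberately simplified: real BGP implementations apply more steps than these three, and MED is normally compared only between paths received from the same neighboring AS, which this sketch ignores. The `Path` class and all values are hypothetical.

```python
from dataclasses import dataclass

# Lower rank is preferred: IGP over EGP over incomplete.
ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}


@dataclass(frozen=True)
class Path:
    peer: str
    as_path: tuple
    origin: str = "igp"
    med: int = 0


def preference_key(path):
    """Sort key: shorter AS path, then better origin, then lower MED."""
    return (len(path.as_path), ORIGIN_RANK[path.origin], path.med)


def best_path(paths):
    """Return the most preferred path among the candidates."""
    return min(paths, key=preference_key)


a = Path("isp_a", (64501, 64510), origin="incomplete")
b = Path("isp_b", (64502, 64510), origin="igp", med=50)
c = Path("isp_c", (64503, 64504, 64510))
d = Path("isp_d", (64502, 64510), origin="igp", med=10)
```

Here `best_path([a, b, c])` returns `b`: it ties with `a` on AS path length but has the better origin. Between `b` and `d`, the lower MED of `d` decides.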
BGP issues
To
get a true sense of how BGP works, it's important to spend some time talking
about the issues that plague the Internet.
First,
we have a very big problem with routing table growth. If someone decides to
deaggregate a network that used to be a single /16 network, they could
potentially start advertising hundreds of new routes. Every router on the
Internet will get every new route when this happens. People are constantly
pressured to aggregate, or combine multiple routes into a single advertisement.
Aggregation isn't always possible, especially if you want to break up a /19
into two geographically separate /20s. Routing tables are approaching 200,000
routes now, and for a time they were appearing to grow exponentially.
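The arithmetic behind deaggregation and aggregation can be checked with Python's standard `ipaddress` module. The address block used here (198.18.0.0/16) is only an example value chosen for illustration.

```python
import ipaddress

# One /16 advertised as individual /24s becomes 256 separate routes.
block = ipaddress.ip_network("198.18.0.0/16")
deaggregated = list(block.subnets(new_prefix=24))

# Aggregation is the reverse: contiguous prefixes collapse into one route.
aggregated = list(ipaddress.collapse_addresses(deaggregated))

# A /19 split into two /20s, e.g. for two separate sites.
halves = list(ipaddress.ip_network("198.18.0.0/19").subnets(new_prefix=20))
```

`len(deaggregated)` is 256, `aggregated` contains the single original /16, and `halves` holds the two /20 prefixes that would have to be advertised separately.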
Second,
there is always a concern that someone will "advertise the Internet."
If some large ISP's customer suddenly decides to advertise everything, and the
ISP accepts the routes, all of the Internet's traffic will be sent to the small
customer's AS. There's a simple solution to this. It's called route filtering.
It's quite simple to set up filters so that your routers won't accept routes
from customers that you aren't expecting, but many large ISPs will still accept
the equivalent of "default" from peers that have no likelihood of
being able to provide transit.
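The idea of route filtering can be sketched as a per-customer allow list. This is an illustrative model only: `build_filter` is a hypothetical helper, the maximum prefix length of 24 is an assumed policy, and the prefixes are documentation addresses.

```python
import ipaddress


def build_filter(expected_prefixes, max_length=24):
    """Return a function that accepts only routes inside the expected blocks."""
    allowed = [ipaddress.ip_network(p) for p in expected_prefixes]

    def accept(prefix):
        net = ipaddress.ip_network(prefix)
        if net.prefixlen > max_length:
            return False  # too specific to accept under the assumed policy
        # Compare only networks of the same IP version.
        return any(
            net.version == a.version and net.subnet_of(a) for a in allowed
        )

    return accept


# The customer is expected to announce exactly one block.
accept = build_filter(["203.0.113.0/24"])
```

With this filter, `accept("203.0.113.0/24")` is True, while a customer that tries to "advertise the Internet" with `0.0.0.0/0` is rejected.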
Finally,
we come to flapping. BGP has a mechanism to "hold down" routes that
appear to be flaky. Routes that flap, or come and go, usually aren't reliable
enough to send traffic to. If routes flap frequently, the load on all Internet
routers will increase due to the processing of updates every time someone
disappears and reappears. Dampening will prevent BGP peers from listening to
all routing updates from flapping peers. The amount of time one is in hold-down
increases exponentially with every flap. It's annoying when you have a faulty
link, since it can be more than an hour before you can get to many Internet
sites, but it is very necessary.
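The following Python sketch models one common form of dampening, in which each flap adds a fixed penalty and the penalty decays with a fixed half-life. The numeric values are assumptions chosen for illustration; they are not taken from this article or from any particular router's defaults.

```python
import math

PENALTY_PER_FLAP = 1000   # assumed penalty added on every flap
SUPPRESS_LIMIT = 2000     # assumed: route is suppressed above this penalty
REUSE_LIMIT = 750         # assumed: route is usable again below this penalty
HALF_LIFE_MIN = 15.0      # assumed: penalty halves every 15 minutes


def decayed(penalty, minutes):
    """Penalty remaining after `minutes` of exponential decay."""
    return penalty * 0.5 ** (minutes / HALF_LIFE_MIN)


def penalty_after(flap_minutes, now):
    """Penalty at time `now`, given the times (in minutes) of each flap."""
    penalty, last = 0.0, None
    for t in sorted(flap_minutes):
        if last is not None:
            penalty = decayed(penalty, t - last)
        penalty += PENALTY_PER_FLAP
        last = t
    return 0.0 if last is None else decayed(penalty, now - last)


def minutes_until_reuse(penalty):
    """How long a route with this penalty stays suppressed."""
    if penalty <= REUSE_LIMIT:
        return 0.0
    return HALF_LIFE_MIN * math.log2(penalty / REUSE_LIMIT)
```

Under these assumed values, three flaps in quick succession push the penalty to 3000, above the suppress limit, and the route stays suppressed for 30 minutes if it does not flap again.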
This
quick discussion of BGP should be enough to get you thinking the right way
about the protocol but is by no means comprehensive. Spend some time reading
the RFCs if you're tasked with operating a BGP router. Your peers will
appreciate it.
Border Gateway Protocol (BGP) is the routing protocol that literally makes the Internet work, yet its complexity makes it essential to know how to troubleshoot problems quickly.
Service providers working with IP networks are
very clear that the Border Gateway Protocol (BGP) is the most complex and
difficult to configure Internet protocol. Its emphasis on security and
scalability makes it essential, however. This guide offers you a detailed look
at how and why BGP-enabled routers in core networks securely exchange
information about several hundred thousand IP prefixes, as well as simple and
advanced approaches for troubleshooting connectivity problems.
Introduction to Border Gateway Protocol (BGP)
If you have to explain to someone new to the
service provider environment what Border Gateway Protocol (BGP) is, the best
definition would be that it's the routing protocol that makes the Internet
work. As the address allocation in the Internet is nowhere nearly as
hierarchical as the telephone dialing plan, most of the routers in the service
provider core networks have to exchange information about several hundred
thousand IP prefixes. BGP is still able to accomplish that task, which is a
good proof that it's a highly scalable routing protocol.
BGP defined
BGP (Border Gateway Protocol) is a protocol
for exchanging routing information between gateway hosts
(each with its own router) in
a network of autonomous
systems. BGP is often the protocol used between gateway hosts on the
Internet. The routing table contains a list of known routers, the addresses
they can reach, and a cost metric associated
with the path to each router so that the best available route is chosen.
Hosts using BGP communicate using the Transmission Control Protocol (TCP) and send updated router table information only when one host has detected a change. Only the affected part of the routing table is sent. BGP-4, the latest version, lets administrators configure cost metrics based on policy statements. (BGP-4 is sometimes called BGP4, without the hyphen.)
Within an autonomous (local) network, BGP routers exchange routes with one another using Internal BGP (IBGP), because interior gateway protocols (IGPs) are not designed to carry Internet-scale routing information. The routers inside the autonomous network thus maintain two routing tables: one for the interior gateway protocol and one for IBGP.
BGP-4 supports Classless Inter-Domain Routing (CIDR), which allocates and advertises address blocks by variable-length prefix rather than by fixed address class, making more efficient use of the IP address space.
BGP is a more recent protocol than the Exterior Gateway Protocol (EGP). Also see the Interior Gateway Protocol (IGP) and the Open Shortest Path First (OSPF) interior gateway protocol.
The Border Gateway Protocol routing information
is usually exchanged between competing business entities -- Internet Service
Providers (ISPs) -- in an open, hostile environment (public Internet). BGP is
thus very security-focused (for example, all adjacent routers have to be
configured manually), and decent BGP implementations provide a rich set of
route filters to allow the ISPs to defend their networks and control what they
advertise to their competitors.
In BGP terminology, an independent routing domain
(which almost always means an ISP) is called an autonomous system.
BGP is always used as the routing protocol of
choice between ISPs (external BGP) but also as the core routing protocol within
large ISP networks (internal BGP).
All other routing protocols are concerned solely
with finding the optimal path toward all known destinations. BGP cannot take
this simplistic approach because the peering agreements between ISPs almost
always result in complex routing policies. To help network operators implement
these policies, BGP carries a large number of attributes with each IP prefix,
for example:
- AS path -- the complete path documenting which autonomous systems a packet would have to travel through to reach the destination.
- Local preference -- the "internal cost" of a destination, used to ensure AS-wide consistency.
- Multi-exit discriminator -- this attribute gives adjacent ISPs the ability to prefer one peering point over another.
- Communities -- a set of generic tags that can be used to signal various administrative policies between BGP routers.
As the focus of BGP design and implementation was
always on security and scalability, it's harder to configure than other routing
protocols, more complex (more so when you start configuring various routing
policies), and one of the slowest converging routing protocols.
The slow BGP convergence dictates a two-protocol
design of an ISP network:
- An internal routing protocol (most often, OSPF or IS-IS) is used to achieve fast convergence for internal routes (including IP addresses of BGP routers).
- BGP is used to exchange Internet routes.
A failure within the core network would thus be
quickly bypassed thanks to fast convergence of OSPF or IS-IS, whereas BGP on
top of an internal routing protocol would meet the scalability, security and
policy requirements. Even more, if you migrate all your customer routes into
BGP, the customer problems (for example, link flaps between your router and
customer's router) will not affect the stability of your core network.
Because of inherent BGP complexity, customers and
small ISPs would deploy BGP only where needed, for example on peering points
and a minimal subset of core routers (the ones between the peering points), as
shown in the following diagram.
The BGP-speaking routers would also have to
generate a default route into the internal routing protocol to attract the
traffic for Internet destinations not known to other routers in your network.
As your ISP business grows, however, your
customers will start requiring BGP connectivity (any customer who wants to
achieve truly redundant Internet access has to have its own AS and exchange BGP
information with its ISPs), and you'll be forced to deploy BGP on more and more
core and edge routers (see the following picture). It's therefore best that you
include BGP on all core and major edge routers as part of your initial network
design. Even though you might not deploy it everywhere with the initial network
deployment, having a good blueprint will definitely help you when you have to
scale the BGP-speaking part of your network.
BGP requires a full mesh of internal BGP sessions
(sessions between routers in the same autonomous system). You could use BGP
route reflectors or BGP confederations to make your network scalable.
There is also another excellent reason why you'd
want to deploy BGP throughout your network: novel network services, for example
MPLS-based virtual private networks (VPNs), large-scale quality-of-service
deployments, or large-scale differentiated Web caching implementations, rely on
BGP to transport the information they need.
BGP troubleshooting: Simple approach
Border Gateway Protocol (BGP) is without doubt
the most complex IP routing protocol currently deployed in the Internet. Its
complexity is primarily due to its focus on security and routing policies – BGP
is used to exchange cooperative information (Internet routes) between otherwise
competing entities (service providers) and has to be able to implement whatever
has been agreed upon in the inter-provider peering agreements. (These
agreements often have little to do with technically optimum solutions.)
However, a structured approach to BGP
troubleshooting, as illustrated in this and the next section, can quickly lead
you from initial problem diagnosis to the solution. Here we focus on a simple
scenario with a single BGP-speaking router in your network (see the following
diagram). Similar designs are commonly used by multi-homed customers and small
Internet service providers (ISPs) that do not offer BGP connectivity to their
customers.
Is it a BGP problem?
Before jumping into BGP troubleshooting, you have
to identify the source of the connectivity problem you're debugging (usually
you suspect that BGP might be involved if one of your customers reports limited
or no Internet connectivity beyond your network). Perform a traceroute from a
workstation on the problematic LAN; if the trace reaches the first BGP-speaking
router (or, even better, gets beyond the edge of your network), you're
probably dealing with a BGP issue. Otherwise, check whether the BGP-speaking
router advertises a default route into your network (without a default route,
other routers in your network cannot reach the Internet destinations).
If you don't have access to a LAN-attached workstation,
you can perform the traceroute from the customer premises router, but you have
to ensure that the source IP address used in the traceroute packets is the
router's LAN address.
Troubleshooting BGP adjacencies
BGP has to establish a TCP session between adjacent
BGP routers before they can exchange routes. The first check is thus the status
of the BGP sessions between the routers.
The BGP neighbors are configured manually, and
the two most probable configuration errors are:
- Neighbor IP address mismatch: The destination IP address configured on one BGP neighbor has to match the source IP address (or the IP address of the directly connected interface) configured on the other.
- AS number mismatch: The neighbor AS number configured on one side of the BGP session has to match the actual BGP AS number used by the neighbor.
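Both checks amount to comparing what each side has configured against what the other side actually uses. The Python sketch below makes that comparison explicit; the `Side` record and its field names are hypothetical and do not correspond to any vendor's configuration format.

```python
from dataclasses import dataclass


@dataclass
class Side:
    local_as: int      # AS number this router actually uses
    local_ip: str      # source IP address this router uses for the session
    neighbor_ip: str   # neighbor address configured on this router
    neighbor_as: int   # neighbor AS number configured on this router


def session_problems(a, b):
    """List the configuration mismatches between two would-be BGP neighbors."""
    problems = []
    if a.neighbor_ip != b.local_ip or b.neighbor_ip != a.local_ip:
        problems.append("neighbor IP address mismatch")
    if a.neighbor_as != b.local_as or b.neighbor_as != a.local_as:
        problems.append("AS number mismatch")
    return problems


customer = Side(64500, "192.0.2.2", "192.0.2.1", 64501)
isp_ok = Side(64501, "192.0.2.1", "192.0.2.2", 64500)
isp_typo = Side(64501, "192.0.2.1", "192.0.2.2", 64505)
```

`session_problems(customer, isp_ok)` returns an empty list, while the mistyped AS number in `isp_typo` is reported as an AS number mismatch.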
You could also have a problem with packet filters
deployed on the BGP-speaking router. These filters have to allow packets to and
from TCP port 179.
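A quick way to see whether TCP port 179 is reachable at all is to attempt a plain TCP connection from a host on the path. The sketch below uses only Python's standard `socket` module; the function name and the timeout are assumptions for illustration.

```python
import socket

BGP_PORT = 179  # the TCP port BGP sessions use


def bgp_port_reachable(neighbor_ip, timeout=3.0):
    """Return True if a TCP connection to port 179 on the neighbor succeeds."""
    try:
        with socket.create_connection((neighbor_ip, BGP_PORT), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Treat the result as a hint rather than proof: a router may refuse connections from addresses that are not configured as its neighbors, so a False result from a test host does not necessarily mean a packet filter is at fault.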
Troubleshooting route propagation
If your users want to receive traffic from the
Internet, the IP prefix assigned to your network must be visible throughout the
Internet. To get there, three steps are needed:
- Your BGP router must insert your IP prefix into its BGP table.
- The IP prefix must be advertised to its BGP neighbors.
- The IP prefix must be propagated throughout the Internet.
Is the route inserted into BGP?
Most routing protocols automatically insert
directly connected IP subnets into their routing tables (or databases). Owing
to security requirements, BGP is an exception; it will originate an IP prefix
only if it's manually configured to do so (for example, Cisco routers use the
network statement to configure advertised IP prefixes). Another option is route
redistribution, which is highly discouraged in the Internet environment.
Furthermore, to avoid attracting unroutable
traffic, BGP will announce a configured IP prefix only if there's a matching
route in the IP routing table. You could generate the matching IP route through
route summarization, but it's usually best to configure a static route pointing
to a null interface (or its equivalent).
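The two conditions for originating a prefix (it is configured, and a matching route exists in the IP routing table) can be expressed as a simple filter. This sketch assumes an exact match between the configured prefix and a routing table entry; the function name and the prefixes are hypothetical.

```python
def prefixes_to_originate(configured_prefixes, routing_table):
    """Configured prefixes that also have an exactly matching route."""
    table = set(routing_table)
    return [p for p in configured_prefixes if p in table]


configured = ["203.0.113.0/24", "198.51.100.0/24"]
routing_table = ["203.0.113.0/24", "10.0.0.0/8"]
announced = prefixes_to_originate(configured, routing_table)

# Adding a static route (for example, to a null interface) for the missing
# prefix makes it eligible for announcement as well.
with_static = prefixes_to_originate(configured, routing_table + ["198.51.100.0/24"])
```

In this example only `203.0.113.0/24` is announced at first; the second prefix appears once the static route is in place.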
To check whether your IP prefix is in your BGP
routing table, use a BGP show command (for example, show ip bgp prefix mask
on a Cisco router).
Is the route advertised to your neighbors?
By default, all IP prefixes residing in the BGP
table are announced to all BGP neighbors. Owing to security and routing policy
requirements, the default behavior is usually modified with a set of output and
input filters. If you have applied output filters toward your BGP neighbors,
you have to check whether these filters allow your IP prefix to be propagated
to the external BGP neighbors. The command to display routes advertised to a
BGP neighbor on a Cisco router is show ip bgp neighbors ip-address advertised-routes.
Is the route visible throughout the Internet?
Even if you've successfully announced your IP
prefix to your BGP neighbors, it might still not be propagated throughout the
Internet. It's hard to figure out exactly what's propagated beyond the
boundaries of your network; the tools that can help you are called BGP looking glasses. Using these
tools, you can inspect BGP tables at various points throughout the Internet and
check whether your IP prefix has made it to those destinations.
There are a few factors that could cause your IP
prefix to be blocked somewhere in the Internet. The most common one is BGP
route flap dampening: If an IP prefix flaps (disappears and reappears) too
often in a short period of time -- for example, you clear your BGP sessions or
change your BGP configuration -- the prefix gets blocked for an extended period
of time (by default, up to an hour). If your IP prefix is dampened, there's
nothing you can do except wait it out. You could also have an invalid (or
missing) entry in IP routing registries, or there may be inbound filters at one
of the upstream ISPs. In all these cases, it's best if your upstream ISP can
help you resolve the problem (which is, at this point, beyond the scope of
technical BGP troubleshooting).
BGP troubleshooting: Advanced approach
In the previous section of this e-guide we
addressed some basic BGP troubleshooting skills:
- How to identify whether a routing problem is a BGP problem,
- How to troubleshoot BGP sessions,
- How to troubleshoot IP route origination and propagation.
Now let's focus on a more advanced scenario:
transit Internet service provider (ISP) networks (see the next diagram).
NOTE: Before reading this section, make
sure you've read sections one and two to become familiar with basic Border Gateway Protocol
technology as well as simple BGP troubleshooting.
To establish end-to-end connectivity across a
service provider network, the ISP has to receive customers' IP prefixes via BGP
and announce them to other ISPs. The same process has to happen in reverse
direction (or at least the default route has to be announced to the customer).
The network-wide BGP troubleshooting is thus composed of three steps:
- Have we received the prefix?
- Is the prefix propagated across our network?
- Is the prefix sent to external BGP neighbors at the other edge of the network?
Have we received the prefix?
Troubleshooting inbound BGP problems is the
toughest part of BGP troubleshooting you'll encounter. There are two potential
reasons that an IP prefix is not in your BGP table as you would expect it to
be:
- The neighbor is not sending the prefix.
- Your inbound filters are blocking the prefix.
The only tool that can help you identify the
problem is the debugging facility on your edge router (as you normally don't
have access to the other BGP neighbor). When doing BGP debugging, be aware that
a BGP neighbor can send you several hundred thousand routes, so you have to
ensure that the debugging output produced by the troubleshooting session does
not overwhelm the router. Furthermore, the BGP prefixes are sent only when they
change, not on a periodic basis (like RIP updates or OSPF LSA floods). Your
debugging tool will thus not show you an IP prefix until it has actually
changed (or you've cleared the BGP session with your neighbor).
Some BGP routers have the ability to store a
separate copy of all routes sent by a neighbor into a parallel BGP table. (To
enable this functionality on Cisco IOS, you have to configure soft-reconfiguration
inbound for a BGP neighbor.) With the parallel per-neighbor table, you can
exactly pinpoint what the neighbor has sent you (the content of the parallel
table) and what routes have passed your input filters (the contents of the main
BGP table), but of course the parallel per-neighbor table consumes a large
amount of memory.
Is the prefix propagated across our network?
Even when an edge router receives an IP prefix
via BGP, it may not be propagated to the other end of your network. To start
with, internal BGP (BGP within a single autonomous system) requires a full mesh
of BGP sessions among all BGP routers. As every router between every pair of
edge routers has to run BGP (otherwise the traffic could be dropped inside your
network), the number of BGP sessions could become excessively large. (The next
diagram illustrates the BGP sessions needed in a small four-router network.)
There are two tools (BGP route reflectors and BGP
confederations) that can help you keep the number of BGP sessions to a sensible
level, with BGP route reflectors being the most commonly used.
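To see why the full mesh becomes a problem, it helps to count the sessions. The sketch below compares a full mesh with a design in which every router peers only with one route reflector; the single, non-redundant reflector is an assumption made to keep the arithmetic simple.

```python
def full_mesh_sessions(routers):
    """Internal BGP sessions when every router peers with every other one."""
    return routers * (routers - 1) // 2


def single_reflector_sessions(routers):
    """Sessions when all other routers peer only with one route reflector."""
    return max(routers - 1, 0)


for n in (4, 10, 100):
    print(n, full_mesh_sessions(n), single_reflector_sessions(n))
```

A four-router network needs 6 sessions in a full mesh; with 100 routers the full mesh needs 4,950 sessions, against 99 with a single reflector.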
The BGP route reflector rules are quite simple:
- Whatever is received from a route-reflector client or an external BGP peer will be sent to every other BGP peer.
- Whatever is received from a router that is not a route-reflector client will be sent only to clients and external BGP peers.
With these rules in hand, you have to step
through the graph of BGP sessions in your network, checking every BGP router on
the way and ensuring that the route reflector rules are not violated (and that,
using the rules, the BGP prefixes get from every edge router to all other
routers).
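The two route reflector rules can be written as a small function, which may help when stepping through the session graph by hand. This is a sketch of the rules as stated above, not a full implementation; the peer names and the three peer kinds ("client", "non_client", "ebgp") are labels chosen for illustration.

```python
def reflect_targets(peers, source):
    """Peers a route reflector sends a route to, given who it came from.

    `peers` maps peer name -> kind: "client", "non_client" or "ebgp".
    """
    others = {name: kind for name, kind in peers.items() if name != source}
    if peers[source] in ("client", "ebgp"):
        # Rule 1: from a client or an external peer -> every other peer.
        return sorted(others)
    # Rule 2: from a non-client -> clients and external peers only.
    return sorted(n for n, k in others.items() if k in ("client", "ebgp"))


peers = {
    "c1": "client",
    "c2": "client",
    "core1": "non_client",
    "ext1": "ebgp",
}
```

For example, a route learned from `core1` is sent to `c1`, `c2` and `ext1`, while a route learned from `c1` is also passed on to `core1`.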
There is another common reason an IP prefix is
not propagated across your network: The external subnets on the edge of your
network are not advertised to your core routers.
The IP address of the next-hop router is not
changed when an IP prefix is sent to an internal BGP neighbor. The IP next-hop
of an external route is thus always the IP address of a router one hop
beyond the edge of your autonomous system. The IP subnets connecting your
edge routers to their external neighbors thus have to be inserted into your
internal routing protocol (for example, OSPF or IS-IS), otherwise some internal
BGP router will decide that the BGP next-hop is not reachable and ignore the IP
prefix. (It will appear in the BGP table but will not be used or propagated to
other BGP peers.)
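The next-hop check described above boils down to asking whether the next-hop address falls inside any prefix known to the internal routing protocol. A minimal Python sketch, using the standard `ipaddress` module and hypothetical addresses:

```python
import ipaddress


def next_hop_reachable(next_hop, igp_prefixes):
    """True if the BGP next-hop lies inside a prefix carried by the IGP."""
    addr = ipaddress.ip_address(next_hop)
    return any(addr in ipaddress.ip_network(p) for p in igp_prefixes)


# The IGP initially knows only the internal core subnet.
igp_routes = ["10.0.0.0/24"]
before = next_hop_reachable("192.0.2.1", igp_routes)

# After the external link subnet is inserted into the IGP:
igp_routes.append("192.0.2.0/30")
after = next_hop_reachable("192.0.2.1", igp_routes)
```

`before` is False, so the prefix would sit unused in the BGP table; `after` is True once the subnet connecting the edge router to its external neighbor is known internally.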
Is the prefix sent to external neighbors?
As the last step in troubleshooting BGP route
propagation, you have to check whether the IP prefixes transported across your
network are announced to your external BGP peers. The techniques for
troubleshooting outbound BGP route propagation are explained in the Border
Gateway Protocol (BGP) troubleshooting: Simple approach article.
Is the traffic traversing the network?
Even if your BGP route propagation works
flawlessly, the IP packets may not be able to traverse your network. (Remember,
we're talking about pure IP networks here; things change a bit if you add MPLS
to the mix.) The most common cause of a "black hole" in your network
is a router in the transit path that does not run BGP and consequently has no
idea how to route the received IP packet toward the destination network.
IP routing works hop by hop. Even though the
ingress edge router knows exactly which egress edge router to use and how to
get there, it cannot pass that information to the intermediate routers. All of
them must therefore run BGP as well.
To identify a black hole in your network, perform
a traceroute from your customer's network to a destination in the Internet. The
last router responding to the traceroute is one hop before the black hole.
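If you have already collected the traceroute output, finding the last responding router is easy to script. The sketch below assumes the hops have been parsed into a list in path order, with None standing for a hop that did not answer; the addresses are hypothetical.

```python
def last_responding_hop(hops):
    """Return the address of the last hop that answered, or None."""
    last = None
    for hop in hops:
        if hop is not None:
            last = hop
    return last


# Hops in path order; None marks a hop that timed out.
trace = ["10.0.0.1", "10.0.1.1", "10.0.2.1", None, None, None]
suspect_upstream_of = last_responding_hop(trace)
```

In this example `10.0.2.1` is the last router to respond, so the router one hop beyond it is the first place to look.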
Even though all core routers in your network have
to run BGP, the internal BGP sessions don't have to follow the physical
structure of the network. For example, you could have a few central routers
acting as BGP route reflectors for all BGP routers in your network.