Use of MP-BGP Extensions for IPv6 Interdomain Routing

Multiprotocol BGP4 (MP-BGP), specified in RFC 2858, defines extensions enabling BGP4 to carry routing information for multiple network layer protocols. Specific network layer protocol bits are specified in separate RFCs: for IPv6, it is RFC 2545. Only three pieces of information carried by BGP4 are IPv4 specific:

  • The next-hop attribute contains the IPv4 address of the next-hop router.
  • The aggregator attribute contains the ASN and the IPv4 address of the router performing the aggregation.
  • The Network Layer Reachability Information (NLRI) is a set of IPv4 prefixes, used for path advertisement and withdrawal.

RFC 2858 assumes that any BGP speaker has an IPv4 address, which can be used in the aggregator attribute. Therefore, to enable BGP4 to support routing for multiple network layer protocols, only the next hop and the NLRI had to be made generic: they were inserted (as TLVs) into a new MP_REACH_NLRI attribute, and the NLRI was also inserted into a new MP_UNREACH_NLRI attribute. The former attribute is used to announce feasible routes, and the latter to withdraw unfeasible ones. Each of these attributes starts with an Address Family Identifier (AFI) and a Subsequent Address Family Identifier (SAFI), which together identify the network layer protocol. Both the next hop and the NLRI are variable-length fields, specified for each AFI/SAFI.

For IPv6 unicast (AFI:2, SAFI:1), the next-hop field is composed of a next-hop length and one or two IPv6 addresses, as detailed in the "BGP Next Hop" section. The NLRI is one or several 2-tuples of the form <length, IPv6-prefix>. Note that IPv6 prefixes can also be found in other SAFIs, such as multicast (SAFI:2), label (SAFI:4), or VPN (SAFI:128). Although the format of the next hop might vary from one SAFI to another, as might that of the NLRI (for SAFI:4, it is a 3-tuple <length, label, prefix>), the two attributes introduced by MP-BGP still work in all these cases.
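
The layout described above can be made concrete with a short Python sketch that builds the value of an MP_REACH_NLRI attribute for IPv6 unicast. It is only an illustration under stated assumptions: the function names are hypothetical, the attribute header (flags, type code, length) is left out, and the octet between the next hop and the NLRI is encoded as a zero SNPA count.

```python
import ipaddress
import struct

AFI_IPV6 = 2
SAFI_UNICAST = 1


def encode_prefix(prefix: str) -> bytes:
    """Encode one NLRI entry as <length in bits, minimal number of prefix octets>."""
    net = ipaddress.IPv6Network(prefix)
    n_octets = (net.prefixlen + 7) // 8
    return bytes([net.prefixlen]) + net.network_address.packed[:n_octets]


def encode_mp_reach_value(next_hops, prefixes) -> bytes:
    """Build the value of MP_REACH_NLRI for IPv6 unicast (hypothetical helper)."""
    if len(next_hops) not in (1, 2):
        raise ValueError("expected one or two IPv6 next-hop addresses")
    # One address gives a 16-octet next hop, two addresses give 32 octets.
    nh = b"".join(ipaddress.IPv6Address(a).packed for a in next_hops)
    header = struct.pack("!HBB", AFI_IPV6, SAFI_UNICAST, len(nh))
    snpa_count = b"\x00"  # assumption: no SNPAs are attached
    nlri = b"".join(encode_prefix(p) for p in prefixes)
    return header + nh + snpa_count + nlri


value = encode_mp_reach_value(["2001:100:1:1::1"], ["2001:100:3:1::/64"])
print(len(value), value.hex())
```

With a single global next hop and one /64 prefix, the value is 30 octets long: 4 for AFI, SAFI, and next-hop length, 16 for the next hop, 1 for the SNPA count, and 9 for the NLRI entry.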

MP-BGP extensions provide support for IPv6 through capability negotiation using the capability parameter of the OPEN message. During session establishment, the BGP peers negotiate capabilities as defined in RFC 2842. A BGP session could end up with many AFI/SAFI-negotiated capabilities, as shown in Example-20.

Example-20. BGP Neighbor Status
PE1#show bgp all neighbors
For address family: IPv4 Unicast
BGP neighbor is,  remote AS 100, internal link
  BGP version 4, remote router ID
  BGP state = Established, up for 00:07:04
  Last read 00:00:40, hold time is 180, keepalive interval is 60 seconds
  Neighbor capabilities:
    Route refresh: advertised and received(new)
    Address family IPv4 Unicast: advertised and received
    Address family IPv6 Unicast: advertised and received
    ipv6 MPLS Label capability: advertised and received
    Address family VPNv4 Unicast: advertised and received
    Address family VPNv6 Unicast: advertised and received
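
To illustrate what is exchanged during this negotiation, the following Python sketch encodes the multiprotocol capability for a list of AFI/SAFI pairs and computes which pairs both peers announced. The function names are hypothetical, and the sketch assumes the usual encoding: optional parameter type 2 for capabilities, capability code 1 for multiprotocol extensions, and a 4-octet value made of the AFI, a reserved octet, and the SAFI.

```python
import struct

OPT_PARAM_CAPABILITIES = 2  # OPEN optional parameter type carrying capabilities
CAP_MULTIPROTOCOL = 1       # capability code for multiprotocol extensions


def mp_capability(afi: int, safi: int) -> bytes:
    """Encode one multiprotocol capability: code, length, AFI, reserved, SAFI."""
    value = struct.pack("!HBB", afi, 0, safi)
    return struct.pack("!BB", CAP_MULTIPROTOCOL, len(value)) + value


def capabilities_param(pairs) -> bytes:
    """Wrap the capabilities for all AFI/SAFI pairs in one optional parameter."""
    caps = b"".join(mp_capability(afi, safi) for afi, safi in pairs)
    return struct.pack("!BB", OPT_PARAM_CAPABILITIES, len(caps)) + caps


def negotiated(local, remote):
    """Address families usable on the session: advertised AND received."""
    return sorted(set(local) & set(remote))


local = [(1, 1), (2, 1), (2, 4)]   # IPv4 unicast, IPv6 unicast, IPv6 label
remote = [(1, 1), (2, 1)]          # hypothetical peer without IPv6 label support
print(capabilities_param(local).hex())
print(negotiated(local, remote))
```

Only the pairs present on both sides correspond to the "advertised and received" lines in the neighbor status display.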

BGP itself works exactly the same way whether it is BGP4 or MP-BGP for a particular AFI/SAFI such as IPv6 unicast.

The interaction between BGP and the IGP running throughout the autonomous system, the scaling elements such as route reflectors, the distinction between interior and exterior peers, the route aggregation, the numerous BGP features, and so on are architectural elements that still apply to MP-BGP IPv6.

BGP Peering

In the most typical cases, BGP peering for announcing IPv6 routes will occur over an IPv6 transport, and possibly coexist with a separate BGP session for announcing IPv4 routes, as shown in Example-21.

Example-21. Using Distinct BGP Sessions for Address Families IPv4 and IPv6
router bgp 200
 neighbor remote-as 100
 neighbor 2001:100:1:1::1 remote-as 100
 address-family ipv4
 neighbor activate
 address-family ipv6
 neighbor 2001:100:1:1::1 activate

Note that although the BGP transport layer and the address families advertised over it deal with different network layer protocols, the ASNs are identical between IPv4 and IPv6. The deployment model implied by the preceding configuration example suggests that BGP operates independently for IPv4 and IPv6, with separate sessions over distinct transport layers. This option is attractive when the IPv4 and IPv6 topologies are different, because it provides the most flexibility. It is not mandated by BGP, however, and could bring additional configuration and operational complexity.

As mentioned earlier, MP-BGP runs over TCP. The protocol version (IPv4 or IPv6) used to establish the TCP session is independent of the address family being advertised. In fact, as shown in the example in the previous section (show bgp all neighbors), the same TCP (and BGP) session (over TCP IPv4 in the example) can transport multiple address families (for instance, IPv4 unicast and IPv6 unicast).

However, be aware of a couple of pitfalls when mixing transport and address families of different versions. The next hop advertised in the next-hop MP-BGP attribute defaults to the endpoint of the connection. Such a default cannot work when advertising an IPv4 address family over IPv6 or vice versa, so it must be overridden (for instance, by using a route-map command). Furthermore, BGP will try to synchronize the path (next hop) with the IGP: even if the IPv6 update message has been distributed over an IPv4 TCP session, the BGP next hop must be routable using the IGP running in the autonomous system. This is part of the verification performed by BGP before electing a path as "best" and re-advertising it to other peers. The "BGP Configuration Example" section shows two cases: IPv6 address family over IPv6 transport, and over IPv4 transport. With the limitations previously listed, the case of an IPv4 address family announced over an IPv6 transport works, too.
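
The first pitfall can be sketched in a few lines of Python. This is a simplified model, not router behavior: the function name is hypothetical, the IPv4 endpoint 192.0.2.1 is a documentation address chosen for illustration, and the override parameter stands in for a next hop set by a route map.

```python
import ipaddress


def choose_next_hop(session_endpoint: str, address_family: int, override=None) -> str:
    """Pick the next hop to advertise for an address family (4 or 6).

    The default is the local endpoint of the BGP session, which is only
    usable when its IP version matches the advertised address family.
    """
    if override is not None:
        if ipaddress.ip_address(override).version != address_family:
            raise ValueError("override does not match the address family")
        return override
    if ipaddress.ip_address(session_endpoint).version == address_family:
        return session_endpoint
    raise ValueError("default next hop has the wrong IP version; set one explicitly")


# IPv6 address family over an IPv6 session: the default works.
print(choose_next_hop("2001:100:3:4::1", 6))
# IPv6 address family over an IPv4 session: an explicit next hop is required.
print(choose_next_hop("192.0.2.1", 6, override="2001:100:3:1::1"))
```

The sketch covers only the version mismatch; whether the chosen next hop is actually routable through the IGP is a separate check.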

BGP Next Hop

As already mentioned, the BGP next hop is an IPv6 address of the eBGP peer sending the update. In the iBGP case, the next hop is left unchanged when the route is re-advertised.

Note: You can explicitly configure the iBGP peer to announce itself as the next hop (next-hop self). Another exception arises if the iBGP speaker did not receive a valid global next hop from its eBGP peer, which could happen when peering with it over link-locals. Finally, if the iBGP speaker is a 6PE, it must be on the labeled data path, to decode packets with labels that it owns: In that case, too, it will put itself as the next hop.

When the BGP IPv6 peers share a common subnet, the MP_REACH_NLRI attribute contains a link-local next-hop address in addition to the global next-hop address. The link-local next hop is then used locally, whereas the global next hop is the one that may be re-advertised by BGP. As a consequence, a BGP speaker that advertises a route to an internal peer may modify the Network Address of Next Hop field by removing the link-local IPv6 address of the next hop.
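
As a rough illustration of this behavior, the following Python sketch splits a next-hop field into its global and link-local parts and keeps only the global address when the route is passed on to an internal peer. The function names are hypothetical, and the sketch assumes the field is either 16 octets (global only) or 32 octets (global followed by link-local).

```python
import ipaddress


def split_next_hop(field: bytes):
    """Return (global, link_local) from a 16- or 32-octet next-hop field."""
    if len(field) == 16:
        return ipaddress.IPv6Address(field), None
    if len(field) == 32:
        return ipaddress.IPv6Address(field[:16]), ipaddress.IPv6Address(field[16:])
    raise ValueError("IPv6 next-hop field must be 16 or 32 octets long")


def next_hop_for_internal_peer(field: bytes) -> bytes:
    """Drop the link-local address, which has no meaning off the shared link."""
    global_nh, _link_local = split_next_hop(field)
    return global_nh.packed


received = (ipaddress.IPv6Address("2001:100:1:1::1").packed
            + ipaddress.IPv6Address("FE80::A8BB:CCFF:FE01:F600").packed)
global_nh, link_local = split_next_hop(received)
print(global_nh, link_local, len(next_hop_for_internal_peer(received)))
```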

Using a link-local next hop to compute the routing interface for reaching a particular prefix proves especially useful with eBGP, when combined with peering over the link-local address. In case the peer changes its global address for whatever reason, neither the peer connection nor the next hop will be directly affected, and no forwarding hole should be seen. Of course, BGP entries will be updated (the renumbered global next hop should trigger new updates to be sent), but the connection should not be reset.

Example-22. Using Link-Local Address for BGP Peering
router bgp 200
 neighbor FE80::A8BB:CCFF:FE01:F600%Ethernet0/0 remote-as 100
 address-family ipv6
 neighbor FE80::A8BB:CCFF:FE01:F600%Ethernet0/0 activate
 neighbor FE80::A8BB:CCFF:FE01:F600%Ethernet0/0 route-map SETNH out
 redistribute connected
 no synchronization
route-map SETNH permit 10
 set ipv6 next-hop 2001:100:1:1::2

Note that the link-local address is followed by the interface to which it belongs. This is because link-local addresses are not guaranteed to be unique across interfaces of the router. A next hop with a global address is set explicitly (using route-map) so that it can be propagated by PE1 to PE2 over iBGP.
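
The need for the interface qualifier can be shown with a small Python sketch that treats the pair (address, interface) as the neighbor's identity. The function name and the second interface, Ethernet0/1, are hypothetical and serve only as illustration.

```python
import ipaddress


def split_scoped_address(text: str):
    """Split 'address%interface' and check that the address is link-local."""
    address, sep, interface = text.partition("%")
    if not sep or not interface:
        raise ValueError("a link-local neighbor needs an interface qualifier")
    parsed = ipaddress.IPv6Address(address)
    if not parsed.is_link_local:
        raise ValueError("expected a link-local (FE80::/10) address")
    return parsed, interface


# The same link-local address reached through two interfaces is two neighbors.
neighbors = {}
for text in ("FE80::A8BB:CCFF:FE01:F600%Ethernet0/0",
             "FE80::A8BB:CCFF:FE01:F600%Ethernet0/1"):
    neighbors[split_scoped_address(text)] = text
print(len(neighbors))
```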

In the preceding example, the resulting entries in the routing table are as shown in Example-23.

Example-23. Displaying BGP Table Entries and Next Hop
CE1#show bgp ipv6
 BGP table version is 7, local router ID is
Status codes: s suppressed, d damped, h history, * valid, > best,
              i - internal, r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*  2001:100:1:1::/64
                    2001:100:1:1::1          0             0 100 ?
*>                  ::                    0             32768 ?
*> 2001:100:2:1::/64
                    2001:100:1:1::1                        0 100 ?
*> 2001:100:3:1::/64
                    2001:100:1:1::1          0             0 100 ?
*> 2001:100:3:2::/64
                    2001:100:1:1::1          0             0 100 ?
*> 2001:100:3:3::/64
                    2001:100:1:1::1                        0 100 ?
*> 2001:100:3:4::/64
                    2001:100:1:1::1                        0 100 ?

While only the global next hop (2001:100:1:1::1) is displayed in this summary, a closer look at one of these entries will show the following.

Example-24. Details of One Specific BGP Entry's Next Hops
CE1#show bgp ipv6 2001:100:1:1::/64
BGP routing table entry for 2001:100:1:1::/64, version 71
Paths: (2 available, best #2, table default)
  Advertised to update-groups:
    2001:100:1:1::1 (FE80::A8BB:CCFF:FE01:F600) from
FE80::A8BB:CCFF:FE01:F600%Ethernet0/0 (
      Origin incomplete, metric 0, localpref 100, valid, external
    :: from (
     Origin incomplete, metric 0, localpref 100, weight 32768, valid, sourced, best

The link-local next hop, FE80::A8BB:CCFF:FE01:F600, is shown in parentheses, after the global next hop, 2001:100:1:1::1.

Note: Even though the eBGP speaker is peering over a link-local address and advertising a link-local next hop, it must also provide a global next hop so that its eBGP peer (PE1) can propagate it to its iBGP peers. This global next hop must be set explicitly by a route map. In its absence, PE1 will announce one of its own global addresses and become the intermediate router for all traffic to CE1.

BGP Configuration Example

The following BGP configuration extract shows how to configure the IPv6 address family, for both eBGP and iBGP.

Example-25. BGP Configuration Example
router bgp 100
 bgp log-neighbor-changes
 neighbor 2001:100:3:4::1 remote-as 100 !for iBGP peering, over IPv6
 neighbor remote-as 200 !for eBGP peering, over IPv4
 address-family ipv6
 neighbor 2001:100:3:4::1 activate
 neighbor activate
 neighbor route-map SETNH out
 redistribute connected
route-map SETNH permit 10
 set ipv6 next-hop 2001:100:3:1::1

In this example, the IPv6 address family is configured and activated toward an iBGP peer (with an IPv6 TCP endpoint, 2001:100:3:4::1) as well as toward an eBGP peer (with an IPv4 TCP endpoint). For the latter, the route-map statement explicitly sets the next hop, which is expected to be reachable from the peer, either through a default route, through some IGP, or because the peers share a common subnet.
