BGP race condition problem

Hello everybody!
Today I am going to show a phenomenon called the BGP race condition.

The topology is quite simple: two directly connected routers form an eBGP neighbor relationship that is sourced from each router's loopback address (lo0).

OSPF is enabled on the physical links and on lo0 of both routers, and the routers are fully adjacent within area 0.

We configure an eBGP session between the routers:

R1#conf t
 R1(config)#router bgp 1
 R1(config-router)#no auto-summary
 R1(config-router)#no synchronization
 R1(config-router)#neighbor 2.2.2.2 remote-as 2
 R1(config-router)#neighbor 2.2.2.2 update-source lo0

R2#conf t
 R2(config)#router bgp 2
 R2(config-router)#no auto-summary
 R2(config-router)#no synchronization
 R2(config-router)#neighbor 1.1.1.1 remote-as 1
 R2(config-router)#neighbor 1.1.1.1 update-source lo0

As we will see, the BGP neighborship will not come up! Why?
By default, eBGP expects its neighbor to be on a directly connected network. Because the routers source the session from their loopbacks, the neighbor addresses 1.1.1.1 and 2.2.2.2 are not directly connected addresses.
So we need to configure the BGP option “disable-connected-check”. This option lets routers that are one hop apart build an eBGP adjacency even if they peer via their loopbacks.

R1(config-router)#neighbor 2.2.2.2 disable-connected-check

R2(config-router)#neighbor 1.1.1.1 disable-connected-check

And the BGP adjacency will form.

R1:
 *Mar 1 01:59:28.559: %BGP-5-ADJCHANGE: neighbor 2.2.2.2 Up

R2:
 *Mar 1 01:59:28.707: %BGP-5-ADJCHANGE: neighbor 1.1.1.1 Up

Now we advertise lo10 of each router into BGP to see whether it works.

R1(config-router)#network 10.10.10.10 mask 255.255.255.255

R2(config-router)#network 20.20.20.20 mask 255.255.255.255

Verifying on R2:

R2#sh ip bgp
 BGP table version is 167, local router ID is 2.2.2.2
 Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
 r RIB-failure, S Stale
 Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 10.10.10.10/32   1.1.1.1                  0             0 1 i
*> 20.20.20.20/32   0.0.0.0                  0         32768 i

Verifying on R1:

 R1#sh ip bgp
 BGP table version is 172, local router ID is 1.1.1.1
 Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
 r RIB-failure, S Stale
 Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 10.10.10.10/32   0.0.0.0                  0         32768 i
*> 20.20.20.20/32   2.2.2.2                  0             0 2 i

Looks good.
Now we come to the “race condition”. We will also advertise the lo0 of each router into BGP. Many people do this because they think “Hey, it's good to have lo0 in there as well”.

R1(config-router)#network 1.1.1.1 mask 255.255.255.255

R2(config-router)#network 2.2.2.2 mask 255.255.255.255

At first glance the BGP table looks good.

R2#sh ip bgp
 BGP table version is 175, local router ID is 2.2.2.2
 Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
 r RIB-failure, S Stale
 Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 1.1.1.1/32       1.1.1.1                  0             0 1 i
*> 2.2.2.2/32       0.0.0.0                  0         32768 i
*> 10.10.10.10/32   1.1.1.1                  0             0 1 i
*> 20.20.20.20/32   0.0.0.0                  0         32768 i

R1#sh ip bgp
 BGP table version is 176, local router ID is 1.1.1.1
 Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
 r RIB-failure, S Stale
 Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 1.1.1.1/32       0.0.0.0                  0         32768 i
*> 2.2.2.2/32       2.2.2.2                  0             0 2 i
*> 10.10.10.10/32   0.0.0.0                  0         32768 i
*> 20.20.20.20/32   2.2.2.2                  0             0 2 i

But what we see after some time is that the BGP adjacency is torn down because the hold time expires!

R1#
 *Mar 1 02:25:15.855: %BGP-5-ADJCHANGE: neighbor 2.2.2.2 Down BGP Notification sent
 R1#
 *Mar 1 02:25:15.855: %BGP-3-NOTIFICATION: sent to neighbor 2.2.2.2 4/0 (hold time expired) 0 bytes
R2#
 *Mar 1 02:25:16.359: %BGP-5-ADJCHANGE: neighbor 1.1.1.1 Down BGP Notification sent
 R2#
 *Mar 1 02:25:16.359: %BGP-3-NOTIFICATION: sent to neighbor 1.1.1.1 4/0 (hold time expired) 0 bytes

About 30 seconds after that log message the BGP session comes up again.

R2#
 *Mar 1 02:25:45.999: %BGP-5-ADJCHANGE: neighbor 1.1.1.1 Up
R1#
 *Mar 1 02:25:45.991: %BGP-5-ADJCHANGE: neighbor 2.2.2.2 Up

When you look at the routing table you can see the problem. Normally the loopback of the peering router should be known via the IGP (OSPF in this case).

R1#sh ip route 2.2.2.2
 Routing entry for 2.2.2.2/32
 Known via "ospf 1", distance 110, metric 11, type intra area
 Last update from 172.16.21.2 on FastEthernet0/0, 00:00:15 ago
 Routing Descriptor Blocks:
 * 172.16.21.2, from 172.16.21.2, 00:00:15 ago, via FastEthernet0/0
 Route metric is 11, traffic share count is 1

Once the BGP session is up, the route is known via BGP instead, because the AD of eBGP is 20, which is lower than OSPF's 110.

R1#sh ip route 2.2.2.2
 Routing entry for 2.2.2.2/32
 Known via "bgp 1", distance 20, metric 0
 Tag 2, type external
 Last update from 2.2.2.2 00:00:42 ago
 Routing Descriptor Blocks:
 * 2.2.2.2, from 2.2.2.2, 00:00:42 ago
 Route metric is 0, traffic share count is 1
 AS Hops 1
 Route tag 2
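
To make the AD comparison concrete, here is a minimal Python sketch. It is only a toy model, not anything IOS runs: the Route class and the best_route function are hypothetical names I made up, while the prefix, distances and next hops are taken from the two outputs above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str      # destination prefix
    source: str      # protocol that offered the route
    distance: int    # administrative distance (AD)
    next_hop: str    # next-hop address reported by "sh ip route"


def best_route(candidates):
    """Return the candidate with the lowest administrative distance."""
    return min(candidates, key=lambda route: route.distance)


# The two routes R1 holds for R2's loopback in this lab.
ospf = Route("2.2.2.2/32", "ospf", 110, "172.16.21.2")
ebgp = Route("2.2.2.2/32", "ebgp", 20, "2.2.2.2")

winner = best_route([ospf, ebgp])
print(winner.source, winner.next_hop)
```

As long as only OSPF offers the prefix, the OSPF route is installed. As soon as eBGP offers the same /32, it wins on distance and brings its own next hop with it.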

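And this is the problem. R1 now knows the loopback of R2 (2.2.2.2) via the loopback of R2, so the route points at itself. This cannot work. BGP sets its peer address as the next hop of the routes it advertises, so the eBGP route to 2.2.2.2/32 has 2.2.2.2 as its next hop.
This recursive route cannot be resolved to an outgoing interface, so the loopback addresses of the routers can no longer reach each other, the keepalives get lost and the BGP session times out. Once the session is down the BGP route disappears, the OSPF route takes over again, the session comes back up, and the cycle starts over.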
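
The failed recursion can be illustrated with a small Python sketch. It is a toy model under loud assumptions: the table only holds exact host entries (no longest-prefix match), and resolve and the dictionary layout are hypothetical names of mine, not something IOS exposes.

```python
def resolve(rib, destination):
    """Follow next hops until an entry with an outgoing interface is found.

    rib maps an address to (next_hop, interface). interface is None when
    the route only names a next hop, like the BGP route in this post.
    Returns the interface name, or None if the lookup cannot be resolved.
    """
    seen = set()
    addr = destination
    while addr not in seen:
        seen.add(addr)
        entry = rib.get(addr)
        if entry is None:
            return None            # no route at all
        next_hop, interface = entry
        if interface is not None:
            return interface       # resolved to an outgoing interface
        addr = next_hop            # look up the next hop in turn
    return None                    # the next hop led back to a visited address


# R1 while OSPF owns the route to R2's loopback.
rib_ospf = {"2.2.2.2": ("172.16.21.2", "FastEthernet0/0")}
# R1 after the eBGP route (AD 20) has replaced the OSPF route.
rib_bgp = {"2.2.2.2": ("2.2.2.2", None)}

print(resolve(rib_ospf, "2.2.2.2"))
print(resolve(rib_bgp, "2.2.2.2"))
```

The first lookup ends at FastEthernet0/0, the second one returns None because 2.2.2.2 can only be reached via 2.2.2.2.
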
Solution for that:
– Never advertise the update-source of a BGP session into BGP itself, as BGP is not capable of providing the transport for itself.
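
If you want to check a config for this mistake, the rule can be sketched in a few lines of Python. This is a deliberately conservative check and only an assumption of how one might do it: conflicting_networks is a hypothetical helper, and it flags every network statement that contains the update-source address, even a shorter prefix that would not actually override the IGP route.

```python
import ipaddress


def conflicting_networks(update_source, network_statements):
    """Return the network statements that contain the update-source address."""
    addr = ipaddress.ip_address(update_source)
    return [
        prefix
        for prefix in network_statements
        if addr in ipaddress.ip_network(prefix)
    ]


# R2 in this lab: lo0 (2.2.2.2) is the update-source of the eBGP session.
r2_networks = ["20.20.20.20/32", "2.2.2.2/32"]
print(conflicting_networks("2.2.2.2", r2_networks))
```

For R2 the check flags 2.2.2.2/32, which is exactly the network statement that caused the flapping session.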

Feel free to comment and have a nice day!

About markus.wirth

Living near Limburg in Germany, working as a Network Engineer around Frankfurt am Main.

4 Responses to bgp race condition problem

  1. minds-eye says:

    Thanks for this. I was under the impression that a race condition was caused by learning a peer address through *ANY* bgp session (i.e. the underlying connectivity between two peers must be a true IGP/connected/static route). Seems this is not the case. In my organisation, our remote networks are connected via provider MPLS/eBGP cloud and I was doubtful that we could run iBGP between our “CE” routers over this. It appears though that iBGP sessions can use eBGP as the underlying “IGP”. Weird!

  2. markus.wirth says:

    Yes, of course you can use eBGP as the routing protocol between different routers and then run iBGP between their loopbacks. When you exchange the loopbacks (or whatever addresses are the update-source for the iBGP session) via eBGP, that's okay, because eBGP has a lower AD than iBGP. But be careful with the network command here: a network statement under your BGP process injects the routes into both iBGP and eBGP.
    Maybe you could lab it up with GNS3?

  3. Akash says:

    What a wonderful post this is…..just excellent….I spent almost a day solving this scenario but couldn't find the reason for the peering loopbacks losing reachability to each other and the time-out of the eBGP peering….. Thanks for the post once again……

  4. Adel Alkhafaji says:

    Good Explanation, Thanks I look for more information in this site
