Sunday, November 20, 2016

CCIE SPv4 - MPLS - LDP Synchronization

Software versions:
IOS XE 15.5
IOS XR 5.3

The topology for this demo:
In modern networks, circa 2016, IGPs converge pretty quickly compared to yesteryear. That means that if your IGP can re-route around a failure or converge faster than LDP can, you could have forwarding issues. Case in point: we're running OSPF, which converges quickly. It may converge before LDP does and start forwarding traffic as plain IP instead of labeled traffic. That can break an L3 or L2 VPN.

To remedy this potential issue, MPLS LDP synchronization is used to prevent that from happening. When it's enabled, IGP and LDP work in tandem. IGP checks with LDP to see if it has converged on a link; if it has, label forwarding is on. If they're not in sync, LDP signals IGP to advertise a max-metric on that particular link so that traffic is forced to route around it. The one catch, and it's part of why we want this enabled, is that IGP may be required on the very link that LDP would use to form an LSP over. Chicken before the egg, cart before the horse, pants before your underwear: LDP simply can't work without IGP already running.
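The rule described above can be sketched in a few lines of Python. This is only a rough model of the behavior, not how IOS or XR implement it; the function name, parameters, and constant are made up for illustration. The value 65535 is the max link metric OSPF advertises, which shows up later in the LSA output.

```python
# Rough model of the LDP-IGP sync rule. Names here are illustrative only,
# they are not taken from any Cisco software.
OSPF_MAX_LINK_METRIC = 65535  # max metric OSPF can put on a link


def advertised_metric(configured_cost: int, sync_enabled: bool, ldp_synced: bool) -> int:
    """Return the cost the IGP would advertise for one link."""
    # With sync enabled and LDP not yet converged, the link is made
    # as unattractive as possible so traffic routes around it.
    if sync_enabled and not ldp_synced:
        return OSPF_MAX_LINK_METRIC
    # Otherwise (sync disabled, or LDP is in sync) use the normal cost.
    return configured_cost


if __name__ == "__main__":
    for sync_enabled, ldp_synced in [(True, True), (True, False), (False, False)]:
        metric = advertised_metric(1, sync_enabled, ldp_synced)
        print(f"sync_enabled={sync_enabled} ldp_synced={ldp_synced} metric={metric}")
```

The last case (sync disabled, LDP down) is the dangerous one: the link keeps its normal cost, so the IGP happily sends unlabeled traffic over it.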

So let's take a use case: R3 and R1 need to be able to talk to each other. I know it wouldn't be effective to have R1 as a PE, but just go with it. Pretend we may use PE to P MPLS TE tunnels later, OK? We'll get LDP sync working on the left six routers: R1, R3, R4, XR1, XR4 and XR5. Then we need to make it look like IGP is working while LDP is not. Options include a static route to null0, an ACL blocking TCP/UDP 646, or the reverse case of LDP being functional but not IGP, using a passive interface for OSPF. For this demo we need IGP up and LDP down.

First let's verify that R3 can in fact reach R1 via MPLS.
R3#traceroute 192.168.1.1 source lo0 num
Type escape sequence to abort.
Tracing the route to 192.168.1.1
VRF info: (vrf in name/id, vrf out name/id)
  1 10.3.4.4 [MPLS: Label 30 Exp 0] 2 msec 1 msec 1 msec
  2 10.15.4.15 1 msec 1 msec 4 msec
  3 10.1.15.1 13 msec *  3 msec

Alright, now we need to enable MPLS LDP sync and get that functional, then break LDP, then IGP, and see what this all looks like.

IOS
router ospf 1
 mpls ldp sync

XR
router ospf 1
 area 0
  mpls ldp sync

Let's verify it from R1.
R1#sh mpls ldp igp sync
    GigabitEthernet1.111:
        LDP configured; LDP-IGP Synchronization enabled.
        Sync status: sync achieved; peer reachable.
        Sync delay time: 0 seconds (0 seconds left)
        IGP holddown time: infinite.
        Peer LDP Ident: 192.168.1.11:0
        IGP enabled: OSPF 1
    GigabitEthernet1.115:
        LDP configured; LDP-IGP Synchronization enabled.
        Sync status: sync achieved; peer reachable.
        Sync delay time: 0 seconds (0 seconds left)
        IGP holddown time: infinite.
        Peer LDP Ident: 192.168.1.15:0
        IGP enabled: OSPF 1

You'll see "LDP configured" and "sync achieved", meaning LDP sync was enabled under the OSPF process and both IGP and LDP are in sync on those links.

Let's check this from R3.
R3#show mpls ldp igp sync
    GigabitEthernet1.143:
        LDP configured; LDP-IGP Synchronization enabled.
        Sync status: sync not achieved; peer reachable.
        Sync delay time: 0 seconds (0 seconds left)
        IGP holddown time: infinite.
        IGP enabled: OSPF 1
    GigabitEthernet1.34:
        LDP configured; LDP-IGP Synchronization enabled.
        Sync status: sync achieved; peer reachable.
        Sync delay time: 0 seconds (0 seconds left)
        IGP holddown time: infinite.
        Peer LDP Ident: 192.168.1.4:0
        IGP enabled: OSPF 1

The link from R3 to XR4 is not sync'd because the R3 side is admin'd down. A no shut on this interface will fix it.
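If you're checking a lot of interfaces, eyeballing this output gets old fast. Here's a minimal Python sketch that pulls the unsynced interfaces out of pasted "show mpls ldp igp sync" text. It assumes the output looks exactly like the IOS XE 15.5 output above; the function name is made up, and other releases may format things differently.

```python
import re
from typing import List

# Output pasted from R3 above (IOS XE 15.5 format assumed).
SAMPLE = """\
    GigabitEthernet1.143:
        LDP configured; LDP-IGP Synchronization enabled.
        Sync status: sync not achieved; peer reachable.
        Sync delay time: 0 seconds (0 seconds left)
        IGP holddown time: infinite.
        IGP enabled: OSPF 1
    GigabitEthernet1.34:
        LDP configured; LDP-IGP Synchronization enabled.
        Sync status: sync achieved; peer reachable.
        Sync delay time: 0 seconds (0 seconds left)
        IGP holddown time: infinite.
        Peer LDP Ident: 192.168.1.4:0
        IGP enabled: OSPF 1
"""


def unsynced_interfaces(output: str) -> List[str]:
    """Return interfaces whose sync status is 'sync not achieved'."""
    unsynced = []
    current = None
    for line in output.splitlines():
        # An interface header is a single token followed by a colon.
        header = re.match(r"^\s*(\S+):\s*$", line)
        if header:
            current = header.group(1)
            continue
        if current and "Sync status: sync not achieved" in line:
            unsynced.append(current)
    return unsynced


if __name__ == "__main__":
    print(unsynced_interfaces(SAMPLE))
```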

Now we need to break LDP so we can see the result. R3's original trace to R1 went out towards R4, so let's break that session. We'll configure a static route to null0 for R4's loopback and then clear the LDP neighbor.

R3(config)#ip route 192.168.1.4 255.255.255.255 null 0
R3(config)#end
R3#clear mpls ldp neighbor 192.168.1.4
R3#
Nov 20 01:23:15.920: %LDP-5-CLEAR_NBRS: Clear LDP neighbors (192.168.1.4) by console

Nov 20 01:23:15.921: %LDP-5-NBRCHG: LDP Neighbor 192.168.1.4:0 (1) is DOWN (User cleared session manually)

Let's see if that broke our connection.
R3#show mpls ldp igp sync
    GigabitEthernet1.34:
        LDP configured; LDP-IGP Synchronization enabled.
        Sync status: sync not achieved; peer reachable.
        Sync delay time: 0 seconds (0 seconds left)
        IGP holddown time: infinite.
        IGP enabled: OSPF 1

OK, well that's broken now. Let's check the RIB and FIB to see if that affected anything.

R3#sho ip cef 192.168.1.1
192.168.1.1/32
  nexthop 10.14.3.14 GigabitEthernet1.143 label 24020

R3#sh ip route 192.168.1.1
Routing entry for 192.168.1.1/32
  Known via "ospf 1", distance 110, metric 4, type intra area
  Last update from 10.14.3.14 on GigabitEthernet1.143, 00:07:40 ago
  Routing Descriptor Blocks:
  * 10.14.3.14, from 192.168.1.1, 00:07:40 ago, via GigabitEthernet1.143
      Route metric is 4, traffic share count is 1

R3#traceroute 192.168.1.1 source lo0 num
Type escape sequence to abort.
Tracing the route to 192.168.1.1
VRF info: (vrf in name/id, vrf out name/id)
  1 10.14.3.14 [MPLS: Label 24020 Exp 0] 5 msec 3 msec 3 msec
  2 10.11.14.11 [MPLS: Label 24000 Exp 0] 4 msec 3 msec 3 msec
  3 10.1.11.1 10 msec

As you can see, it really didn't do anything to reachability beyond changing our forwarding path.
Now let's disable IGP sync on the interface towards R4 and shut down the link towards XR4. We should then follow a native IP path to R1.

R3(config)#int g1.34
R3(config-subif)#no mpls ldp igp sync
R3(config)#int g1.143
R3(config-subif)#shut

R3#show mpls ldp igp sync
    GigabitEthernet1.34:
        LDP configured; LDP-IGP Synchronization not enabled.

R3#traceroute 192.168.1.1 source lo0 num
Type escape sequence to abort.
Tracing the route to 192.168.1.1
VRF info: (vrf in name/id, vrf out name/id)
  1 10.3.4.4 2 msec 1 msec 0 msec
  2 10.15.4.15 2 msec 0 msec 1 msec
  3 10.1.15.1 2 msec *  3 msec

To really see this command in action, we will debug the process. We also need to flap G1.34.
R3#debug mpls ldp igp sync

This output indicates that the G1.34 interface was enabled, but the peer isn't reachable yet.
LDP-SYNC: Gi1.34: queue swif_updown, set INTFADDR_PENDING.
LDP-SYNC: Gi1.34: process swif_updown, clear INTFADDR_PENDING.
LDP-SYNC: Gi1.34, 192.168.1.4: Peer unreachable; set LDP_CTX_HANDLE_ROUTEUP
LDP-SYNC: Gi1.34, OSPF 1: notify status (required, not achieved, no delay, holddown infinite) internal status (not achieved, timer not running)

This output indicates that OSPF came up and the peer is reachable, but LDP has not converged.
Nov 20 01:36:39.333: %OSPF-5-ADJCHG: Process 1, Nbr 192.168.1.4 on GigabitEthernet1.34 from LOADING to FULL, Loading Done
LDP-SYNC: Peer became reachable on Gi1.34, peer 192.168.1.4
LDP-SYNC: Gi1.34, OSPF 1: notify status (required, not achieved, delay, holddown infinite) internal status (not achieved, timer not running)
ldp: Check routeup needs; clear routeup needs
LDP-SYNC: Gi1.34: No session or session has not send initial update, ignore adj joining event.

This output indicates that LDP has peered and that sync has been achieved
Nov 20 01:36:54.356: %LDP-5-NBRCHG: LDP Neighbor 192.168.1.4:0 (1) is UP
LDP-SYNC: Gi1.34: session 192.168.1.4:0 came up, sync_achieved up

LDP-SYNC: Gi1.34, OSPF 1: notify status (required, achieved, no delay, holddown infinite) internal status (achieved, timer not running)
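The "notify status" lines are the ones worth watching in this debug, since they show what LDP is telling the IGP. Here's a small Python sketch that splits one of those lines into fields. The regex and field names are my own, based only on the debug lines shown in this post; the delay and holddown fields are kept as raw strings rather than interpreted.

```python
import re

# Pattern based only on the debug lines in this post; not an official format.
NOTIFY_RE = re.compile(
    r"LDP-SYNC: (?P<intf>\S+), (?P<igp>.+?): notify status "
    r"\((?P<required>[^,]+), (?P<achieved>[^,]+), (?P<delay>[^,]+), "
    r"holddown (?P<holddown>[^)]+)\)"
)


def parse_notify(line: str):
    """Split an LDP-SYNC 'notify status' debug line into a dict, or None."""
    m = NOTIFY_RE.search(line)
    if m is None:
        return None
    return {
        "interface": m.group("intf"),
        "igp": m.group("igp"),
        "sync_required": m.group("required") == "required",
        "sync_achieved": m.group("achieved") == "achieved",
        "delay": m.group("delay"),        # raw text, e.g. "no delay"
        "holddown": m.group("holddown"),  # raw text, e.g. "infinite"
    }


if __name__ == "__main__":
    line = (
        "LDP-SYNC: Gi1.34, OSPF 1: notify status (required, achieved, "
        "no delay, holddown infinite) internal status (achieved, timer not running)"
    )
    print(parse_notify(line))
```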

We will flap the link one more time, this time looking at the OSPF database to see what metric R3 advertises to its peers. We'll keep the debug running to track the status while we check the database. The interface is shut and then no shut.

LDP-SYNC: Gi1.34: queue swif_updown, set INTFADDR_PENDING.
LDP-SYNC: Gi1.34: process swif_updown, clear INTFADDR_PENDING.
LDP-SYNC: Gi1.34, 192.168.1.4: Peer unreachable

Link connected to: a Stub Network
     (Link ID) Network/subnet number: 10.3.4.0
     (Link Data) Network Mask: 255.255.255.0
      Number of MTID metrics: 0

       TOS 0 Metrics: 1

%OSPF-5-ADJCHG: Process 1, Nbr 192.168.1.4 on GigabitEthernet1.34 from LOADING to FULL, Loading Done

Link connected to: a Transit Network
     (Link ID) Designated Router address: 10.3.4.4
     (Link Data) Router Interface address: 10.3.4.3
      Number of MTID metrics: 0

       TOS 0 Metrics: 65535

LDP-SYNC: Peer became reachable on Gi1.34, peer 192.168.1.4
LDP-SYNC: Gi1.34, OSPF 1: notify status
%LDP-5-NBRCHG: LDP Neighbor 192.168.1.4:0 (1) is UP

Link connected to: a Transit Network
     (Link ID) Designated Router address: 10.3.4.4
     (Link Data) Router Interface address: 10.3.4.3
      Number of MTID metrics: 0

       TOS 0 Metrics: 1

LDP-SYNC: Gi1.34: session 192.168.1.4:0 came up, sync_achieved up
LDP-SYNC: Gi1.34, OSPF 1: notify status

You can see that between the OSPF adjacency coming up and the LDP session coming up, the self-originated metric was the max of 65535, which is expected. Then, when LDP came up, sync was achieved, OSPF was notified, and the metric went back to 1.

There are a couple of minor features, like the holddown time and a delay timer.

The holddown time is infinite by default, which is a good value so that there is never a sync issue. This can be modified. The value is configured in milliseconds, so if we set it to something small, like 5000 (5 seconds), then IGP will advertise a max metric for 5 seconds and then start forwarding.

R3(config)#mpls ldp igp sync holddown 5000
R3(config)#
LDP-SYNC: Gi1.34, OSPF 1: notify status (required, achieved, no delay, holddown 5000) internal status (achieved, timer not running)

The ldp igp sync delay timer tells LDP how long to wait before declaring a link sync'd after the link comes back up. By default it's set to 0 seconds, but it can be configured between 5 and 60 seconds.

R3(config-subif)#mpls ldp igp sync delay 15

Nov 20 02:04:02.573: LDP-SYNC: Gi1.34, OSPF 1: notify status (required, not achieved, no delay, holddown 5000) internal status (not achieved, timer not running)
Nov 20 02:04:07.102: %OSPF-5-ADJCHG: Process 1, Nbr 192.168.1.4 on GigabitEthernet1.34 from LOADING to FULL, Loading Done
Nov 20 02:04:11.343: LDP-SYNC: Peer became reachable on Gi1.34, peer 192.168.1.4
Nov 20 02:04:11.343: LDP-SYNC: Gi1.34, OSPF 1: notify status (required, not achieved, delay, holddown 5000) internal status (not achieved, timer not running)
Nov 20 02:04:11.344: ldp: Check routeup needs; clear routeup needs
Nov 20 02:04:20.194: LDP-SYNC: Gi1.34: No session or session has not send initial update, ignore adj joining event.
Nov 20 02:04:20.194: %LDP-5-NBRCHG: LDP Neighbor 192.168.1.4:0 (1) is UP
Nov 20 02:04:20.194: LDP-SYNC: Gi1.34: session 192.168.1.4:0 came up, sync_achieved up
Nov 20 02:04:20.194: LDP-SYNC: Gi1.34: Delay notifying IGP of sync achieved for 15 seconds
Nov 20 02:04:35.194: LDP-SYNC: Gi1.34: Delay timer expired, notify IGP of sync achieved
Nov 20 02:04:35.194: LDP-SYNC: Gi1.34, OSPF 1: notify status (required, achieved, no delay, holddown 5000) internal status (achieved, timer not running)
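To sanity check the timestamps above: the LDP session came up at 02:04:20.194, and with a 15 second delay the IGP should be notified at 02:04:35.194, which is exactly what the debug shows. Here's that arithmetic as a quick Python sketch. The function name is made up, and it only handles the time-of-day part of the log timestamp.

```python
from datetime import datetime, timedelta


def igp_notify_time(session_up: str, delay_seconds: int) -> str:
    """Add the sync delay to the LDP session-up time (HH:MM:SS.mmm)."""
    fmt = "%H:%M:%S.%f"
    notify = datetime.strptime(session_up, fmt) + timedelta(seconds=delay_seconds)
    # strftime gives microseconds; trim to milliseconds like the router log.
    return notify.strftime(fmt)[:-3]


if __name__ == "__main__":
    # Session-up time and delay taken from the debug output above.
    print(igp_notify_time("02:04:20.194", 15))
```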

Thanks for stopping by!
Rob Riker, CCIE #50693
