VRF-based path selection
In this post I will show you how to use different paths between your PE routers on a per-VRF basis.
This is very useful if you have customers you want to “steer” away from your normal traffic flow between PE routers.
For example, this could be due to certain SLAs.
I will be using the following topology to demonstrate how this can be done:
A short walkthrough of the topology is in order.
In the service provider core we have 4 routers: R3, XRv-1, XRv-2 and R4. R3 and R4 are IOS-XE based routers, and XRv-1 and XRv-2 are, as the names imply, IOS-XR routers. There is no significance attached to the fact that I’m running two XR routers. It’s simply how I could build the required topology.
The service provider is running OSPF as the IGP, with R3 and R4 being the PE routers for an MPLS L3 VPN service. On top of that, LDP is being used to build the required LSPs. The IGP has been modified to prefer the northbound path (R3 -> XRv-1 -> R4) by increasing the cost of the links on the southbound path (R3 -> XRv-2 -> R4) to 100.
So by default, traffic between R3 and R4 will flow northbound.
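For reference, raising the cost on R3’s southbound link could look roughly like the sketch below. This is not copied from the lab: the interface name GigabitEthernet1.320 is taken from the outputs later in this post, and I’m assuming the cost is set directly on the interface. The other southbound links (on XRv-2 and R4) would need an equivalent change.

```
! Sketch only: R3's link towards XRv-2 (assumed to be GigabitEthernet1.320)
interface GigabitEthernet1.320
 ! Raise the OSPF cost so the northbound path via XRv-1 is preferred
 ip ospf cost 100
```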
We can easily verify this:
R3#traceroute 4.4.4.4
Type escape sequence to abort.
Tracing the route to 4.4.4.4
VRF info: (vrf in name/id, vrf out name/id)
  1 10.3.10.10 [MPLS: Label 16005 Exp 0] 16 msec 1 msec 1 msec
  2 10.4.10.4 1 msec *  5 msec
And the reverse path is the same:
R4#traceroute 3.3.3.3
Type escape sequence to abort.
Tracing the route to 3.3.3.3
VRF info: (vrf in name/id, vrf out name/id)
  1 10.4.10.10 [MPLS: Label 16000 Exp 0] 3 msec 2 msec 0 msec
  2 10.3.10.3 1 msec *  5 msec
Besides confirming that traffic flows the desired way, we can see that we are using label switching between the loopbacks. Exactly what we want in this type of setup.
On the customer side, we have 2 customers, Customer A and Customer B. Each of them has 2 sites, one behind R3 and one behind R4. Pretty simple. They are all running EIGRP between the CEs and the PEs.
Beyond this we have MPLS Traffic Engineering running in the service provider core as well. Specifically, we are running a tunnel in each direction between R3’s loopback200 (33.33.33.33/32) and R4’s loopback200 (44.44.44.44/32). This has been accomplished by configuring an explicit path and a TE tunnel on both R3 and R4.
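As a rough sketch, the explicit path on R3 would be defined something like this. The path name and hops match the verification output in this post; the TE enablement lines are my assumption of a typical setup (OSPF process 1, area 0 and Loopback100 as the TE router-id are illustrative, not taken from the lab).

```
! Explicit path matching the hops shown in the verification output
ip explicit-path name NEW-R3-TO-R4 enable
 next-address 10.3.20.20
 next-address 10.4.20.4
!
! Assumed supporting configuration: enable MPLS TE globally,
! on the core-facing interface and in the IGP
mpls traffic-eng tunnels
!
interface GigabitEthernet1.320
 mpls traffic-eng tunnels
 ! Commonly configured alongside TE; may not be strictly needed for a 0 kbps tunnel
 ip rsvp bandwidth
!
router ospf 1
 mpls traffic-eng router-id Loopback100
 mpls traffic-eng area 0
```

R4 would mirror this with its own path (NEW-R4-TO-R3) and hops.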
Let’s verify the tunnel configuration on both:
On R3:
R3#sh ip expl
PATH NEW-R3-TO-R4 (strict source route, path complete, generation 8)
    1: next-address 10.3.20.20
    2: next-address 10.4.20.4
R3#sh run int tunnel10
Building configuration...

Current configuration : 180 bytes
!
interface Tunnel10
 ip unnumbered Loopback200
 tunnel mode mpls traffic-eng
 tunnel destination 10.4.20.4
 tunnel mpls traffic-eng path-option 10 explicit name NEW-R3-TO-R4
end
And on R4:
R4#sh ip expl
PATH NEW-R4-TO-R3 (strict source route, path complete, generation 4)
    1: next-address 10.4.20.20
    2: next-address 10.3.20.3
R4#sh run int tun10
Building configuration...

Current configuration : 180 bytes
!
interface Tunnel10
 ip unnumbered Loopback200
 tunnel mode mpls traffic-eng
 tunnel destination 10.3.20.3
 tunnel mpls traffic-eng path-option 10 explicit name NEW-R4-TO-R3
end
On top of that we have configured a static route on both R3 and R4, to steer traffic for each other’s loopback200 down the tunnel:
R3#sh run | incl ip route
ip route 44.44.44.44 255.255.255.255 Tunnel10
R4#sh run | incl ip route
ip route 33.33.33.33 255.255.255.255 Tunnel10
Resulting in the following RIB entries:
R3#sh ip route 44.44.44.44
Routing entry for 44.44.44.44/32
  Known via "static", distance 1, metric 0 (connected)
  Routing Descriptor Blocks:
  * directly connected, via Tunnel10
      Route metric is 0, traffic share count is 1
R4#sh ip route 33.33.33.33
Routing entry for 33.33.33.33/32
  Known via "static", distance 1, metric 0 (connected)
  Routing Descriptor Blocks:
  * directly connected, via Tunnel10
      Route metric is 0, traffic share count is 1
And to test that we are actually using the southbound path (R3 -> XRv-2 -> R4), let’s traceroute between the loopback200 interfaces:
on R3:
R3#traceroute 44.44.44.44 so loopback200
Type escape sequence to abort.
Tracing the route to 44.44.44.44
VRF info: (vrf in name/id, vrf out name/id)
  1 10.3.20.20 [MPLS: Label 16007 Exp 0] 4 msec 2 msec 1 msec
  2 10.4.20.4 1 msec *  3 msec
and on R4:
R4#traceroute 33.33.33.33 so loopback200
Type escape sequence to abort.
Tracing the route to 33.33.33.33
VRF info: (vrf in name/id, vrf out name/id)
  1 10.4.20.20 [MPLS: Label 16008 Exp 0] 4 msec 1 msec 1 msec
  2 10.3.20.3 1 msec *  3 msec
This verifies that we have our two unidirectional tunnels and that communication between the loopback200 interfaces flows through the southbound path using our TE tunnels.
So let’s take a look at the very simple BGP PE configuration on both R3 and R4:
R3:
router bgp 100
 bgp log-neighbor-changes
 no bgp default ipv4-unicast
 neighbor 4.4.4.4 remote-as 100
 neighbor 4.4.4.4 update-source Loopback100
 !
 address-family ipv4
 exit-address-family
 !
 address-family vpnv4
  neighbor 4.4.4.4 activate
  neighbor 4.4.4.4 send-community extended
 exit-address-family
 !
 address-family ipv4 vrf A
  redistribute eigrp 100
 exit-address-family
 !
 address-family ipv4 vrf B
  redistribute eigrp 100
 exit-address-family
and R4:
router bgp 100
 bgp log-neighbor-changes
 no bgp default ipv4-unicast
 neighbor 3.3.3.3 remote-as 100
 neighbor 3.3.3.3 update-source Loopback100
 !
 address-family ipv4
 exit-address-family
 !
 address-family vpnv4
  neighbor 3.3.3.3 activate
  neighbor 3.3.3.3 send-community extended
 exit-address-family
 !
 address-family ipv4 vrf A
  redistribute eigrp 100
 exit-address-family
 !
 address-family ipv4 vrf B
  redistribute eigrp 100
 exit-address-family
From this configuration, we can see that we are using the loopback100 interfaces for the BGP peering. As routing updates come in from the remote PE, the next hop is set to that PE’s loopback100 address. The transport label will therefore be the one leading to that loopback100 interface.
A traceroute from R1’s loopback0 interface to R5’s loopback0 interface will show us the path that traffic between the two sites in VRF A (Customer A) takes:
R1:
R1#traceroute 5.5.5.5 so loo0
Type escape sequence to abort.
Tracing the route to 5.5.5.5
VRF info: (vrf in name/id, vrf out name/id)
  1 10.1.3.3 1 msec 1 msec 0 msec
  2 10.3.10.10 [MPLS: Labels 16005/408 Exp 0] 6 msec 1 msec 10 msec
  3 10.4.5.4 [MPLS: Label 408 Exp 0] 15 msec 22 msec 17 msec
  4 10.4.5.5 18 msec *  4 msec
And let’s compare that to what R3 uses as the transport label to reach R4’s loopback100 interface:
R3#sh mpls for
Local      Outgoing   Prefix           Bytes Label   Outgoing   Next Hop
Label      Label      or Tunnel Id     Switched      interface
300        Pop Label  10.10.10.10/32   0             Gi1.310    10.3.10.10
301        Pop Label  10.4.10.0/24     0             Gi1.310    10.3.10.10
302        Pop Label  20.20.20.20/32   0             Gi1.320    10.3.20.20
303        16004      10.4.20.0/24     0             Gi1.310    10.3.10.10
304   [T]  Pop Label  44.44.44.44/32   0             Tu10       point2point
305        16005      4.4.4.4/32       0             Gi1.310    10.3.10.10
310        No Label   1.1.1.1/32[V]    2552          Gi1.13     10.1.3.1
311        No Label   10.1.3.0/24[V]   0             aggregate/A
312        No Label   2.2.2.2/32[V]    2552          Gi1.23     10.2.3.2
313        No Label   10.2.3.0/24[V]   0             aggregate/B
We can see that this matches: the transport label is 16005 (towards XRv-1), which is the northbound path.
This raises the question: how do we steer our traffic through the southbound path using loopback200 instead, when the peering is between the loopback100 interfaces?
Well, thankfully IOS has it covered. Under the VRF configuration for Customer B (VRF B), we have the option of setting which loopback interface is used as the next hop in updates sent to the remote PE:
On R3:
vrf definition B
 rd 100:2
 !
 address-family ipv4
  route-target export 100:2
  route-target import 100:2
  bgp next-hop Loopback200
 exit-address-family
and the same on R4:
vrf definition B
 rd 100:2
 !
 address-family ipv4
  route-target export 100:2
  route-target import 100:2
  bgp next-hop Loopback200
 exit-address-family
This causes the BGP updates to contain the “correct” next-hop:
R3:
R3#sh bgp vpnv4 uni vrf B | beg Route Dis
Route Distinguisher: 100:2 (default for vrf B)
 *>   2.2.2.2/32       10.2.3.2            130816         32768 ?
 *>i  6.6.6.6/32       44.44.44.44         130816    100      0 ?
 *>   10.2.3.0/24      0.0.0.0                  0         32768 ?
 *>i  10.4.6.0/24      44.44.44.44              0    100      0 ?
44.44.44.44/32 being the loopback200 of R4, and on R4:
R4#sh bgp vpnv4 uni vrf B | beg Route Dis
Route Distinguisher: 100:2 (default for vrf B)
 *>i  2.2.2.2/32       33.33.33.33         130816    100      0 ?
 *>   6.6.6.6/32       10.4.6.6            130816         32768 ?
 *>i  10.2.3.0/24      33.33.33.33              0    100      0 ?
 *>   10.4.6.0/24      0.0.0.0                  0         32768 ?
Let’s check whether this actually works:
R2#traceroute 6.6.6.6 so loo0
Type escape sequence to abort.
Tracing the route to 6.6.6.6
VRF info: (vrf in name/id, vrf out name/id)
  1 10.2.3.3 1 msec 1 msec 0 msec
  2 10.3.20.20 [MPLS: Labels 16007/409 Exp 0] 4 msec 1 msec 10 msec
  3 10.4.6.4 [MPLS: Label 409 Exp 0] 15 msec 16 msec 17 msec
  4 10.4.6.6 19 msec *  4 msec
Excellent! We can see that we are indeed using the southbound path. To make sure we are using the tunnel, note the transport label of 16007, and compare that to the tunnel’s outgoing label:
R3:
R3#sh mpls traffic-eng tun tunnel 10
Name: R3_t10                              (Tunnel10) Destination: 10.4.20.4
  Status:
    Admin: up         Oper: up     Path: valid       Signalling: connected
    path option 10, type explicit NEW-R3-TO-R4 (Basis for Setup, path weight 200)
  Config Parameters:
    Bandwidth: 0        kbps (Global)  Priority: 7  7   Affinity: 0x0/0xFFFF
    Metric Type: TE (default)
    AutoRoute: disabled LockDown: disabled Loadshare: 0 [0] bw-based
    auto-bw: disabled
  Active Path Option Parameters:
    State: explicit path option 10 is active
    BandwidthOverride: disabled  LockDown: disabled  Verbatim: disabled
  InLabel  :  -
  OutLabel : GigabitEthernet1.320, 16007
  Next Hop : 10.3.20.20
I have deleted a lot of non-relevant output, but pay attention to the OutLabel, which is indeed 16007.
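If you want a more direct check than comparing labels by hand, a CEF lookup inside each VRF should show the label stack and outgoing interface that R3 imposes for a given prefix. The commands below are a sketch of what I would run on R3. I have not included output since none was captured for this post, and the prefixes are simply the remote CE loopbacks used in the traceroutes above.

```
! VRF A: expected to resolve via the LDP path towards R4's loopback100
show ip cef vrf A 5.5.5.5 detail
!
! VRF B: expected to resolve via Tunnel10 towards R4's loopback200
show ip cef vrf B 6.6.6.6 detail
!
! The BGP next hop for an individual VPNv4 prefix can be checked with
show bgp vpnv4 unicast vrf B 6.6.6.6
```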
So that was a quick walkthrough of how easy it is to accomplish the stated goal once you know about that nifty IOS command.
I hope it’s been useful to you.
Take Care!