mpls: pseudowire emulation / any transport over mpls – interconnecting sites via layer 2 – (part 1)

Hi everybody!
Today I am going to introduce a new challenge that deals with MPLS and pseudowire emulation.

Challenge:
– connect two sites via an MPLS VPN backbone at layer 2

Given:
– a “SP” network without MPLS capabilities (yet), only layer 3 routing

Preconfigured:
– IP addresses of the core routers
– IS-IS as the routing protocol for the 172.16.x.x networks and the loopback interfaces

So let's start:
First of all we ping between the loopbacks and do a quick routing table lookup on the core routers to see if we have full reachability between the loopbacks of the cores.

CORE3#ping 1.1.1.1 so lo0

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 1.1.1.1, timeout is 2 seconds:
Packet sent with a source address of 3.3.3.3
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 56/75/124 ms
CORE3#ping 2.2.2.2 so lo0

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 2.2.2.2, timeout is 2 seconds:
Packet sent with a source address of 3.3.3.3
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 12/59/92 ms
CORE3#

CORE3#sh ip route
Codes: C - connected, S - static, R - RIP, M - mobile, B - BGP
D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
E1 - OSPF external type 1, E2 - OSPF external type 2
i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
ia - IS-IS inter area, * - candidate default, U - per-user static route
o - ODR, P - periodic downloaded static route

Gateway of last resort is not set

1.0.0.0/32 is subnetted, 1 subnets
i L1 1.1.1.1 [115/30] via 172.16.32.1, FastEthernet0/0
2.0.0.0/32 is subnetted, 1 subnets
i L1 2.2.2.2 [115/20] via 172.16.32.1, FastEthernet0/0
3.0.0.0/32 is subnetted, 1 subnets
C 3.3.3.3 is directly connected, Loopback0
172.16.0.0/30 is subnetted, 2 subnets
C 172.16.32.0 is directly connected, FastEthernet0/0
i L1 172.16.21.0 [115/20] via 172.16.32.1, FastEthernet0/0

Looks good. The next step is choosing the technology we want to use. In our case we will use “pseudowire emulation” or “any transport over mpls” (AToM) or whatever it is called :).
That technology requires MPLS to be enabled within the service provider network, so we will enable it.

NOTE: We don't need MP-BGP here, because PEM (pseudowire emulation) only needs the traffic to be encapsulated into MPLS to be transported end to end. There is no need to exchange VRF information via BGP extended communities.


CORE1(config)#int fa0/0
CORE1(config-if)#mpls ip
CORE1(config-if)#
!
CORE2(config)#int fa0/0
CORE2(config-if)#mpls ip
CORE2(config-if)#int fa0/1
CORE2(config-if)#mpls ip
!
CORE3(config)#int fa0/0
CORE3(config-if)#mpls ip

On CORE2 we check the MPLS interfaces and the LDP neighbor table to see whether the adjacencies have been built.

CORE2#sh mpls int
Interface              IP            Tunnel   Operational
FastEthernet0/0        Yes (ldp)     No       Yes
FastEthernet0/1        Yes (ldp)     No       Yes
CORE2#sh mpls ldp neigh
Peer LDP Ident: 1.1.1.1:0; Local LDP Ident 2.2.2.2:0
TCP connection: 1.1.1.1.646 - 2.2.2.2.47369
State: Oper; Msgs sent/rcvd: 9/9; Downstream
Up time: 00:01:21
LDP discovery sources:
FastEthernet0/0, Src IP addr: 172.16.21.1
Addresses bound to peer LDP Ident:
172.16.21.1 1.1.1.1
Peer LDP Ident: 3.3.3.3:0; Local LDP Ident 2.2.2.2:0
TCP connection: 3.3.3.3.43937 - 2.2.2.2.646
State: Oper; Msgs sent/rcvd: 8/8; Downstream
Up time: 00:00:54
LDP discovery sources:
FastEthernet0/1, Src IP addr: 172.16.32.2
Addresses bound to peer LDP Ident:
172.16.32.2 3.3.3.3

Looks good! Both LDP sessions are in state “Oper” and messages have been exchanged between the machines.
Now we will create a “pseudowire class” to set the method (in our case MPLS) that should be used to encapsulate the packets.

NOTE: We only need to perform this step on the ingress and egress routers (where the emulation takes place). All other routers only need to perform label switching.

CORE1(config)#pseudowire-class PWC-SITE-1-2
CORE1(config-pw-class)#encapsulation mpls
!
CORE3(config)#pseudowire-class PWC-SITE-1-2
CORE3(config-pw-class)#encapsulation mpls

Now we are going to use those classes with the “xconnect” command under the interfaces that should get a layer 2 connection to each other. This will be fa0/1 on CORE1 and on CORE3.
At the interface level we also need to type in an IP address. That looks kind of strange, but it is actually the IP address of the peering router that holds the other side of the “virtual L2 line”. What makes sense here is to enter the loopback address of the corresponding machine.

CORE1(config-if-xconn)#int fa0/1
CORE1(config-if)#xconnect 3.3.3.3 12 pw-class PWC-SITE-1-2
!
CORE3(config-if-xconn)#int fa0/1
CORE3(config-if)#xconnect 1.1.1.1 12 pw-class PWC-SITE-1-2
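The two ends only differ in the peer address, so the config is easy to template. Here is a minimal Python sketch that renders the lines typed above for both routers. The function name `render_pe_config` and its parameters are made up for illustration; only the IOS commands themselves come from this lab.

```python
def render_pe_config(pw_class: str, interface: str, peer_loopback: str, vc_id: int) -> str:
    """Return the IOS lines used in this lab for one end of the pseudowire.

    Hypothetical helper: it only formats text, it does not talk to a router.
    """
    lines = [
        f"pseudowire-class {pw_class}",
        " encapsulation mpls",
        "!",
        f"interface {interface}",
        f" xconnect {peer_loopback} {vc_id} pw-class {pw_class}",
    ]
    return "\n".join(lines)


# Both ends use the same VC ID and point at each other's loopback.
ends = {
    "CORE1": render_pe_config("PWC-SITE-1-2", "FastEthernet0/1", "3.3.3.3", 12),
    "CORE3": render_pe_config("PWC-SITE-1-2", "FastEthernet0/1", "1.1.1.1", 12),
}

for router, config in ends.items():
    print(f"! {router}")
    print(config)
```

The point of keeping the VC ID in one place is that both sides have to use the same value (12 here), otherwise the VC will not come up.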

What now catches our eye is another LDP neighborship?!

*Mar 1 00:21:36.015: %LDP-5-NBRCHG: LDP Neighbor 1.1.1.1:0 (2) is UP

Well, what the hell is this about? It is a targeted LDP session (built with targeted hellos) that is needed to exchange pseudowire information between the two routers that terminate the pseudowire. The background is the following: CORE1 and CORE3 are not directly connected, so they form this LDP adjacency via unicast. They need it to tell each other which MPLS label identifies the VC (the pseudowire virtual circuit).

The packets are sent from CORE1 to CORE3 with two labels. The outer one (the transport label) carries the packet through the MPLS network and is stripped off at the end of that path. The second label then tells the receiving router which VC the frame belongs to.
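To make the two-label idea a bit more tangible, here is a small Python sketch that packs and parses an MPLS label stack using the 32-bit entry layout from RFC 3032 (20-bit label, 3-bit traffic class, bottom-of-stack bit, 8-bit TTL). The label values 18 and 19 match the imposed label stack that CORE3 reports further down; the TTL of 255 and the function names are just assumptions for the example.

```python
import struct


def label_stack_entry(label: int, bottom: bool, tc: int = 0, ttl: int = 255) -> bytes:
    """Pack one 32-bit MPLS label stack entry (layout as in RFC 3032)."""
    if not 0 <= label < 2**20:
        raise ValueError("label must fit in 20 bits")
    word = (label << 12) | ((tc & 0x7) << 9) | (int(bottom) << 8) | (ttl & 0xFF)
    return struct.pack("!I", word)


def parse_label_stack(data: bytes) -> list[tuple[int, bool]]:
    """Read (label, bottom_of_stack) pairs until the bottom-of-stack bit is set."""
    entries = []
    for offset in range(0, len(data), 4):
        (word,) = struct.unpack_from("!I", data, offset)
        label = word >> 12
        bottom = bool((word >> 8) & 1)
        entries.append((label, bottom))
        if bottom:
            break
    return entries


# Outer transport label first, VC label last (bottom of stack).
stack = label_stack_entry(18, bottom=False) + label_stack_entry(19, bottom=True)
print(len(stack), parse_label_stack(stack))
```

Each entry is 4 bytes, so the two labels together add 8 bytes in front of the encapsulated frame.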

The config is done now! Next we have to check if the VC (virtual circuit) is up.
NOTE: Here you can also see the “MPLS VC labels” that have been exchanged between CORE1 and CORE3. So don't be surprised when you sniff the MPLS packets and see the labels 19 and 20 show up in your Wireshark output although they do not show up when doing a “sh mpls ldp binding”.

CORE3#sh mpls l2transport vc det
Local interface: Fa0/1 up, line protocol up, Ethernet up
Destination address: 1.1.1.1, VC ID: 12, VC status: up
Next hop: 172.16.32.1
Output interface: Fa0/0, imposed label stack {18 19}
Create time: 00:08:01, last status change time: 00:07:59
Signaling protocol: LDP, peer 1.1.1.1:0 up
MPLS VC labels: local 20, remote 19
Group ID: local 0, remote 0
MTU: local 1500, remote 1500
Remote interface description:
Sequencing: receive disabled, send disabled
VC statistics:
packet totals: receive 57, send 57
byte totals: receive 6253, send 6253
packet drops: receive 0, seq error 0, send 0
!
CORE1#sh mpls l2transport vc det
Local interface: Fa0/1 up, line protocol up, Ethernet up
Destination address: 3.3.3.3, VC ID: 12, VC status: up
Next hop: 172.16.21.2
Output interface: Fa0/0, imposed label stack {19 20}
Create time: 00:29:18, last status change time: 00:08:21
Signaling protocol: LDP, peer 3.3.3.3:0 up
MPLS VC labels: local 19, remote 20
Group ID: local 0, remote 0
MTU: local 1500, remote 1500
Remote interface description:
Sequencing: receive disabled, send disabled
VC statistics:
packet totals: receive 191, send 189
byte totals: receive 21087, send 20924
packet drops: receive 0, seq error 0, send 10
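If you have to compare both ends more often, the check can be scripted. The following Python sketch pulls a few fields out of pasted “sh mpls l2transport vc det” text and verifies that the VC labels are mirrored (what one side calls local, the other calls remote). The regular expressions are written against the output shown above only; other IOS versions may format it differently, and the helper names are made up.

```python
import re


def parse_vc_detail(text: str) -> dict:
    """Extract peer, VC ID, status and VC labels from pasted CLI output."""
    patterns = {
        "peer": r"Destination address: ([\d.]+)",
        "vc_id": r"VC ID: (\d+)",
        "status": r"VC status: (\w+)",
        "local_label": r"MPLS VC labels: local (\d+)",
        "remote_label": r"MPLS VC labels: local \d+, remote (\d+)",
    }
    result = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, text)
        result[key] = match.group(1) if match else None
    return result


def labels_mirrored(a: dict, b: dict) -> bool:
    """True if both ends are up, share the VC ID and have swapped labels."""
    if None in a.values() or None in b.values():
        return False
    return (
        a["status"] == b["status"] == "up"
        and a["vc_id"] == b["vc_id"]
        and a["local_label"] == b["remote_label"]
        and a["remote_label"] == b["local_label"]
    )


core3 = parse_vc_detail(
    "Destination address: 1.1.1.1, VC ID: 12, VC status: up\n"
    "MPLS VC labels: local 20, remote 19\n"
)
core1 = parse_vc_detail(
    "Destination address: 3.3.3.3, VC ID: 12, VC status: up\n"
    "MPLS VC labels: local 19, remote 20\n"
)
print(labels_mirrored(core1, core3))
```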

Nice! That looks good. We will now configure some IP addresses on the routers SITE1 and SITE2 to see if we can get an L2 adjacency.

SITE1(config)#int fa0/0
SITE1(config-if)#ip add 10.10.10.1 255.255.255.0
!
SITE2(config)#int fa0/0
SITE2(config-if)#ip add 10.10.10.2 255.255.255.0

Let's try a simple ICMP ping.

SITE1#ping 10.10.10.2

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.10.10.2, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 72/131/200 ms
SITE1#sh arp
Protocol Address Age (min) Hardware Addr Type Interface
Internet 10.10.10.1 - c20e.1164.0000 ARPA FastEthernet0/0
Internet 10.10.10.2 27 c20d.1aac.0000 ARPA FastEthernet0/0

It succeeds. Well, that's good :). Let's see if we can use other layer 2 protocols.

SITE1#sh cdp neigh
Capability Codes: R - Router, T - Trans Bridge, B - Source Route Bridge
S - Switch, H - Host, I - IGMP, r - Repeater

Device ID Local Intrfce Holdtme Capability Platform Port ID
CORE1 Fas 0/0 127 R S I 3725 Fas 0/1
SITE2 Fas 0/0 142 R S I 3725 Fas 0/0

OK, that's enough for now! We can see that it works. But we will test something else now: a ping with a size of 1500 bytes (including the IP header) between the two sites with the DF bit set.

SITE1#ping 10.10.10.2 size 1500 df

Type escape sequence to abort.
Sending 5, 1500-byte ICMP Echos to 10.10.10.2, timeout is 2 seconds:
Packet sent with the DF bit set
.....
Success rate is 0 percent (0/5)

FAIL! That's not what we want to see. We want to use an MTU of 1500 bytes but we can't. The reason for that is the additional headers we have used to build the PEM. Let's do some Wireshark action here. We will sniff the ICMP packets between CORE1 and CORE2 to see what's happening. First we capture some packets with the standard ICMP size and not with 1500 and DF.

Well, there is the problem: we have a little more traffic to handle with the PEM.

1x MPLS label for transporting the packet through the MPLS network to the PEM peer (4 bytes)
1x MPLS label that is used at the peer for identifying the correct VC (4 bytes)
1x additional Ethernet header (18 bytes). Our source frame from SITE1 is 1518 bytes long. As the COMPLETE frame is encapsulated behind the two MPLS labels and we are using Ethernet as the transport technology here, there needs to be a new L2 header for transporting the frame from CORE1 to CORE2 and so on.

So the initial frame that SITE1 wants to send to SITE2 has a payload/required MTU of 1500 bytes.

And once the packet has been encapsulated by PEM and goes on its way into the backbone, the new payload/required MTU is 1526 bytes.
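The arithmetic behind the 1526 can be written down as a tiny Python sketch. It uses the byte counts from the list above (two 4-byte labels plus 18 bytes of Ethernet header); the function names are made up for this example.

```python
MPLS_LABEL_BYTES = 4
ETHERNET_HEADER_BYTES = 18  # figure used in the list above


def required_core_mtu(customer_mtu: int, labels: int = 2,
                      l2_header: int = ETHERNET_HEADER_BYTES) -> int:
    """MTU the backbone links need so that customer_mtu fits through the pseudowire."""
    return customer_mtu + labels * MPLS_LABEL_BYTES + l2_header


def fits(customer_mtu: int, core_mtu: int) -> bool:
    """True if a customer packet of this size survives with DF set."""
    return required_core_mtu(customer_mtu) <= core_mtu


def max_customer_mtu(core_mtu: int, labels: int = 2,
                     l2_header: int = ETHERNET_HEADER_BYTES) -> int:
    """Largest customer packet that fits without touching the core MTU."""
    return core_mtu - labels * MPLS_LABEL_BYTES - l2_header


print(required_core_mtu(1500))  # 1500 + 2*4 + 18 = 1526
print(fits(1500, 1500))         # False: the failed ping above
print(max_customer_mtu(1500))   # 1474
```

As a side note, other descriptions of Ethernet over MPLS split those 18 bytes differently: 14 bytes for the carried Ethernet header (the FCS is not transported) plus an optional 4-byte control word. Whether the control word is in use in this lab is not visible here, but either way 1526 is enough.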

Let's configure the new required MTU.

NOTE: If you change the MTU and it then differs between the two ends of a backbone link, you may not get an OSPF or IS-IS adjacency on that link, as the MTU plays a part in forming those relationships. So change it on both ends of every link.

CORE1(config)#int fa0/0
CORE1(config-if)#mtu 1526
CORE1(config-if)#
!
CORE2(config-if)#int fa0/0
CORE2(config-if)#mtu 1526
CORE2(config-if)#int fa0/1
CORE2(config-if)#mtu 1526
!
CORE3(config)#int fa0/0
CORE3(config-if)#mtu 1526

Give IS-IS or OSPF some time to reconverge if needed, and then you can go on with the ICMP testing.

SITE1#ping 10.10.10.2 rep 5 size 1500 df

Type escape sequence to abort.
Sending 5, 1500-byte ICMP Echos to 10.10.10.2, timeout is 2 seconds:
Packet sent with the DF bit set
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 100/160/204 ms

It works!!!

Read the next part coming up, where we will do the same with the exception that the provider network (CORE2) is not capable of MPLS :).

Have fun and feel free to comment!

Regards!
Markus

About markus.wirth

Living near Limburg in Germany, working as a Network Engineer around Frankfurt am Main.

2 Responses to mpls: pseudowire emulation / any transport over mpls – interconnecting sites via layer 2 – (part 1)

  1. Tee says:

    Why don’t you use “mpls mtu 1526” instead of “mtu 1526” on the interface ?
    Thank you for great web.

  2. Tee says:

    OK I see. It’s L2
