Warning of an ARP Entry Dually-Transmitting Failure

Keywords: ATN 950B, ARP entry dually-transmitting failure, mixed VPN solution

Summary: On an ATN 950B NE functioning as a CSG and running V200R002C00SPC300 or an earlier version, a timing sequence error occurs when the software performs batch backup. After several master/slave MPU switchovers, the ARP entry dually-transmitting function may fail. In the mixed VPN solution, when network-to-user traffic transmitted along an MPLS LDP LSP reaches the slave ASG, this failure causes a service interruption.

[Problem Description]

Usage scenario:

 

In a mixed VPN solution, LDP LSPs are established on the access ring network. ATN 950B NEs functioning as CSGs use the L2VPN function to transmit services, and the ARP entry dually-transmitting function is enabled using the mpls l2vpn arp-dual-sending command, as shown in the following configuration.

interface Ethernet0/3/0.1
 mtu 2000
 vlan-type dot1q 1
 mpls l2vc 100.0.0.3 100 control-word raw  //Configure the primary PW destined for ASG3.
 mpls l2vc 100.0.0.4 101 control-word raw secondary  //Configure the secondary PW destined for ASG4.
 mpls l2vpn redundancy master  //Configure PW redundancy protection to work in master/slave mode.
 mpls l2vpn reroute delay 300  //Set the delay for a PW switchback to 300s.
 mpls l2vpn stream-dual-receiving  //Configure the dually-receiving function for the primary and secondary PWs to prevent traffic loss during a traffic switchback.
 mpls l2vpn arp-dual-sending  //Configure the ARP entry dually-transmitting function on both PWs to minimize traffic loss if a fault occurs on the primary PW.
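To judge whether an interface falls within this usage scenario, the configuration above can be checked mechanically. The following Python sketch is illustrative only: the function name uses_arp_dual_sending and the idea of feeding it saved configuration text are assumptions, not part of any Huawei tool. It reports whether an interface stanza contains both a secondary PW and the mpls l2vpn arp-dual-sending command.

```python
def uses_arp_dual_sending(stanza: str) -> bool:
    """Return True if the stanza has a secondary PW and ARP dual sending.

    Hypothetical helper: it only scans text and ignores anything after '//'.
    """
    has_secondary = False
    has_dual_sending = False
    for raw in stanza.splitlines():
        # Drop the trailing '//' comment, then split the command into words.
        words = raw.split("//", 1)[0].split()
        if words[:2] == ["mpls", "l2vc"] and "secondary" in words:
            has_secondary = True
        if words == ["mpls", "l2vpn", "arp-dual-sending"]:
            has_dual_sending = True
    return has_secondary and has_dual_sending


SAMPLE_STANZA = """\
interface Ethernet0/3/0.1
 mpls l2vc 100.0.0.3 100 control-word raw
 mpls l2vc 100.0.0.4 101 control-word raw secondary//secondary PW
 mpls l2vpn redundancy master
 mpls l2vpn arp-dual-sending //dual sending
"""

if __name__ == "__main__":
    print(uses_arp_dual_sending(SAMPLE_STANZA))  # True
```

A stanza that lacks either the secondary PW or the arp-dual-sending command returns False, which suggests the interface does not use the function described in this warning.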

Trigger conditions:

The problem is highly likely to occur if the following conditions are met. In laboratory tests, the probability is approximately 70%.

1. A master/slave MPU switchover is performed twice, either by reseating an MPU or by running the slave switchover command. Alternatively, the slave MPU is reseated or a command is run to reset the slave MPU.

2. Another master/slave MPU switchover is performed.

Note that there is no limit on the interval between master/slave MPU switchovers. The problem occurs only if the ATN 950B NE is not reset between the switchovers.

Symptom:

The slave ASG fails to learn the ARP entry mapped to the NodeB or eNodeB. As a result, network-to-user traffic on the slave ASG is interrupted.

Identification method:

1. Check the tunnel token value of an L2VPN tunnel on the ATN 950B NE.

Run the display mpls l2vc interface interface-type interface-number command in the user view.

<HUAWEI>display mpls l2vc interface Ethernet0/3/0.1 

//In real-world situations, change Ethernet0/3/0.1 to the name of the actual interface that transmits L2VPN services.
*client interface       : Ethernet0/3/0.1 is up
Administrator PW       : no
session state          : up
AC status              : up
VC state               : up
Label state            : 0
Token state            : 0
VC ID                  : 1
VC type                : Ethernet
destination            : 172.1.1.178
local VC label         : 26           remote VC label      : 26
local AC OAM State     : up
local PSN OAM State    : up
link state             : up
local VC MTU           : 1500         remote VC MTU        : 1500
local VCCV             : cw alert ttl lsp-ping bfd
remote VCCV            : cw alert ttl lsp-ping bfd
local control word     : enable       remote control word  : enable
tunnel policy name     : --
PW template name       : --
primary or secondary   : primary
load balance type      : flow
Access-port            : false
Switchover Flag        : false
VC tunnel/token info   : 1 tunnels/tokens
NO.0  TNL type       : lsp   , TNL ID : 0x5  //0x5 is the token value used by the L2VPN tunnel. The token value is a hexadecimal number.
    Backup TNL type      : lsp   , TNL ID : 0x9
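
Because the TNL ID is displayed in hexadecimal while the diagnostic command takes a decimal token, the value may need converting before it is used. The following Python sketch shows one way to do that; the function name tunnel_tokens is hypothetical, and the regular expression assumes the "TNL ID : 0x..." layout shown in the output above.

```python
import re
from typing import List


def tunnel_tokens(output: str) -> List[int]:
    """Return every 'TNL ID : 0x..' value found in the output as an integer."""
    matches = re.findall(r"TNL ID\s*:\s*(0x[0-9a-fA-F]+)", output)
    # int(..., 16) accepts the '0x' prefix and yields the decimal value.
    return [int(value, 16) for value in matches]


SAMPLE_L2VC = """\
NO.0  TNL type       : lsp   , TNL ID : 0x5
    Backup TNL type      : lsp   , TNL ID : 0x9
"""

if __name__ == "__main__":
    print(tunnel_tokens(SAMPLE_L2VC))  # [5, 9]
```

For single-digit values such as 0x5 the hexadecimal and decimal forms coincide, but a token such as 0x1a would have to be entered as 26.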

 

2. Check the destination MAC address carried in dually transmitted ARP packets.

Run the display mplsada lsp token 5 command in the diagnostic view, where 5 is the decimal form of the hexadecimal token value (0x5) obtained in Step 1. When the problem occurs, the destination MAC address carried in dually transmitted ARP packets is all 0s.

<HUAWEI>system-view

[HUAWEI]diagnose

[HUAWEI-diagnose]display mplsada lsp token 5  //5 is the tunnel token value obtained in Step 1, entered here as a decimal number.

ulPdtHandle                     : 434410400
Protect Type          :3
Lsp or gre index      :5
Protect groupid       :1
ApsId                 :515
Primary pdthandle     :0x19e493a0
Backup pdthandle      :0x19df16d8

Nhi list begin
Nhi list end.  List count  :2

Main product info(TunnelId 5):
Ovid                  :0
MacIndex              :7
OutIfIndex            :337
TrunkorTp             :14
Trunk flag            :0
Nexthop               :0xc8010102
TunnelId              :5
Mac fake flag         :0
Exp Mode              :0
Exp value             :0
TnlType               :0
DestMac               :00:00:00:00:00:00 //The DestMac field indicates the destination MAC address for dually transmitted ARP packets. When the problem occurs, the destination MAC address is all 0s.

SrcMac                :f4:a1:a2:cf:33:01
Egress Intf index     :138
Mpls Tunnel index     :8
LspId or GreId        :4294967295
FqId                  :0
Dsid                  :0
Out stattisticId      :4294967295
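
The DestMac check can also be applied to captured command output. The following Python sketch is a minimal example under stated assumptions: the function names dest_mac and is_affected are hypothetical, and the pattern assumes the "DestMac : xx:xx:xx:xx:xx:xx" layout shown above.

```python
import re
from typing import Optional

ZERO_MAC = "00:00:00:00:00:00"


def dest_mac(output: str) -> Optional[str]:
    """Return the DestMac value from the output, or None if it is absent."""
    pattern = r"^\s*DestMac\s*:\s*([0-9A-Fa-f]{2}(?::[0-9A-Fa-f]{2}){5})"
    match = re.search(pattern, output, re.MULTILINE)
    return match.group(1).lower() if match else None


def is_affected(output: str) -> bool:
    """True when the destination MAC address is all 0s (the symptom)."""
    return dest_mac(output) == ZERO_MAC


SAMPLE_MPLSADA = """\
TnlType               :0
DestMac               :00:00:00:00:00:00
SrcMac                :f4:a1:a2:cf:33:01
"""

if __name__ == "__main__":
    print(is_affected(SAMPLE_MPLSADA))  # True
```

If the DestMac field is missing from the captured text, dest_mac returns None and is_affected returns False, so a missing field is not reported as the symptom.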

 

[Root Cause]

1. When an ATN 950B is running properly, its master MPU is in slot 7 and its slave MPU is in slot 8. After a master/slave MPU switchover is performed by reseating an MPU or running a command, the MPU in slot 7 becomes the slave and the MPU in slot 8 becomes the master. The MPU in slot 7 then synchronizes data in a batch from the MPU in slot 8. Due to a timing sequence error, LDP data is backed up earlier than ARP data. As a result, the destination MAC address on the MPU in slot 7 is all 0s. Although this address is incorrect, the MPU in slot 7 does not forward services at this point, so services are not affected.

2. After another master/slave MPU switchover is performed, the MPU in slot 7 becomes the master again. This MPU forwards dually transmitted ARP packets whose destination MAC address is all 0s. The next-hop device treats these packets as invalid and discards them. As a result, the slave ASG fails to learn the ARP entry.
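
The ordering dependency described above can be illustrated with a toy model. The Python sketch below is not Huawei's implementation: the class and function names, the sample MAC address, and the idea that an LSP entry resolves its destination MAC address from the ARP table once, at the moment it is restored, are simplifying assumptions made for illustration. It shows that restoring LDP data before ARP data leaves an all-0s MAC address, whereas the reverse order does not.

```python
ZERO_MAC = "00:00:00:00:00:00"


class SlaveMpu:
    """Toy model of a slave MPU receiving batch backup data (assumed behavior)."""

    def __init__(self):
        self.arp = {}           # next-hop IP -> MAC address
        self.lsp_dest_mac = {}  # tunnel token -> destination MAC address

    def restore_arp(self, entries):
        self.arp.update(entries)

    def restore_ldp(self, lsps):
        # Assumption: the MAC is resolved once, when the LSP entry is restored.
        for token, next_hop in lsps.items():
            self.lsp_dest_mac[token] = self.arp.get(next_hop, ZERO_MAC)


def batch_backup(order):
    """Replay the backup steps in the given order; return the MAC for token 5."""
    mpu = SlaveMpu()
    arp_entries = {"200.1.1.2": "f4:a1:a2:cf:33:02"}  # hypothetical next-hop MAC
    lsps = {5: "200.1.1.2"}  # token 5, next hop 0xc8010102
    for step in order:
        if step == "arp":
            mpu.restore_arp(arp_entries)
        elif step == "ldp":
            mpu.restore_ldp(lsps)
    return mpu.lsp_dest_mac[5]


if __name__ == "__main__":
    print(batch_backup(["ldp", "arp"]))  # 00:00:00:00:00:00
    print(batch_backup(["arp", "ldp"]))  # f4:a1:a2:cf:33:02
```

In this model the stale all-0s value persists until the LSP entry is rebuilt.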

[Impact and Risk]

When network-to-user traffic arrives at the slave ASG, traffic is interrupted.

[Measures and Solutions]

Recovery measures:

Note that the recovery measure interrupts services on the ingress, transit, and egress nodes along existing tunnels for several seconds.

Run the reset mpls ldp all command in the user view to reset MPLS LDP.

<HUAWEI> reset mpls ldp all

 

Workarounds:

Solutions:

Perform either of the following steps to resolve the problem:

- Install the patch V200R002SPH006 or later on the ATN 950B NEs running V200R002C00SPC300.

- Upgrade a version earlier than V200R002C00SPC300 to V200R002C00SPC300 and then install the patch V200R002SPH006 or later.
