How to Troubleshoot L2TP User Login Failures on NE40E?

Issue Description

L2TP users failed to go online through an NE40E device functioning as a LAC. The LNS was a third-party device.

Alarm Information

None

Handling Process

Step 1 The "display aaa online-fail-record statistics" and "display aaa online-fail-record" commands were used to check why users failed to log in. The output showed that "CM with l2tp session fail" was the top failure reason, which indicates that L2TP session setup failed during the user login process.

//Top 5 login failure causes in each 10-minute interval during the fault period (top failure cause: CM with l2tp session fail):

[~Huawei-diagnose]display aaa online-fail-record statistics
  -------------------------------------------------------------------
  StartTime---EndTime
    Offline-Reason-ID Description :Count
  -------------------------------------------------------------------
 2020-08-21 15:40:24----------------------------2020-08-21 15:50:24
    54 CM with l2tp session fail :34597
    331 CM with PPP conn up time out :3448
    154 LNS cleared session :99
    597 Fail to process padr :6
    146 PPP negotiate fail :2
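
To compare failure causes across many 10-minute intervals, the statistics output can be parsed and ranked by share. The following is a minimal sketch in Python, assuming each record has the "<reason-id> <description> :<count>" layout shown above; the function names and the SAMPLE text are illustrative and are not part of any Huawei tool.

```python
import re

# Sample records copied from the statistics output above.
SAMPLE = """\
    54 CM with l2tp session fail :34597
    331 CM with PPP conn up time out :3448
    154 LNS cleared session :99
    597 Fail to process padr :6
    146 PPP negotiate fail :2
"""

# Assumed record layout: "<reason-id> <description> :<count>"
RECORD = re.compile(r"^\s*(\d+)\s+(.+?)\s*:(\d+)\s*$")


def parse_fail_stats(text):
    """Return (reason_id, description, count) tuples, largest count first."""
    rows = []
    for line in text.splitlines():
        m = RECORD.match(line)
        if m:  # header, separator, and timestamp lines do not match
            rows.append((int(m.group(1)), m.group(2), int(m.group(3))))
    return sorted(rows, key=lambda r: r[2], reverse=True)


def shares(rows):
    """Return (description, count, fraction of all failures) per record."""
    total = sum(count for _, _, count in rows) or 1  # avoid division by zero
    return [(desc, count, count / total) for _, desc, count in rows]


if __name__ == "__main__":
    for desc, count, share in shares(parse_fail_stats(SAMPLE)):
        print(f"{desc:<32} {count:>6} {share:6.1%}")
```

In this sample, "CM with l2tp session fail" accounts for roughly 90% of all recorded failures, which is why the analysis focuses on L2TP session setup.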

[~Huawei-diagnose] display aaa online-fail-record     
-------------------------------------------------------------------
  User name : [email protected]
  Domain name : huawei
  User MAC : fcxx-xxxx-xxxx
  User access type : PPPoE
  User interface : Eth-Trunk1.1
  User access PeVlan/CeVlan : 2/10
  User IP address : -
  User ID : 123456
  User authen state : Authened
  User acct state : AcctIdle
  User author state : AuthorIdle
  User login time : 2020-08-21 15:33:16
  Online fail reason : CM with l2tp session fail
  -------------------------------------------------------------------

 

The following figure shows the L2TP session setup process during user login.

Step 2 The "trace mac fcxx-xxxx-xxxx" command was used to trace the user login process. The trace showed that L2TP session setup failed mainly for two reasons: the LAC received no ICRP packets from the LNS, and the LAC could not send ICRQ packets because its send window overflowed.
1. The debugging information shows that the LAC sent ICRQ packets but received no ICRP packets from the LNS in response.
August  21 2020 15:39:25.608+08:00 Huawei %%01BRASL2TP/7/BRL2TP_DBG_LAC_PACKET(d):CID=0x83960502;LAC::OUT ICRQ packet, Ns 47431, Nr 60620, len:137
2. The LAC failed to send ICRQ packets due to a send window overflow.
August  21 2020 15:39:25.531+08:00 Huawei %%01BRASL2TP/7/BRL2TP_DBG_LAC_ERROR(d):CID=0x83960502;Send window overflow,fail to send ICRQ
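
When the trace is long, counting the two kinds of events can show how often each occurs. The sketch below is a hypothetical helper in Python that recognizes only the two line formats quoted above. The format of lines for packets received from the LNS is not shown in this case, so it is deliberately not handled.

```python
import re
from collections import Counter

# The two debug lines quoted above.
SAMPLE = [
    "August  21 2020 15:39:25.608+08:00 Huawei %%01BRASL2TP/7/BRL2TP_DBG_LAC_PACKET(d):"
    "CID=0x83960502;LAC::OUT ICRQ packet, Ns 47431, Nr 60620, len:137",
    "August  21 2020 15:39:25.531+08:00 Huawei %%01BRASL2TP/7/BRL2TP_DBG_LAC_ERROR(d):"
    "CID=0x83960502;Send window overflow,fail to send ICRQ",
]

# Patterns derived only from the two sample lines; other formats are ignored.
OUT_PACKET = re.compile(r"LAC::OUT (\w+) packet, Ns (\d+), Nr (\d+)")
OVERFLOW = re.compile(r"Send window overflow,\s*fail to send (\w+)")


def classify(lines):
    """Count sent packets and send-window overflows per message type."""
    counts = Counter()
    for line in lines:
        m = OUT_PACKET.search(line)
        if m:
            counts[f"sent {m.group(1)}"] += 1
            continue
        m = OVERFLOW.search(line)
        if m:
            counts[f"overflow {m.group(1)}"] += 1
    return counts


if __name__ == "__main__":
    for event, n in classify(SAMPLE).items():
        print(event, n)
```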

Step 3 Why did the send window overflow? The "display l2tp item tunnel-item 100" command was used to check the tunnel information, which showed that the negotiated L2TP send window size was 1. With a window size of 1, the L2TP order-preserving mechanism prevents the system from sending the next packet until the response to the previous packet is received.
L2TP requires that packets be sent in order. The LNS and LAC negotiate the send window size based on their capabilities. The window size indicates the number of packets that the peer can receive at a time. A packet is saved in the send window after being sent and stays there until its response is received; no new packet can be sent while the window is full. The LAC (NE40E) negotiated a send window size of 1 with the LNS, which greatly reduced the packet sending rate. As a result, the send buffer filled up and some ICRQ packets failed to be sent.
The receive window size that the NE40E advertises to the LNS is 1024 and is not configurable.
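
The effect of the window size on the sending rate can be illustrated with a simplified model. The Python sketch below is not the NE40E implementation: it assumes time advances in whole round trips, that up to `window` queued messages are sent and acknowledged per round trip, and that messages arriving at a full queue are dropped. All parameter values are hypothetical.

```python
def simulate(window, arrivals_per_rtt, queue_limit, rounds):
    """Simplified send-window model.

    Each round represents one round trip. New messages arrive, the queue
    accepts them up to queue_limit, and the sender then transmits at most
    `window` messages, all assumed to be acknowledged within the round.
    """
    queued = sent = dropped = 0
    for _ in range(rounds):
        # New control messages (for example, ICRQs for new users) arrive.
        accepted = min(arrivals_per_rtt, queue_limit - queued)
        dropped += arrivals_per_rtt - accepted
        queued += accepted
        # Transmit as many queued messages as the window allows.
        batch = min(window, queued)
        queued -= batch
        sent += batch
    return {"sent": sent, "dropped": dropped, "queued": queued}


if __name__ == "__main__":
    for w in (1, 8):
        print(w, simulate(window=w, arrivals_per_rtt=4, queue_limit=16, rounds=100))
```

With these example values, a window of 1 sends only 100 of the 400 arriving messages and drops most of the rest once the queue fills, whereas a window of 8 keeps up with the arrival rate and drops nothing.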

[~Huawei-diagnose]display l2tp item tunnel-item 100
  ---------------------------------------------------------
    LastSendNr       :1234        AckTimeOut       :1000
    SessionSum       :2345        SendWinUpperward :34567
    SendWinLowerward :34568       RecvWinUpperward :4567
    RecvWinLowerward :1234        DelayAckTimeOut  :50
    HelloInterval    :60          Slot             :0
    RecvWinSize      :128         SendWinSize      :1
    LocalUdpPort     :1701        LocalTunnelID    :501
    RemoteTunnelID   :15123       ReFirmWaVersion  :4400
    ReProVersion     :256         RetryTimes       :0
    ClearTimes       :0           LocalConnection  :0
    Authentication   :1           HelloTimer       :123456732
    DelayClearTimer  :0           AckTimer         :123456747
    DelayAckTimer    :0           UserSecret       :******
    IdleCutTimer     :0
…………………………………………

According to RFC 2661, the send window size is determined by the receive window size advertised by the LNS.

Step 4 The login failure was analyzed for a send window size of 1. If the send queue was long, then after receiving an ICRP packet from the LNS, the LAC might be unable to respond with an ICCN packet before the LNS's timer for waiting for the ICCN packet expired. As a result, user login failed.
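
A rough back-of-the-envelope check of this effect is sketched below in Python. It assumes the queue drains at `window` messages per round trip and compares the resulting wait with the LNS timeout. The queue depth, round-trip time, and timeout are hypothetical values, not measurements from this case.

```python
def iccn_wait_seconds(queue_depth, window, rtt_seconds):
    """Estimate how long a newly queued ICCN waits before transmission.

    Assumes the messages ahead of it drain at `window` messages per round trip.
    """
    rounds_ahead = queue_depth // window  # full round trips spent on earlier messages
    return rounds_ahead * rtt_seconds


def iccn_in_time(queue_depth, window, rtt_seconds, lns_timeout_seconds):
    """Return True if the ICCN can be sent before the LNS stops waiting."""
    return iccn_wait_seconds(queue_depth, window, rtt_seconds) <= lns_timeout_seconds


if __name__ == "__main__":
    # Hypothetical values: 500 queued messages, 20 ms round trip, 5 s LNS timeout.
    for w in (1, 16):
        wait = iccn_wait_seconds(500, w, 0.02)
        print(f"window={w}: wait about {wait:.2f} s, in time: {iccn_in_time(500, w, 0.02, 5.0)}")
```

Under these assumed values, a window of 1 delays the ICCN by about 10 seconds and misses the timeout, while a window of 16 delays it by well under a second.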

Root Cause

L2TP user login failed because L2TP sessions could not be set up between the LAC and LNS. The main reasons are as follows:
1. The LNS failed to send an ICRP packet in response to a received ICRQ packet.
2. The negotiated L2TP send window size was too small, which made packet sending inefficient. As a result, L2TP packets piled up on the LAC and became out of order.

Solution

1. Run the tunnel retrans-queue 1 command in the L2TP group view to improve user login efficiency.
2. Check why the LNS failed to respond to ICRQ packets with ICRP packets.

Suggestions

This is an interconnection issue between a Huawei router and a third-party vendor's router: the interconnection parameters were inconsistent after automatic negotiation. When troubleshooting interconnection issues, focus on parameter negotiation.
