OSN9500

How to Solve the OptiX OSN 9500 (Cross-Connect Board) Service Interruption Problem

Abstract
When the SCC board delivers synchronous configuration interruption events twice in quick succession, the working and protection pages of the cross-connect chip on the cross-connect board may become inconsistent in certain cases. As a result, services are interrupted, and the downstream equipment reports the HP_UNEQ, AU_AIS, or AU_LOP alarm.

[Problem Description]
Trigger Condition:
The problem may occur during service configuration or protection switching.
In networks that use higher order revertive SNCP schemes, the problem may occur more frequently if multiple SNCP schemes recover simultaneously.
Problem Phenomena:
After service configuration or protection switching is performed on a site, certain services unrelated to the configuration or switching are interrupted, and the downstream sites report the HP_UNEQ, AU_AIS, or AU_LOP alarm.
Judge Method:
1. The version of the board is listed in the Related Version part.
2. At the upstream NE that first reports the HP_UNEQ alarm in the service path, run the following commands for each higher order cross-connect board:
:optp:bid,0,1b,1,1,70,1,0,0,54,65,73,74,0
:optp:bid,0,41,1,ff,28,0
:optp:bid,0,41,1,ff,28,1
:optp:bid,0,1b,1,1,70,0,0,0,54,65,73,74,0

Here, bid is the slot ID of the cross-connect board, written in hexadecimal (for example 0x29, 0x2a, 0x2b, or 0x2c, entered without the 0x prefix).
If the output of the second or third command is 01, as in the following example, the problem has occurred.
For example:

:optp:29,0,41,1,ff,28,0
Optp cmd : ff28
01
Total records :1
#9-144:szhw [][][2008-12-09 14:03:17]>

The preceding information indicates that the cross-connect board in slot 41 (bid 0x29) has the problem.
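
The check above can be scripted once the command output has been captured. The following is a minimal sketch in Python, not part of the patrol tool: the function names are hypothetical, and the parser assumes the result value always appears on a line of its own, as in the single sample shown above.

```python
def build_check_commands(bid: int) -> list[str]:
    """Return the four Judge Method commands for one cross-connect board."""
    slot = format(bid, "x")  # bid is entered in hexadecimal without 0x, e.g. 29
    return [
        f":optp:{slot},0,1b,1,1,70,1,0,0,54,65,73,74,0",
        f":optp:{slot},0,41,1,ff,28,0",
        f":optp:{slot},0,41,1,ff,28,1",
        f":optp:{slot},0,1b,1,1,70,0,0,0,54,65,73,74,0",
    ]


def page_check_failed(output: str) -> bool:
    """Return True if the response to command 2 or 3 has a result line of 01.

    Assumption: the result is printed on its own line, as in the sample output.
    """
    return any(line.strip() == "01" for line in output.splitlines())
```

For slot 41, build_check_commands(0x29) yields the commands with bid replaced by 29; feed the captured responses of the second and third commands to page_check_failed.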

[Cause Analysis]
Interruption events on the cross-connect board are not handled by priority, so the processing of a synchronous configuration interruption event can be delayed. If the cross-connect chip receives the next synchronous configuration interruption event while the previous one is still pending, it still switches the working and protection pages. When the chip has switched twice, the software of the cross-connect board writes the same page again when it synchronizes the protection page, which leaves the working page inconsistent with the protection page. The software of the cross-connect board does not check the consistency of the working and protection physical pages, so it cannot detect the latent problem or perform self-healing.
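
To illustrate the kind of consistency check that is missing, the following Python sketch compares two pages and lists the entries that differ. It is only a model under stated assumptions: each page is represented as a hypothetical mapping from an output index to an input index, which is not the real layout of the cross-connect chip, and page_mismatches is an invented name.

```python
def page_mismatches(working: dict[int, int], protection: dict[int, int]) -> list[int]:
    """Return the sorted output indices whose entries differ between the pages."""
    keys = set(working) | set(protection)
    # An index present in only one page also counts as a mismatch.
    return sorted(k for k in keys if working.get(k) != protection.get(k))


# Hypothetical page contents: entry 2 was written to one page only.
working = {1: 10, 2: 20, 3: 30}
protection = {1: 10, 2: 21, 3: 30}
print(page_mismatches(working, protection))  # [2]
```

A non-empty result corresponds to the inconsistent state described above, which the board software would need to detect before it could self-heal.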

[Impact and Risk]
During service configuration or protection switching, certain higher order services may be interrupted. In networks that use higher order revertive SNCP schemes, the problem may occur more frequently.

[Measures and Solutions]
Recover Measure:
If the protection cross-connect board is faulty, perform a warm reset on this board. If the working cross-connect board is faulty, switch over the cross-connect boards, and then perform a warm reset on the protection cross-connect board (that is, the original working cross-connect board). If both the working and protection cross-connect boards are faulty, handle the problem on the protection cross-connect board first, and then handle the problem on the working cross-connect board.
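
The order of operations in Recover Measure can be expressed as a small decision function. This is a sketch in Python with a hypothetical function name; it only returns the steps as text and does not talk to the equipment.

```python
def recovery_steps(working_faulty: bool, protection_faulty: bool) -> list[str]:
    """Return the recovery actions in the order given in Recover Measure."""
    steps = []
    # A faulty protection board is always handled first.
    if protection_faulty:
        steps.append("warm reset the protection cross-connect board")
    # A faulty working board is switched out of service before it is reset.
    if working_faulty:
        steps.append("switch over the cross-connect boards")
        steps.append("warm reset the original working board (now the protection board)")
    return steps
```

When both boards are faulty, the function returns three steps: the reset of the protection board, the switchover, and the reset of the original working board.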
Avoidance Measure:
1. Change the higher order revertive SNCP networking mode to the non-revertive SNCP networking mode. This greatly reduces the probability of the problem.
2. Regularly check the OptiX OSN 9500 by running the commands described in Judge Method (they will be integrated into the patrol tool later). If a problem is found, recover the faulty board as described in Recover Measure before a service interruption occurs.
Solution:
Upgrade the software to V100R004C01SPC100 (5.15.04.19P01), to V100R006C02B018 (5.15.06.23), or to a version later than V100R006C02B018 (5.15.06.23).
For NEs running version 5.15.3.24, 5.15.04.13, 5.15.04.17, or 5.15.04.18, upgrading the SSJ3EXCH to version 1.73 also solves the problem, but upgrading the whole NE is recommended.
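
The two upgrade paths can be looked up per NE as follows. This is a sketch in Python with hypothetical names; version strings are compared verbatim against the forms listed above, so a differently padded string such as 5.15.03.24 would not match.

```python
# NE versions for which the bulletin allows a board-only upgrade.
BOARD_ONLY_VERSIONS = {"5.15.3.24", "5.15.04.13", "5.15.04.17", "5.15.04.18"}


def upgrade_options(ne_version: str) -> list[str]:
    """Return the upgrade options for an NE, with the recommended one first."""
    options = [
        "upgrade the NE to V100R004C01SPC100 (5.15.04.19P01), "
        "or to V100R006C02B018 (5.15.06.23) or later"
    ]
    if ne_version.strip() in BOARD_ONLY_VERSIONS:
        options.append("upgrade the SSJ3EXCH to version 1.73")
    return options
```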

 
