Monthly Archives: May 2016

Software Version Compatibility Issue on the TNF2LQM and TNF2LDGF2 Boards in OptiX OSN 1800

Keywords: WDM products, OptiX OSN 1800

Summary:

For OptiX OSN 1800 V100R003C01SPC200 (5.67.03.23) or an earlier version, services on the

TNF2LQM or TNF2LDGF2 board that uses the service chip of version V110 will be unavailable

after the board is initialized because of a bug. The problem can be resolved by upgrading the

OptiX OSN 1800 to V100R003C01SPC300 (5.67.03.25) or a later version.

[Problem Description]

Trigger conditions:

The problem occurs when the following conditions are present:

The TNF2LQM or TNF2LDGF2 board uses the service chip of version V110.

The NE software version is V100R003C01SPC200 or earlier.
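The version condition above can be checked mechanically. The following is a minimal sketch, assuming that NE version strings follow the VxxxRxxxCxxSPCxxx pattern seen in this notice and that comparing the numeric fields in order reflects the release order. The function names are hypothetical, and hot patch strings (SPH) are not handled.

```python
import re

# Assumed layout of an NE version string such as "V100R003C01SPC200".
_VERSION_RE = re.compile(r"^V(\d+)R(\d+)C(\d+)(?:SPC(\d+))?$")


def parse_ne_version(text):
    """Return (V, R, C, SPC) as integers; a missing SPC field counts as 0."""
    match = _VERSION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    v, r, c, spc = match.groups()
    return int(v), int(r), int(c), int(spc or 0)


# First version in which the notice says the problem is resolved.
FIXED_IN = parse_ne_version("V100R003C01SPC300")


def is_affected_version(text):
    """True if the version sorts before the first fixed version (assumed ordering)."""
    return parse_ne_version(text) < FIXED_IN
```

A version check alone is not sufficient: the board must also use the V110 service chip, which is identified as described in the identification method.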

Symptom:

Services are unavailable after the board is powered on and configured with services.

Identification method:

 

Run the following command to query the service chip version on the board:

:woptp:$bid,1,83,1,ff,23,00,04,10,00,2         # “bid” indicates the slot ID of the TNF2LQM or TNF2LDGF2 board in hexadecimal format.

The following information may be returned:

Optp cmd : ff23

00 04 10 00 02 87 51 00 11      # “00 11” indicates that the service chip version is V110.

If “00 11” is displayed in the returned information, the board is affected by this warning.
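When many boards must be checked, the returned data line can be inspected with a small script. The sketch below assumes that the chip version is carried in the last two bytes of the data line, which is inferred only from the single example above. The function names are hypothetical and are not part of any vendor tool.

```python
def service_chip_version_bytes(response_line):
    """Return the last two byte tokens of the returned data line, lowercased."""
    # Drop any trailing "#" annotation, then split the line into byte tokens.
    tokens = response_line.split("#", 1)[0].split()
    if len(tokens) < 2:
        raise ValueError("response line is too short")
    return tuple(token.lower() for token in tokens[-2:])


def is_v110_chip(response_line):
    """True if the line ends with the bytes 00 11 (assumed to mean chip version V110)."""
    return service_chip_version_bytes(response_line) == ("00", "11")
```

For the example response, `is_v110_chip("00 04 10 00 02 87 51 00 11")` evaluates to `True`.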

 

[Root Cause]

For OptiX OSN 1800 V100R003C01SPC200 and earlier versions, the TNF2LQM and

TNF2LDGF2 boards that use the service chip of version V110 have a bug in board initialization.

 

[Impact and Risk]

Affected scope: in and out of China

Risk: The services are unavailable.

 

[Measures and Solutions]

Recovery measures:

Upgrade the OptiX OSN 1800 to V100R003C01SPC300 or later and perform a cold reset on

the involved board.

Workarounds:

None

Preventive measures:

See “Recovery measures” as described above.

 


Abnormal Optical Power Reporting Caused by a Coupling Exception

Keywords: WDM products, OptiX OSN 6800, OptiX OSN 8800

Summary:

When fibers on boards that use the HXFP8240, HXFP8440, or HXFP8441 pluggable module

are being removed or inserted, the reported optical power at the receive end is sometimes lower

than the actual optical power.

 

[Problem Description]

Trigger conditions:

The problem occasionally occurs when customer fibers are removed and re-inserted.

Symptom:

• In the site deployment phase, customer-purchased fibers are used to connect optical modules. The reported receive optical power on some boards is more than 3 dB lower than the actual optical power measured with an optical power meter. Sometimes, an IN_POWER_LOW alarm is reported. The reported optical power may return to normal after the fibers are removed and re-inserted several times.

• When customer-purchased fibers are removed and re-inserted during normal system operation, the reported receive optical power on some boards is more than 3 dB lower than the actual optical power measured with an optical power meter. Sometimes, an IN_POWER_LOW alarm is reported. The reported optical power may return to normal after the fibers are removed and re-inserted several times.

Identification method:

The problem occurs when the following conditions are present:

• The problem occasionally occurs during site deployment or when a fiber is removed or inserted.

• In the queried board manufacturer information, the optical module type is HXFP8240, HXFP8440, or HXFP8441, as shown in Figure 1. Alternatively, the optical module type indicated on the label attached to the optical module is HXFP8240, HXFP8440, or HXFP8441, as shown in Figure 2 or Figure 3.

Figure 1 Optical module type in the queried board manufacturer information

Figure 2 Optical module label (HXFP8441)

Figure 3 Optical module label (HXFP8240)

• A customer-purchased fiber is used to connect the optical module. The reported receive optical power is more than 3 dB lower than the actual optical power. After a fixed optical attenuator (FOA) is connected to the fiber, the reported optical power is normal. For example, the measured actual optical power is –5.1 dBm. After a 2 dB FOA is connected to the fiber, the reported optical power is approximately –7 dBm.
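The arithmetic behind the FOA check can be sketched as follows. The 3 dB threshold comes from the symptom description. The 0.5 dB tolerance and the function names are assumptions made only for illustration.

```python
def expected_power_dbm(actual_dbm, foa_db=0.0):
    """Power expected at the receiver after a fixed attenuator of foa_db."""
    return actual_dbm - foa_db


def shows_symptom(reported_dbm, actual_dbm):
    """True if the reported power is more than 3 dB below the measured power."""
    return actual_dbm - reported_dbm > 3.0


def reading_is_consistent(reported_dbm, actual_dbm, foa_db=0.0, tolerance_db=0.5):
    """True if the reported power is within tolerance_db of the expected value.

    tolerance_db is an assumed measurement tolerance, not a value from the notice.
    """
    return abs(reported_dbm - expected_power_dbm(actual_dbm, foa_db)) <= tolerance_db
```

With the figures from the example, a measured power of –5.1 dBm and a 2 dB FOA give an expected value of about –7.1 dBm, so a reported value of approximately –7 dBm is consistent.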

 

[Root Cause]

According to the YD1272-1 standard, the LC ferrule elasticity and size B are two major indicators to

determine the quality of an LC connector. If either indicator does not satisfy requirements,

the problem may result.

The elasticity of the LC ferrule does not satisfy the requirement of 5–6 N and is much smaller than

the stress between the receive-end optical device and the ceramic sleeve. Consequently, a gap is

generated between the fiber and the end face of the optical device ferrule and excessive insertion

loss results. As a result, the reported optical power is much lower than the actual optical power.

 


 

The LC size B refers to the fiber area inserted into the optical device, which indicates the contact

area between the fiber and the optical device. The size B of the customer fiber does not satisfy

the requirement and causes excessive insertion loss. As a result, the reported optical power is

much lower than the actual optical power.

 


 

[Impact and Risk]

At the board receive end, the reported optical power is over 3 dB lower than the actual optical

power. Sometimes, an IN_POWER_LOW alarm is reported or the receive-end bit error rate

(BER) deteriorates.

[Measures and Solutions]

Recovery measures:

• Connect a 0 dB, 2 dB, or 3 dB FOA to the fiber before inserting the fiber into an optical module.

• Replace the fiber jumper.

Workarounds:

Preventive measures:

Replace the customer fibers with fibers that satisfy requirements. The fibers provided by Jonhon,

Acon, and Foxconn are recommended.

Occasional Failure of TN11HSC1 Boards to Go Online

Keywords: WDM, OptiX OSN 6800, OptiX OSN 8800 T16, OptiX OSN 8800 T32

Summary:

When a TN11HSC1 board is upgraded to V100R007C00SPC200, V100R007C00SPC300, or

V100R007C02SPC200, there is a low probability that it fails to go online in an

OptiX OSN 8800 T16 subrack, OptiX OSN 8800 T32/T64 slave subrack, or OptiX OSN 6800

slave subrack. When the fault occurs, the OSC performance is affected, possibly causing the NE

to be unreachable by the NMS.

[Problem Description]

Trigger conditions:

1. The TN11HSC1 board is used in an OptiX OSN 8800 T16 subrack, OptiX OSN 8800 T32/T64

slave subrack, or OptiX OSN 6800 slave subrack.

2. The NE software is V100R007C00SPC200, V100R007C00SPC300, or V100R007C02SPC200.

3. A warm or cold reset is performed on the board.

Symptom:

The TN11HSC1 board occasionally fails to go online in an OptiX OSN 8800 T16 subrack,

OptiX OSN 8800 T32/T64 slave subrack, or OptiX OSN 6800 slave subrack and this board is

unreachable by the NMS when the NE is upgraded to V100R007C00SPC200, V100R007C00SPC300,

or V100R007C02SPC200.

Identification method:

1. Check whether the TN11HSC1 board is used in an OptiX OSN 8800 T16 subrack,

OptiX OSN 8800 T32/T64 slave subrack, or OptiX OSN 6800 slave subrack.

2. Query the version information and check whether the NE version is V100R007C00SPC200,

V100R007C00SPC300, or V100R007C02SPC200 and whether the board version is 3.45.

 

[Root Cause]

The memory of the TN11HSC1 board is insufficient. When this board is used in one of the above-mentioned

subracks, it requests more memory space. Therefore, memory allocation during board

software initialization occasionally fails, causing a board initialization failure. As a result, the board

hangs in the BIOS state.

 

[Impact and Risk]

1. When an NE on the live network is upgraded to the risky version, if the subrack is managed through

the OSC on the TN11HSC1 board, the NE will be occasionally unreachable by the NMS and fail to be

upgraded to the target version.

2. When an NE on the live network is upgraded to the risky version, if the subrack is managed through

other channels, the NE will be occasionally unreachable by the NMS and fail to be upgraded to the target version.

3. If the NE on the live network is already of a risky version, the TN11HSC1 board will occasionally fail to

go online when it is inserted into an above-mentioned subrack and be unreachable by the NMS.

 

[Measures and Solutions]

Recovery measures:

1. Upgrade the NE to V100R007C00SPC300 and load the V100R007C00SPH301 hot patch.

2. Remove the TN11HSC1 board and insert it into an OptiX OSN 8800 T32/T64 or OptiX OSN 6800

master subrack. Then load the V100R007C00SPH301 hot patch to the subrack.

3. Insert the TN11HSC1 board that has been loaded with the hot patch back to the OptiX OSN 8800 T16

subrack, OptiX OSN 8800 T32/T64 slave subrack, or OptiX OSN 6800 slave subrack.

Workarounds:

None

Preventive measures:

Upgrade the NE to V100R007C02SPC300 or a later mainstream version.

 

Incorrect Configuration of the ZL80018 Chip on OptiX OSN 1800

Keywords: TNF1LDX, TNF2LSX, TNF2ELOM, OptiX OSN 1800

Summary:

On the TNF1LDX, TNF2LSX, and TNF2ELOM boards intended for OptiX OSN 1800 versions

earlier than V100R003C01SPC300, the ZL80018 chips in version H occasionally fail

to be configured. As a result, bit errors are generated on the WDM side of the peer end or services

are unavailable. This issue has been rectified in version V100R003C01SPC300.

 

[Problem Description]

Trigger condition:

This issue occasionally occurs when the TNF1LDX, TNF2LSX, or TNF2ELOM board intended

for OptiX OSN 1800 is powered off or (cold) reset, or the services on any of the boards are changed.

Symptom:

After the TNF1LDX (03021ETT), TNF2LSX (03021FTM), or TNF2ELOM (03030MSG) board used

during the network deployment or on the live network is powered off or (cold) reset, or the services

on any of the boards are changed, the services will be unavailable or bit errors will be generated.

The typical symptom is as follows: An ODU0_LOFLOM, ODU1_LOFLOM, ODU2_PM_BDI, or

BEFFEC_EXC alarm is reported on the WDM side of the peer board. An ODU0_PM_BDI,

ODUFLEX_PM_BDI, or ODU1_PM_BDI alarm is reported on the WDM side of the local board.

Identification method:

You can determine that this issue occurs when the following conditions are met:

1. After the TNF1LDX, TNF2LSX, or TNF2ELOM board used is powered off or (cold) reset, or the

services on any of the boards are changed, the services are unavailable or bit errors are generated.

To be specific, an ODU0_LOFLOM, ODU1_LOFLOM, ODU2_PM_BDI, or BEFFEC_EXC alarm

is reported on the WDM side of the peer board, and an ODU0_PM_BDI, ODUFLEX_PM_BDI,

or ODU1_PM_BDI alarm is reported on the WDM side of the local board.

2. The NE software version satisfies either of the following requirements:

The NE software version is earlier than V100R003C01SPC300 and is not upgraded after a fault occurs.

The NE software version is earlier than V100R003C01SPC300 when a fault occurs and then is

upgraded to V100R003C01SPC300 or later, but no cold reset is performed on the TNF1LDX,

TNF2LSX, or TNF2ELOM board.

3. The ZL80018 chip in version H is used and the boards satisfy the following requirements:

TNF1LDX: manufactured after December 14, 2012 and board barcode being numbered after

021ETT10CB000492

TNF2LSX: manufactured after December 14, 2012 and board barcode being numbered after

021FTM10CC000157

TNF2ELOM: manufactured after December 28, 2012 and board barcode being numbered after

030MSG10CC000745

 

[Root Cause]

The ZL80018 chip has a defect. It may fail to be configured, which has a higher probability for

ZL80018 chips in version H. The TNF1LDX, TNF2LSX, and TNF2ELOM boards intended for

OptiX OSN 1800 do not strictly comply with the latest errata code standards of the ZL80018 chip

and cannot completely prevent the defect. As a result, the ZL80018 chip in version H used on the

TNF1LDX, TNF2LSX, and TNF2ELOM boards intended for OptiX OSN 1800 earlier than

V100R003C01SPC300 occasionally fails to be configured. Consequently, the clock performance

is affected and the services on the boards are unavailable or bit errors are generated.

 

[Impact and Risk]

For OptiX OSN 1800 earlier than V100R003C01SPC300, when the TNF1LDX, TNF2LSX, or

TNF2ELOM board using the ZL80018 chip in version H is powered off, (cold) reset, or services

are changed, bit errors will be occasionally generated or services will be occasionally unavailable.

This issue, however, will not occur during the normal running of the boards.

 

[Measures and Solutions]

Recovery measures:

Perform a cold reset on the involved boards (sometimes cold resets may need to be performed several times).

Workarounds:

None

Preventive measures:

This issue has been rectified since OptiX OSN 1800 V100R003C01SPC300. Therefore, upgrade the NE

software version to OptiX OSN 1800 V100R003C01SPC300 or later. When this issue occurs, upgrade

the NE software and perform a cold reset on the TNF1LDX, TNF2LSX, or TNF2ELOM board.

Service Interruption on 100G and 40G Boards in WDM Products

 

Keywords: WDM products, Transport network product line

Summary:

When the SmartKit Inspector tool is used to inspect a 100G or 40G board, it issues a command to make the

board enter the equipment state if three consecutive command execution failures are returned. As a result,

services are interrupted.

[Problem Description]

Trigger condition:

The problem occurs when the following conditions are present:

• SmartKit Inspector V200R008C00 with a WDM script version earlier than 20140127221342093 is used to perform a board inspection.

• During the inspection, the board module log is uploaded (the log in the module flash is copied to the board), or the communication between the board and its module becomes abnormal (a MOD_COM_FAIL or PORT_MODULE_OFFLINE alarm is reported).

• The board under inspection is TN54NS4, TN54NS4(REG), TN15LSXL, or TN53TSXL.

Symptom:

Services are interrupted.

Identification method:

Same as the trigger conditions.

 

[Root Cause]

When the board module log is being uploaded (the log in the module flash is copied to the board), or the

communication between the board and its module becomes abnormal, the board cannot respond to the

inspection commands. If the SmartKit Inspector does not receive a command response three consecutive times,

it runs a command to make the board enter the equipment state.

After the board enters the equipment state, the board (TN54NS4, TN54NS4(REG), TN15LSXL, or

TN53TSXL) will automatically switch the service type or clock to ensure normal equipment test

functions, causing service interruption.

 

[Impact and Risk]

Services are interrupted.

 

[Measures and Solutions]

Recovery measures:

Perform a cold reset on a faulty board.

Workarounds:

None

Preventive measures:

If the SmartKit Inspector version is V200R008C00, upgrade the WDM script to 20140127221342093 or a

later version, or upgrade directly to SmartKit Inspector V200R009C00 or a later version. For details on

how to query the WDM script version, see the Attachment.

 

[Attachment]

Do as follows to query the tool script version:

In the sKeeper, choose Applications and click Local version for the Inspector. In the window that is

displayed, click the patch package for WDM products. The script version of the SmartKit Inspector tool

will be displayed in the Details area.

You need to upgrade the tool if the queried version is earlier than 20140127221342093.
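The comparison against the threshold can be sketched as below. This assumes that the script version is a fixed-length string of digits that sorts chronologically, as the 17-digit value above suggests. The function name is hypothetical.

```python
# Earliest WDM script version that the notice treats as safe.
MIN_SAFE_SCRIPT_VERSION = "20140127221342093"


def script_needs_upgrade(queried_version):
    """True if the queried script version is earlier than the safe threshold."""
    queried = queried_version.strip()
    # Assumption: valid versions are all digits and have the same length as the threshold.
    if not queried.isdigit() or len(queried) != len(MIN_SAFE_SCRIPT_VERSION):
        raise ValueError(f"unexpected script version format: {queried_version!r}")
    return int(queried) < int(MIN_SAFE_SCRIPT_VERSION)
```

The value to pass in is the one shown in the Details area for the WDM patch package.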

 


 

 

Incorrect Fan Rotation Speed Adjustment on OptiX OSN NG WDM Products

Keywords: WDM products, OptiX OSN NG WDM

Summary:

For the OptiX OSN NG WDM of a version earlier than V100R006C01SPC200, subrack temperature

detection is implemented by the SCC board in the larger-number slot. It is implemented by the SCC

board in the smaller-number slot only when the SCC board in the larger-number slot is

offline. Fan rotation speed control, however, is always implemented by the active SCC board

based on the subrack temperature that it obtains. When the active SCC board is the SCC board in the

smaller-number slot, the fan rotation speed does not match the subrack temperature.

 

[Problem Description]

Trigger condition:

When all the following conditions are met, the problem will occur:

1. The NE is an OptiX OSN NG WDM NE.

2. The NE version is earlier than V100R006C01SPC200.

Involved mainstream versions are as follows:

OptiX OSN 6800/3800 V100R004C04SPC800

OptiX OSN 8800 V100R002C02SPC800

OptiX OSN 8800/6800/3800 V100R005C00SPC700

OptiX OSN 8800/6800/3800 V100R005C00SPC800

OptiX OSN 8800/6800/3800 V100R005C00SPC900

OptiX OSN 8800/6800/3800 V100R006C00SPC100

 

3. Both the active and standby SCC boards are online. In addition, the active SCC board is in the

smaller-number slot while the standby SCC board is in the larger-number slot.

Figure 1 shows the slot layout of the OptiX OSN 3800 (DC powered).

Note: The board in the blue circle in the figure above indicates the SCC board in the

smaller-number slot, and that in the red circle indicates the SCC board in the larger-number slot.

Figure 2 shows the slot layout of the OptiX OSN 6800.

Figure 3 shows the slot layout of the OptiX OSN 8800 T32.

Figure 4 shows the slot layout of the OptiX OSN 8800 T64.

Symptom:

The fan rotation speed does not match the subrack temperature.

Identification method:

Both the active and standby SCC boards are online. In addition, the active SCC board

is in the smaller-number slot while the standby SCC board is in the larger-number slot.
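The slot condition can be expressed as a short check. This is a minimal sketch: the `SccBoard` type and the function name are hypothetical, and the check covers only the slot condition, not the NE version condition.

```python
from dataclasses import dataclass


@dataclass
class SccBoard:
    slot: int      # slot number of the SCC board
    online: bool   # whether the board is online
    active: bool   # True for the active board, False for the standby board


def fan_control_at_risk(board_a, board_b):
    """True if both SCC boards are online and the active one is in the smaller-number slot."""
    if not (board_a.online and board_b.online):
        return False
    smaller, larger = sorted((board_a, board_b), key=lambda board: board.slot)
    return smaller.active and not larger.active
```

For example, with illustrative slot numbers 17 and 18, an active board in slot 17 and a standby board in slot 18 return `True`, and the result becomes `False` after an active/standby switchover.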

 

[Root Cause]

For the OptiX OSN NG WDM of a version earlier than V100R006C01SPC200, subrack

temperature detection is implemented by the SCC board in the larger-number slot. It is

implemented by the SCC board in the smaller-number slot only when the SCC board in

the larger-number slot is offline. Fan rotation speed control, however, is

always implemented by the active SCC board based on the subrack temperature that it obtains.

When the active SCC board is the SCC board in the smaller-number slot, the fan rotation

speed does not match the subrack temperature.

 

[Impact and Risk]

1. If the subrack temperature is low but the fan rotation speed is high, the life span of the

FAN board will be shortened.

2. If the subrack temperature is high but the fan rotation speed is low, the FAN board

will be burnt.

 

[Measures and Solutions]

Recovery measures:

Perform an active/standby SCC switchover so that the SCC board in the larger-number

slot becomes the active SCC board.

Workarounds:

See “Recovery measures” as described previously.

Preventive measures:

Upgrade the NE version to V100R006C01SPC200 or a later mainstream version.

Interruption of Existing Services on Some Data Boards of OSN Products

Keywords: MSTP, SDH, OptiX OSN 9500

Abstract: For an SSN4EGS4/SSN5EFS0/SSN3EFS4/SSN3EGS2/SSN1EFS0A/SSN1EMS2

board of version 2.44 or earlier, an SSN2EGT2 board of version 2.19 or earlier, or an SSJ6EGT6A

board of any static version, binding timeslots on new VCTRUNKs with LCAS enabled may

result in the interruption of existing services after the board is warm reset.

[Problem Description]

Trigger conditions:

The problem occurs when all of the following conditions are met:

1. A board of a preceding version is used.

2. On an SSN4EGS4 board, the fifth VC-4 timeslot or the 13th VC-4 timeslot is occupied by services.

For an SSN2EGT2/SSN5EFS0/SSN3EFS4/SSN3EGS2/SSN1EFS0A/SSN1EMS2/SSJ6EGT6A board,

the first VC-4 timeslot is occupied by services.

3. The last reset performed on the board is a warm reset.

4. After the board is warm reset, VC-3 timeslots (the issue does not involve VC-12 or VC-4 timeslots)

are configured for a VCTRUNK and LCAS is enabled for the board (the order in which LCAS is

enabled and the timeslots are bound does not matter).

For an SSN4EGS4 board:

If the VC-3 timeslot corresponding to any of the first to eighth VC-4 timeslots is newly bound with

services, services occupying the fifth VC-4 timeslot on the board are interrupted. If the VC-3 timeslot

corresponding to any of the 9th to 16th VC-4 timeslots is newly bound with services, services

occupying the 13th VC-4 timeslot on the board are interrupted.

For an SSN2EGT2/SSN5EFS0/SSN3EFS4/SSN3EGS2/SSN1EFS0A/SSN1EMS2/SSJ6EGT6A board:

If the VC-3 timeslot corresponding to any of the first to eighth VC-4 timeslots is newly bound with services,

services occupying the first VC-4 timeslot on the board are interrupted.
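The mapping described above, from a newly bound timeslot to the existing timeslot at risk, can be sketched as a lookup. The function name is hypothetical, and VC-4 timeslots are assumed to be numbered from 1.

```python
# Boards for which the first VC-4 timeslot is the one at risk.
FIRST_VC4_BOARDS = {
    "SSN2EGT2", "SSN5EFS0", "SSN3EFS4", "SSN3EGS2",
    "SSN1EFS0A", "SSN1EMS2", "SSJ6EGT6A",
}


def interrupted_vc4(board_type, new_vc4_index):
    """Return the VC-4 timeslot whose existing services are at risk, or None.

    new_vc4_index is the VC-4 timeslot that contains the newly bound VC-3 timeslot.
    """
    if board_type == "SSN4EGS4":
        if 1 <= new_vc4_index <= 8:
            return 5
        if 9 <= new_vc4_index <= 16:
            return 13
        return None
    if board_type in FIRST_VC4_BOARDS:
        return 1 if 1 <= new_vc4_index <= 8 else None
    raise ValueError(f"board type not covered by this notice: {board_type!r}")
```

The result matters only if the returned timeslot is already occupied by services and the other trigger conditions (board version, warm reset, LCAS enabled) are also met.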

Symptom:

After certain VC-3 timeslots on one of the preceding boards (which has been warm reset lately) are newly

bound with services, some existing services on the board are interrupted. The VCTRUNK corresponding

to the interrupted services reports an ALM_GFP_DLFD alarm.

Identification method:

1. An SSN4EGS4/SSN5EFS0/SSN3EFS4/SSN3EGS2/SSN1EFS0A/SSN1EMS2 board of version 2.44 or

earlier, an SSN2EGT2 board of version 2.19 or earlier, or an SSJ6EGT6A board of all versions is used.

2. The board has been warm reset, including warm reset after the board software is upgraded.

3. Timeslot binding is configured for the board and the configuration interrupts existing services.

The VCTRUNK corresponding to the interrupted services reports an ALM_GFP_DLFD alarm.

[Root Cause]

For an SSN4EGS4 board of version 2.44 or earlier, the software has a defect. After a warm reset,

the MSTs not used by any VCTRUNK extract VC-4 timeslots that are mistakenly numbered as the

fifth or 13th VC-4 timeslot after the initialization. As a result, if another timeslot is bound, configuration

of the original fifth or 13th VC-4 timeslot is changed and the corresponding services are interrupted.

For an SSN5EFS0/SSN3EFS4/SSN3EGS2/SSN1EFS0A/SSN1EMS2 board of version 2.44 or earlier,

an SSN2EGT2 board of version 2.19 or earlier, or an SSJ6EGT6A board of all static versions, the

software has the same defect. If a new timeslot is bound, configuration of the original first VC-4

timeslot is changed and the corresponding services are interrupted.

[Impact and Risk]

After the board is warm reset, configuring new services may interrupt existing services. Deleting

the new services cannot recover the existing services.

[Measures and Solutions]

Recovery measures:

The existing services can be recovered by using the following methods.

Delete and re-configure the timeslot binding for the interrupted services. The LCAS function

does not need to be disabled.

Perform a cold reset on the board.

Workarounds:

After a board configured with timeslot binding is warm reset, do not enable the LCAS function for

new VCTRUNKs if new timeslots need to be bound with services.

Solution:

Upgrade an NG SDH NE to V100R010C03SPC203 or a later version to eliminate the software defect

on an SSN4EGS4/SSN5EFS0/SSN3EFS4/SSN3EGS2/SSN1EFS0A/SSN1EMS2 board.

Upgrade an NG SDH NE to V100R010C03SPC215 or a later version, or upgrade the NG SDH NE to

V1R10C03SPC208 with the SPH217 or a later hot patch, to eliminate the software defect on an SSN2EGT2

board. V1R10C03SPH217 will be released in Aug 2014.

 

For OptiX OSN 9500, the defect has not been rectified on any static version and will be rectified

on versions later than V100R006C05SPC208. For an online issue, install V100R006C03SPH206

or a later hot patch or install V100R006C05SPH209 or a later hot patch to solve the issue.

[Inspector Applicable or Not]

Applicable.

Inspection case title: Pre-Warning Risk on Inspecting SD598 Data Boards

Bit Errors Occur in the Lower Order Path Due to Improper Grounding

To clear bit errors that occur due to loose connection of the PGND cable, reconnect the PGND cable.

Fault Type

Bit errors

Symptom

In the transmission network, five sites form an MSP ring. The five sites are named Site 1, Site 2,

Site 3, Site 4, and Site 5 in the counter-clockwise direction. Site 1 is the gateway NE. Only 2 Mbit/s

services are activated between Site 1 and other sites. The clock source of Site 1 is the external clock

source, and the other sites trace the west line clock source. Query the alarms and performance

events. It is found that a large number of bit errors exist in the lower order path at Site 1, Site 2,

and Site 3. At the same time, bit errors also exist in the lower order path at Site 4 and Site 5.

Cause Analysis

  • If bit errors exist in the lower order path at each site, the possible causes are as follows:

Site 1 has services with all the other sites. Hence, the fault may exist at Site 1. It is

unlikely that all the PQ1 boards at Site 1 are faulty. Therefore, it is suspected that

the line board SL16 is faulty. The fan may also be faulty. In this case, bit errors occur in the

higher order path, which results in bit errors in the lower order path.

  • The 2M cables and fibers may be faulty or the cable connection may be faulty at Site 1.

 

Procedure

1. Query the history performance data. Reset the performance, then query the current

performance. It is found that bit errors persist.

2. Check the line boards at each site. It is found that there is no bit error in the higher order path.

3. Check the fan. It is found that there are no anomalies but bit errors persist.

4. Check the equipment environment. It is found that the PGND cable is improperly connected.

Fasten the PGND cable, then query the performance. It is found that bit errors are cleared.

Reference Information

None.

What Caused Occasional NE Unreachability

Keywords: WDM products, OptiX OSN 6800, OptiX OSN 8800

Summary:

For an NE of a version earlier than OptiX OSN 6800 V100R004C04SPC800 or OptiX OSN 8800

V100R002C02SPC800, there is a low probability that the NE is unreachable by the NMS due to the

QFull count error in the DCN communication.

[Problem Description]

Trigger condition:

The problem may occur when either of the following conditions is present:

• The NE version is earlier than OptiX OSN 6800 V100R004C04SPC800 and the TN11SCC or TN51SCC board is installed on the NE, or the NE version is between V100R004C00 and V100R004C04SPC800 (excluding V100R004C04SPC800) and the TN52SCC board is installed on the NE.

• The NE version is earlier than OptiX OSN 8800 V100R002C02SPC800 and the TN11SCC or TN51SCC board is installed on the NE, or the NE version is between V100R002C00 and V100R002C02SPC800 (excluding V100R002C02SPC800) and the TN52SCC or TNK2SCC board is installed on the NE.

Symptom:

The NE is unreachable by the NMS. If the NE is a gateway NE (GNE), all its subtending NEs will be unreachable

by the NMS.

Identification method:

1. Check whether the version of the unreachable NE is one of the versions listed in the trigger conditions.

2. Use the UpgradeKit tool to log in to the GNE of the NE in the same way as how you log in to a GNE using the

Navigator. Click Login on the tool main menu, and then select Gateway IP and click OK in the Login dialog

box that is displayed.

 


 

3. On the tool main menu, click Comm. In the dialog box that is displayed, select the unreachable NE and its

GNE, and then click Check. In the example shown in the following figure, NE 9-226 is unreachable. Therefore,

both NE 9-226 and its GNE 9-228 are selected. Click OK in the Select Check Items dialog box that is displayed.

 

 


4. View the Check QFull on the NE item in the output results. If the check result is not “OK”, the NE has a

QFull count error.

 


 

Note: The methods of using the UpgradeKit and Navigator tools are the same. Both tools can be used to remotely

log in to a subtending NE through its GNE. When the UpgradeKit tool is directly connected to a GNE using a

network cable or through Ethernet, the tool does not communicate with the GNE over a DCN channel. Therefore,

communication between the tool and NEs is not affected regardless of whether the NEs are unreachable.

The UpgradeKit tool can log in to the GNE when it is directly connected to the GNE. Generally, a remote login can

be implemented using the above-mentioned method if the faulty NE is not severely affected. If login to the GNE

fails, directly connect the UpgradeKit tool to the unreachable NE on site and then log in to the NE.

 

[Root Cause]

The product software has a bug in collecting statistics on the sending queues of DCN channels. When the sending

queue count reaches the maximum value, the queue buffering capability fails and subsequent packets are directly

discarded. Consequently, the QFull count of the DCN channel on an involved NE becomes abnormal, causing the

NE to become unreachable by the NMS.

 

[Impact and Risk]

When the fault occurs, the NE becomes unreachable and the DCN communication cannot be automatically restored.

The NMS cannot manage the NE through DCN or monitor service alarms and performance in real time.

 

[Measures and Solutions]

Recovery measures:

Perform a warm reset on the system control board.

Workarounds:

Perform a warm reset on the system control board.

Preventive measures:

You are advised to upgrade the NE to the target version:

• For OptiX OSN 6800: OptiX OSN 6800 V100R004C04SPC800 (5.51.05.35) or later

• For OptiX OSN 8800: OptiX OSN 8800 V100R002C02SPC800 (5.51.05.35) or later