A fabric link on module 0 has degraded, some packets may be corrupted – bigsurusd

Issue: –   A fabric link on module 0 has degraded, some packets may be corrupted  – bigsurusd

Device:-  Nexus 5k

Details: –

On Nexus 6000/5600 series switches, link-degraded messages such as the following can be seen between the port ASIC and the fabric ASIC.


USER-2-SYSTEM_MSG: A fabric link on module 3 has degraded, some packets may be corrupted – bigsurusd
USER-2-SYSTEM_MSG: A fabric link on module 3 has degraded, some packets may be corrupted – bigsurusd
USER-2-SYSTEM_MSG: A fabric link on module 3 has degraded, some packets may be corrupted – bigsurusd

The same error on another switch:

2019 Apr 15 00:59:18.844 HKT: last message repeated 11 times

 2019 Apr 15 00:59:18.844 HKT: %USER-2-SYSTEM_MSG: A fabric link on module 0 has degraded, some packets may be corrupted  – bigsurusd

 2019 Apr 15 00:59:18.854 HKT: last message repeated 1 time

 2019 Apr 15 00:59:18.854 HKT: %USER-2-SYSTEM_MSG: A fabric link on module 2 has degraded, some packets may be corrupted  – bigsurusd

 2019 Apr 15 00:59:20.864 HKT: last message repeated 1 time

 2019 Apr 15 00:59:20.864 HKT: %USER-2-SYSTEM_MSG: A fabric link on crossbar asic 0 has degraded  – pacifica

 2019 Apr 15 00:59:23.874 HKT: last message repeated 11 times

Solution:

A power cycle of the LEM will recover the link if the issue is due to software. Upgrade the switch as per bug CSCuo69417.

The fabric can also degrade because of hardware issues; in that case the device needs to be replaced.
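
When many of these messages are logged, it can help to see which module or crossbar ASIC is reporting most often before deciding between a LEM power cycle and a replacement. The following is a minimal Python sketch, not a Cisco tool: the function name `count_degraded` is hypothetical, and it assumes that each "last message repeated N times" line refers to the degraded message logged just before it.

```python
import re
from collections import Counter

# Matches the syslog text shown above ("module N" or "crossbar asic N").
DEGRADED = re.compile(r"A fabric link on (module|crossbar asic) (\d+) has degraded")
REPEATED = re.compile(r"last message repeated (\d+) times?")

def count_degraded(lines):
    """Count degraded-link events per component, including repeat lines."""
    counts = Counter()
    last = None  # component named by the most recent degraded message
    for line in lines:
        match = DEGRADED.search(line)
        if match:
            last = f"{match.group(1)} {match.group(2)}"
            counts[last] += 1
            continue
        repeat = REPEATED.search(line)
        # Assumption: a repeat line refers to the preceding degraded message.
        if repeat and last is not None:
            counts[last] += int(repeat.group(1))
    return counts

sample = [
    "2019 Apr 15 00:59:18.844 HKT: %USER-2-SYSTEM_MSG: A fabric link on module 0 has degraded, some packets may be corrupted - bigsurusd",
    "2019 Apr 15 00:59:18.854 HKT: last message repeated 1 time",
    "2019 Apr 15 00:59:18.854 HKT: %USER-2-SYSTEM_MSG: A fabric link on module 2 has degraded, some packets may be corrupted - bigsurusd",
    "2019 Apr 15 00:59:20.864 HKT: last message repeated 1 time",
    "2019 Apr 15 00:59:20.864 HKT: %USER-2-SYSTEM_MSG: A fabric link on crossbar asic 0 has degraded - pacifica",
    "2019 Apr 15 00:59:23.874 HKT: last message repeated 11 times",
]
result = count_degraded(sample)
for component, total in result.most_common():
    print(component, total)
```

A component that keeps reporting after a LEM power cycle and an upgrade is the more likely candidate for a hardware problem.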

%NOHMS-2-NOHMS_DIAG_ERROR

Issue: –  %NOHMS-2-NOHMS_DIAG_ERROR

Details: –

show interface brief

Eth1/37       1       eth  fabric down    Hardware failure            10G(D) 107

Eth1/39       1       eth  f-path down    Hardware failure            10G(D) 2

Eth1/42       1       eth  fabric down    Hardware failure            10G(D) 107

Eth1/44       1       eth  access down    Hardware failure            10G(D) —

Eth1/45       1       eth  access down    Hardware failure           1000(D) —

Eth1/48       1       eth  f-path down    Hardware failure            10G(D) 2

2017 Sep  7 09:12:27.474 SWITCHNAME %NOHMS-2-NOHMS_DIAG_ERROR: Module 1: Runtime diag detected major event: Fabric port failure

Ethernet1/44

2017 Sep  7 09:12:27.630 SWITCHNAME %NOHMS-2-NOHMS_DIAG_ERROR: Module 1: Runtime diag detected major event: Fabric port failure

Ethernet1/42

2017 Sep  7 09:12:27.642 SWITCHNAME %NOHMS-2-NOHMS_DIAG_ERROR: Module 1: Runtime diag detected major event: Fabric port failure

 show diagnostic result module 1

<snip>

    16) TestFabricPort :

   Eth    1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20

   Port ————————————————————

          .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .

   Eth   21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40

   Port ————————————————————

          .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  F  .  F  .

   Eth   41 42 43 44 45 46 47 48

   Port ————————

          .  F  .  F  F  .  .  F

Solution: – The above output indicates a hardware failure on these ports for the ASIC. You may be able to recover a port from this state by doing a “shut / no shut” on the interface, or you can try swapping the SFP. Rebooting the switch may also bring the ports back up.

If the port remains in the “hardware failure” state, it may be necessary to replace the chassis/module.
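
The TestFabricPort grid above can be read programmatically to list the failed ports and compare them with "show interface brief". This is a rough Python sketch that assumes the layout shown above (an "Eth" row of port numbers, a "Port" separator row, then a row of "." and "F" marks); the function name `failed_ports` is hypothetical.

```python
def failed_ports(diag_text):
    """Return the port numbers marked 'F' in a TestFabricPort result grid."""
    failed = []
    ports = None  # port numbers from the most recent "Eth" row
    for raw in diag_text.splitlines():
        tokens = raw.split()
        if not tokens:
            continue
        if tokens[0] == "Eth":
            ports = [int(t) for t in tokens[1:]]
        elif tokens[0] == "Port":
            continue  # separator row
        elif ports is not None and len(tokens) == len(ports):
            # Status row: one mark per port, in the same order as the Eth row.
            failed += [p for p, mark in zip(ports, tokens) if mark == "F"]
            ports = None
    return failed

sample = """
    16) TestFabricPort :
   Eth    1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
   Port ------------------------------------------------------------
          .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .
   Eth   21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40
   Port ------------------------------------------------------------
          .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  F  .  F  .
   Eth   41 42 43 44 45 46 47 48
   Port ------------------------
          .  F  .  F  F  .  .  F
"""
ports_down = failed_ports(sample)
print(ports_down)
```

For the sample above, the result lines up with the six interfaces reported as "Hardware failure" in the "show interface brief" output.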

%ATOM_TRANS-6-ATOM_NO_ROUTER_ID error

Issue: –  %ATOM_TRANS-6-ATOM_NO_ROUTER_ID error

Details: –

013964: Jun  6 11:33:23.245 AWST: %ATOM_TRANS-6-ATOM_NO_ROUTER_ID: No router ID is available for AToM to use and this will impact pseudowire VCCV. Please enable “l2 router-id <address>” or enable an LDP router ID if you wish VCCV to be operational.

After some features were disabled, the message below was generated even though AToM was not configured:

“%ATOM_TRANS-6-ATOM_NO_ROUTER_ID:

No router ID is available for AToM to use and this will impact pseudowire VCCV.

Please enable “l2 router-id” or enable an LDP router ID if you wish VCCV to be operational.”

Conditions:

AToM is not used.

The following features are turned off:

PAD, WRR, AAA, DHCP, SCTP, IPV6, IGMP, PIM, RSVP, MGCP, MPLS, HTTP(HTTPS), FTP, SSH, IPSEC

Workaround:

As per the bug, the message is not generated when “mpls ip” is enabled.

Solution: – addressed in Bug CSCuv45551

NETSTACK-5-INVALID_NOTIFY

Issue: –  NETSTACK-5-INVALID_NOTIFY

Details: –

The “NETSTACK-5-INVALID_NOTIFY:” error messages are cosmetic and will not impact services.

To stop these messages from being logged, lower the logging level for netstack to 3.

# logging level netstack 3

NX_SW8426 %NETSTACK-3-IP_INTERNAL_ERROR: netstack [3426] Failed to get IP VRF name 0

%NETSTACK-3-IP_INTERNAL_ERROR: netstack [3426] -Traceback: 0x80d6475 0x83d260f 0x84208e7 0x84246dc 0x840f848 0x83ab1bc 0x816a7e7 0x816b811 0x81754c8 0x8161a1a 0x80ab8cd 0x8081393 0x80b91ec librsw.so+0xab02f libpthread.so.0+0x6140 libc.so.6+0xca8c

NEXUS 5548

2019 Oct 3 18:07:02.586 SWITCHNAME %NETSTACK-5-INVALID_NOTIFY: netstack [3593] ip_is_addr_local_vip:Failed to get IP API VRF 0

2019 Oct 3 18:07:02.587 SWITCHNAME %NETSTACK-3-IP_INTERNAL_ERROR: netstack [3593] Failed to get IP VRF by name (null)

2019 Oct 3 18:07:02.589 SWITCHNAME %NETSTACK-3-IP_INTERNAL_ERROR: netstack [3593] -Traceback: 0x80d9e2f 0x80d9ee2 0x817ce62 0x817d19a 0x817e1d6 0x816c630 0x8176628 0x8162ad9 0x80ab98d 0x8081423 0x80b97dc librsw.so+0xab02f libpthread.so.0+0x6140 libc.so.6+0xca8ce

Solution: – addressed in Bug CSCur20112. Upgrade the device to avoid this error

“no power source” on Nexus 2k

Details:

Fan Fex : 101:
——————————————————
Fan Model Hw Status
——————————————————
Chassis N2K-C2248-FAN — ok
PS-1 N2200-PDC-400W — no power source <<<<<<<<<
PS-2 N2200-PDC-400W — no power source <<<<<<<<<

fex-101# show platform software satctrl trace | no-more


06/12/2014 20:33:30.663031: Err:Unable to read PS module presence 0
06/12/2014 20:33:30.663077: Err:Unable to read PS module presence 1
06/12/2014 20:33:31.869076: Err:Unable to read fantray 1 presence bit
06/12/2014 20:33:31.869146: Err:Unable to read PS module presence 0
06/12/2014 20:33:31.869191: Err:Unable to read PS module presence 1
06/12/2014 20:33:32.882099: Err:Unable to read fantray 1 presence bit

The exact trigger is not yet clear, but the issue has been seen with DC PSUs on an N2K connected to an N7K running 6.2(2a).

This is a cosmetic issue in the CLI output; there is no impact on the FEX.

Solution:-

A reload can temporarily fix the issue. For a permanent fix, upgrade to a release that contains the fix for bug CSCup42901.

SSH-3-PACK_INTEG_ERROR error on 4500 switch

The logs show the error below:

 “%SSH-3-PACK_INTEG_ERROR: Packet integrity error (4 bytes remaining)”

Details:

This is a warning message indicating that an SSHv2 packet was received with leftover data past the End of Message. The cause could be a problem with the packet itself, or leftover data on the channel (an abnormal close of the channel, etc.).

To confirm whether packets are being corrupted in transit, do the following:
– configure a SPAN port on the switch and capture the packets arriving at the switch itself
– capture packets on the client at the same time, so that the two captures can be compared to check that the packets were not corrupted on the way

If the message only appears periodically, it can normally be treated as informational and ignored. The only functional impact is that the affected SSH session is closed.

IPSEC Tunnel using IKEv2 and CryptoMap

DC-01

Step 1: – Create IKEv2 Proposal (Same as isakmp policy in IKEv1)

crypto ikev2 proposal IKE-Proposal

 encryption aes-cbc-128 3des

 integrity sha1

 group 5

Step 2:- Create IKEv2 Policy ( Here we are calling the previously created proposal)

crypto ikev2 policy IKE-Policy

 match address local 11.10.12.1

 proposal IKE-Proposal

Step 3: – Create Keyring (Configure keyring if the local or remote authentication method is a pre-shared key)

crypto ikev2 keyring IKE-Keyring

 peer Branch

  address 11.10.23.1

  pre-shared-key local Cisco123

  pre-shared-key remote Cisco123

Step 4: Create IKEv2 Profile.

An IKEv2 profile is a repository of nonnegotiable parameters of the IKE SA and the services available to the authenticated peers that match the profile.

crypto ikev2 profile IKEv2-Profile

 match identity remote address 11.10.23.1 255.255.255.255

 authentication remote pre-share

 authentication local pre-share

 keyring local IKE-Keyring

Step 5: Create the ACL (interesting traffic), transform set, and crypto map, and add a route to the peer-end subnet via the ISP link

ip access-list extended IKE-ACL

 permit ip 192.168.1.0 0.0.0.255 192.168.2.0 0.0.0.255

crypto ipsec transform-set transform1 esp-3des esp-sha-hmac

 mode tunnel

crypto map cmap 1 ipsec-isakmp

 set peer 11.10.23.1

 set transform-set transform1

 set ikev2-profile IKEv2-Profile

 match address IKE-ACL

Apply crypto-map on the interface

int eth 0/0

crypto map cmap

ip route 192.168.2.0 255.255.255.0 11.10.12.2

Branch: –

Step 1: – Create IKEv2 Proposal (Same as isakmp policy in IKEv1)

crypto ikev2 proposal IKE-Proposal

 encryption aes-cbc-128 3des

 integrity sha1

 group 5

Step 2:- Create IKEv2 Policy ( Here we are calling the previously created proposal)

crypto ikev2 policy IKE-Policy

 match address local 11.10.23.1

 proposal IKE-Proposal

Step 3: – Create Keyring (Configure keyring if the local or remote authentication method is a pre-shared key)

crypto ikev2 keyring IKE-Keyring

 peer DC

  address 11.10.12.1

  pre-shared-key local Cisco123

  pre-shared-key remote Cisco123

Step 4: Create IKEv2 Profile.

An IKEv2 profile is a repository of nonnegotiable parameters of the IKE SA and the services available to the authenticated peers that match the profile.

crypto ikev2 profile IKEv2-Profile

 match identity remote address 11.10.12.1 255.255.255.255

 authentication remote pre-share

 authentication local pre-share

 keyring local IKE-Keyring

Step 5: Create the ACL (interesting traffic), transform set, and crypto map, and add a route to the peer-end subnet via the ISP link

ip access-list extended IKE-ACL

 permit ip 192.168.2.0 0.0.0.255 192.168.1.0 0.0.0.255

crypto ipsec transform-set transform1 esp-3des esp-sha-hmac

 mode tunnel

crypto map cmap 1 ipsec-isakmp

 set peer 11.10.12.1

 set transform-set transform1

 set ikev2-profile IKEv2-Profile

 match address IKE-ACL

int eth 0/0

crypto map cmap

ip route 192.168.1.0 255.255.255.0 11.10.23.2
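
The DC-01 and Branch configurations are mirror images of each other: the peer address, the local policy address, the ACL source and destination, and the static route swap between the two sides. One way to avoid a mismatch is to render both sides from a single set of parameters. The sketch below is only an illustration; the `Site` class and `render` function are hypothetical names, it assumes /24 protected subnets as in the example, and it leaves out the IKEv2 proposal and the transform set because they are identical on both routers.

```python
from dataclasses import dataclass

@dataclass
class Site:
    name: str      # keyring peer label, e.g. "DC" or "Branch"
    wan_ip: str    # address on the interface that carries the crypto map
    lan_net: str   # protected /24 subnet behind this router
    next_hop: str  # ISP next hop used for the static route

def render(local: Site, remote: Site, psk: str) -> str:
    """Render the side-specific part of the config for `local`."""
    lines = [
        "crypto ikev2 policy IKE-Policy",
        f" match address local {local.wan_ip}",
        " proposal IKE-Proposal",
        "crypto ikev2 keyring IKE-Keyring",
        f" peer {remote.name}",
        f"  address {remote.wan_ip}",
        f"  pre-shared-key local {psk}",
        f"  pre-shared-key remote {psk}",
        "crypto ikev2 profile IKEv2-Profile",
        f" match identity remote address {remote.wan_ip} 255.255.255.255",
        " authentication remote pre-share",
        " authentication local pre-share",
        " keyring local IKE-Keyring",
        "ip access-list extended IKE-ACL",
        f" permit ip {local.lan_net} 0.0.0.255 {remote.lan_net} 0.0.0.255",
        "crypto map cmap 1 ipsec-isakmp",
        f" set peer {remote.wan_ip}",
        " set transform-set transform1",
        " set ikev2-profile IKEv2-Profile",
        " match address IKE-ACL",
        f"ip route {remote.lan_net} 255.255.255.0 {local.next_hop}",
    ]
    return "\n".join(lines)

dc = Site("DC", "11.10.12.1", "192.168.1.0", "11.10.12.2")
branch = Site("Branch", "11.10.23.1", "192.168.2.0", "11.10.23.2")
dc_cfg = render(dc, branch, "Cisco123")
branch_cfg = render(branch, dc, "Cisco123")
print(dc_cfg)
```

Because both outputs come from the same two `Site` records, the ACLs are guaranteed to mirror each other, which is a common source of IPsec negotiation failures when typed by hand.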

Verification: –

Ping from PC1 to PC2

Run “show crypto ikev2 sa” on DC-01; the status should be READY.

Run “show crypto ipsec sa” and confirm that the packet encaps and decaps counters are incrementing.

Check the hit count on the ACL (IKE-ACL).

NAT-4-DEFAULT_MAX_ENTRIES: default maximum entries value 131072 exceeded; frame dropped

Issue: – NAT-4-DEFAULT_MAX_ENTRIES: default maximum entries value 131072 exceeded; frame dropped

Logs: –

%IOSXE-4-PLATFORM: SIP1: cpp_cp: QFP:0.0 Thread:000 TS:00041531217928775544 %NAT-4-DEFAULT_MAX_ENTRIES: default maximum entries value 131072 exceeded; frame dropped

%IOSXE-4-PLATFORM: SIP1: cpp_cp: QFP:0.0 Thread:000 TS:00041531226749139224 %NAT-4-DEFAULT_MAX_ENTRIES: default maximum entries value 131072 exceeded; frame dropped

%IOSXE-4-PLATFORM: SIP1: cpp_cp: QFP:0.0 Thread:000 TS:00041531234452426888 %NAT-4-DEFAULT_MAX_ENTRIES: default maximum entries value 131072 exceeded; frame dropped

Solution: –

Increase the NAT entry limit:

Config t

ip nat translation max-entries <number>

3650/3850 Output drops on interfaces

Issue details: – Several ports show a high OutDiscards count; here we are focusing on Gi1/0/48.

Port        Align-Err     FCS-Err    Xmit-Err     Rcv-Err  UnderSize  OutDiscards

Gi1/0/1             0           0           0           0          0            0

Gi1/0/2             0           0           0           0          0            0

Gi1/0/3             0           0           0           0          0            0

Gi1/0/4             0           0           0           0          0            0

Gi1/0/5             0           0           0           0          0    711746178

Gi1/0/6             0           0           0           0          0    358279966

Gi1/0/7             0           0           0           0          0   2144859618

Gi1/0/8             0           0           0           0          0   1379875758

.

.

Gi1/0/48            0           1           0           2          0      4750857
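
On a 48-port switch it is easier to spot the worst ports by sorting the table on the OutDiscards column. The snippet below is a small Python sketch under the assumption that the output has the seven columns shown above; `top_out_discards` is a hypothetical helper name.

```python
def top_out_discards(table_text, limit=3):
    """Return the (port, OutDiscards) pairs with the highest counts."""
    rows = []
    for raw in table_text.splitlines():
        tokens = raw.split()
        # Data rows: a port name followed by six integer counters.
        if len(tokens) == 7 and all(t.isdigit() for t in tokens[1:]):
            rows.append((tokens[0], int(tokens[6])))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:limit]

sample = """
Port        Align-Err     FCS-Err    Xmit-Err     Rcv-Err  UnderSize  OutDiscards
Gi1/0/4             0           0           0           0          0            0
Gi1/0/5             0           0           0           0          0    711746178
Gi1/0/6             0           0           0           0          0    358279966
Gi1/0/7             0           0           0           0          0   2144859618
Gi1/0/8             0           0           0           0          0   1379875758
Gi1/0/48            0           1           0           2          0      4750857
"""
worst = top_out_discards(sample)
for port, drops in worst:
    print(port, drops)
```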

Switch # show int g1/0/48

GigabitEthernet1/0/48 is up, line protocol is up (connected: TDR running)

  Hardware is Gigabit Ethernet, address is 7c0e.ce7e.efb0 (bia 7c0e.ce7e.efb0)

  Description:

  MTU 1500 bytes, BW 100000 Kbit/sec, DLY 100 usec,

     reliability 255/255, txload 5/255, rxload 25/255

  Encapsulation ARPA, loopback not set

  Keepalive set (10 sec)

  Full-duplex, 100Mb/s, media type is 10/100/1000BaseTX

  input flow-control is on, output flow-control is unsupported

  ARP type: ARPA, ARP Timeout 04:00:00

  Last input never, output 00:00:00, output hang never

  Last clearing of “show interface” counters 1d04h

  Input queue: 0/2000/0/0 (size/max/drops/flushes); Total output drops: 4750857

Logs to Capture: –

16.x IOS XE version

show platform hardware fed switch 1 qos queue config interface gi 1/0/48

show platform hardware fed switch 1 qos queue stats interface gi 1/0/48

show platform hardware fed switch 1 qos dscp-cos counters interface gi 1/0/48

show int counters errors module 1

show int g1/0/48

3.x version:

show platform qos queue stats GigabitEthernet 1/0/48

sh platform qos  queue config GigabitEthernet 1/0/48

show int g1/0/48

show int counters errors module 1

show platform hardware fed switch 1 qos queue config interface gi 1/0/48

DATA Port:18 GPN:48 AFD:Disabled QoSMap:0 HW Queues: 144 – 151

  DrainFast:Disabled PortSoftStart:2 – 1080

———————————————————-

   DTS  Hardmax  Softmax   PortSMin  GlblSMin  PortStEnd

  —– ——–  ——–  ——–  ——–  ———

 0   1  5   120   2   480   6   320   0     0   4  1440

 1   1  4     0   6   720   3   480   2   180   4  1440

 2   1  4     0   5     0   5     0   0     0   4  1440

 3   1  4     0   5     0   5     0   0     0   4  1440

 4   1  4     0   5     0   5     0   0     0   4  1440

 5   1  4     0   5     0   5     0   0     0   4  1440

 6   1  4     0   5     0   5     0   0     0   4  1440

 7   1  4     0   5     0   5     0   0     0   4  1440

 Priority   Shaped/shared   weight  shaping_step

 ——–   ————-   ——  ————

 0      0     Shared            50           0

 1      0     Shared            75           0

 2      0     Shared         10000           0

 3      0     Shared         10000           0

 4      0     Shared         10000           0

 5      0     Shared         10000           0

 6      0     Shared         10000           0

 7      0     Shared         10000           0

   Weight0 Max_Th0 Min_Th0 Weigth1 Max_Th1 Min_Th1  Weight2 Max_Th2 Min_Th2

   ——- ——- ——- ——- ——- ——-  ——- ——- ——

 0       0     478       0       0     534       0       0     600       0

 1       0     573       0       0     641       0       0     720       0

 2       0       0       0       0       0       0       0       0       0

 3       0       0       0       0       0       0       0       0       0

 4       0       0       0       0       0       0       0       0       0

 5       0       0       0       0       0       0       0       0       0

 6       0       0       0       0       0       0       0       0       0

 7       0       0       0       0       0       0       0       0       0

SWITCH# show platform hardware fed switch 1 qos queue stats interface gi1/0/48

DATA Port:18 Enqueue Counters

——————————-                         

Queue Buffers Enqueue-TH0 Enqueue-TH1 Enqueue-TH2       

—– ——- ———– ———– ———–

    0       0           0  4977682796  3283396622

    1       0           0           0  8721427714

    2       0           0           0           0

    3       0           0           0           0

    4       0           0           0           0

    5       0           0           0           0

    6       0           0           0           0

    7       0           0           0           0

DATA Port:18 Drop Counters

——————————-                    

Queue Drop-TH0    Drop-TH1    Drop-TH2    SBufDrop    QebDrop

—– ———– ———– ———– ———– ———–

    0           0           0           0           0           0

    1           0           0    75741033           0           0

    2           0           0           0           0           0

    3           0           0           0           0           0

    4           0           0           0           0           0

    5           0           0           0           0           0

    6           0           0           0           0           0

    7           0           0           0           0           0
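
The Drop Counters table shows which queue and threshold the drops are landing on. As a rough illustration, the Python sketch below picks out the non-zero cells; it assumes the table layout shown above, and `nonzero_drops` is a hypothetical helper name.

```python
def nonzero_drops(stats_text):
    """Return {(queue, column): count} for non-zero cells in the Drop Counters table."""
    columns = None
    in_drops = False
    found = {}
    for raw in stats_text.splitlines():
        tokens = raw.split()
        if not tokens:
            continue
        if "Drop Counters" in raw:
            in_drops = True  # ignore the Enqueue Counters table before this point
        elif in_drops and tokens[0] == "Queue":
            columns = tokens[1:]  # Drop-TH0, Drop-TH1, ...
        elif (in_drops and columns and len(tokens) == len(columns) + 1
              and all(t.isdigit() for t in tokens)):
            queue = int(tokens[0])
            for name, value in zip(columns, tokens[1:]):
                if int(value):
                    found[(queue, name)] = int(value)
    return found

sample = """
DATA Port:18 Drop Counters
-------------------------------
Queue Drop-TH0    Drop-TH1    Drop-TH2    SBufDrop    QebDrop
----- ----------- ----------- ----------- ----------- -----------
    0           0           0           0           0           0
    1           0           0    75741033           0           0
    2           0           0           0           0           0
"""
drops = nonzero_drops(sample)
print(drops)
```

In the output above, all of the drops are on queue 1 at threshold 2.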

SOLUTION:

The drops are caused by the default QoS settings. Increase the soft buffer available to the interface using the command below.

In 16.x version

Config t

qos queue-softmax-multiplier 1200

3.x Version

Refer to the Cisco document below.

https://www.cisco.com/c/en/us/support/docs/switches/catalyst-3850-series-switches/200594-Catalyst-3850-Troubleshooting-Output-dr.html