Thursday, March 24, 2016

High CPU and Latency to VSS 4500X Control Plane

While troubleshooting a high-latency issue on a Cisco 4500X, we traced the problem to a 30 Mb/sec stream of UDP syslog traffic destined for a host that had been shut down, so the switch no longer had a resolved ARP entry for it:

4500x-switch#show ip arp vrf yellow-zone
Protocol  Address          Age (min)  Hardware Addr   Type   Interface
Internet           0   Incomplete      ARPA

CPU utilization on core 0 climbed to 99%, with the iosd process consuming about 50%:

4500x-switch#show processes cpu sorted
Core 0: CPU utilization for five seconds: 99%; one minute: 87%;  five minutes: 49%
Core 1: CPU utilization for five seconds: 1%; one minute: 13%;  five minutes: 51%
PID    Runtime(ms) Invoked  uSecs  5Sec     1Min     5Min     TTY   Process
8609   1518439     20262550 392    50.56    50.40    50.32    0     iosd

Cisco explains what happens when there is no ARP entry:

When CEF cannot locate a valid adjacency for a destination prefix, it punts the packets to the CPU for ARP resolution and, in turn, for completion of the adjacency. In rare cases, the adjacency persists in an incomplete state. For example, if the ARP table already lists a particular host, then punting it to the process level does not trigger an ARP.

You can confirm this by checking the L3 Glean counters in the CPU packet statistics:

4500x-switch#show platform cpu packet statistics | inc Glean
L3 Glean                    2283852361     12913     14103     11978       8839
L3 Glean                     767303902      7732      8469      7606       6976

If the L3 Glean counters are high, packets are being punted to the CPU for processing. As of 2016/03/24 we are checking whether this is a bug. Because traffic toward a host with no ARP entry is enough to trigger it, this could be used as a DoS attack vector.
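
If you collect this output regularly, a small script can flag a rising Glean rate. The following is a minimal Python sketch, not a Cisco tool. It assumes the five numeric columns in each "L3 Glean" row are the total followed by the 5-second, 1-minute, 5-minute and 1-hour averages; verify that against the column header on your own switch. The names parse_glean_rows and glean_is_high, and the 1000 packets/sec threshold, are hypothetical and chosen only for illustration.

```python
import re

# One "L3 Glean" row: the event name followed by five integer columns.
# ASSUMPTION: columns are total, 5 sec avg, 1 min avg, 5 min avg, 1 hour avg.
GLEAN_ROW = re.compile(
    r"^\s*L3 Glean\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$"
)


def parse_glean_rows(output):
    """Return one dict per 'L3 Glean' row found in the CLI output text."""
    rows = []
    for line in output.splitlines():
        match = GLEAN_ROW.match(line)
        if match:
            total, avg_5s, avg_1m, avg_5m, avg_1h = (int(g) for g in match.groups())
            rows.append({
                "total": total,
                "avg_5s": avg_5s,
                "avg_1m": avg_1m,
                "avg_5m": avg_5m,
                "avg_1h": avg_1h,
            })
    return rows


def glean_is_high(rows, threshold_pps=1000):
    """True if any row's 5-second average exceeds the (hypothetical) threshold."""
    return any(row["avg_5s"] > threshold_pps for row in rows)


# Sample text copied from the output shown in this post.
SAMPLE = """\
L3 Glean                    2283852361     12913     14103     11978       8839
L3 Glean                     767303902      7732      8469      7606       6976
"""

rows = parse_glean_rows(SAMPLE)
for row in rows:
    print(row["total"], row["avg_5s"], row["avg_1h"])
print("Glean rate high:", glean_is_high(rows))
```

The threshold is a placeholder; a sensible value depends on what the counter looks like on the same switch when it is healthy.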
