I will say this simply and clearly so that it cannot be misunderstood:
STANDARD 'SINGLE-PING' SHOULD NOT BE USED TO PROVE THE FOLLOWING:
- ROUTING PROBLEMS
- LATENCY
- PACKET LOSS
Reasons why you cannot trust ICMP Ping:
- ROUTING: PING IS END TO END. It therefore cannot be used to troubleshoot ROUTING.
PING reveals nothing regarding the intermediate devices. ICMP is part of IP, which means it is unaware of the switches, bridges or hubs in the path to the destination, all of which add their own delay (read 'latency'). Some PING implementations contain a record route function, but record route (ping -r) stores only 9 hops, which is all that fits in the IP header's option space. Traceroute can go much farther and is a much better tool for troubleshooting routing.
- ALL: Platform differences.
PCs running MS-Windows, Unix machines, and routers all handle ICMP and PING packets differently. This difference between platforms introduces delays that do not occur with ordinary TCP or UDP data. TCP and UDP are treated much more uniformly across these platforms. PING is an ancient tool. It was written for small (fewer than 4 hops) heterogeneous LAN environments. PING expects all hosts to handle ICMP identically, and they do not. Because PING shows round-trip results, you have no way to know which device or wire in the path is at fault for your problem.
- ALL: PING does not, BY ITSELF, identify the host causing the problem.
There are cases where a failed PING is a normal response. If you PING www.yahoo.com and you think you see a problem at Yahoo, you have no way to know what the cause is without foreknowledge of how the website and the entire Internet path between you and it are configured. Always run additional tests (traceroute, pathping, pathchar etc.). A device in the middle of the path between you and Yahoo might be failing or overutilized, making it appear that Yahoo is dropping packets when it is not. It's also possible the network is working perfectly and Yahoo really IS dropping all the packets. As I have repeatedly said, you have no way to know.
DON'T BOTHER YOUR ISP OR ADMINISTRATOR unless you have the results of more tests.
- LATENCY/LOSS: Queuing and QoS.
Routers can implement queuing strategies, forcing them to handle ICMP differently from TCP and UDP. This queuing causes them to behave in a way that is contrary to the specifications for ICMP, thereby invalidating any results PING (which is an ICMP service) might generate. Devices providing Quality of Service functions (switches, routers or servers) may also handle ICMP in a way that differs from the Internet Standards and specifications in order to optimize availability for TCP and UDP traffic. A QoS device might be programmed to drop 80% of all ICMP regardless of how much TCP or UDP traffic there is currently.
- LATENCY/LOSS: RATE LIMITS.
A host may have an artificial rate limit or access list imposed to reduce the effect of a possible future denial-of-service attack. This will artificially drop only the ICMP packets and leave the TCP and UDP flows untouched, i.e. 100% of the TCP and UDP packets will still get through, even when there is 100% loss seen with ICMP.
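To make the rate-limit point concrete, here is a minimal simulation sketch. It is not any vendor's implementation: the TokenBucket class, the simulate function, the traffic pattern and the limit of 10 ICMP packets per second are all assumptions made up for illustration. It only shows how a limiter applied to ICMP alone produces heavy PING loss while every TCP and UDP packet is delivered.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Hypothetical token bucket: `rate` tokens per second, capped at `burst`."""
    rate: float
    burst: float
    tokens: float = 0.0
    last: float = 0.0

    def allow(self, now: float) -> bool:
        # Refill for the time elapsed since the last packet, capped at burst.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def simulate(packets, icmp_limiter):
    """Count delivered packets per protocol; only ICMP goes through the limiter."""
    delivered = {"icmp": 0, "tcp": 0, "udp": 0}
    for now, proto in packets:
        if proto == "icmp" and not icmp_limiter.allow(now):
            continue  # dropped by the rate limit, TCP/UDP never get here
        delivered[proto] += 1
    return delivered


# Assumed traffic: 100 packets of each protocol spread over one second.
packets = [(i / 100.0, proto) for i in range(100) for proto in ("icmp", "tcp", "udp")]

# Assumed policy: at most about 10 ICMP packets per second, no burst allowance.
limiter = TokenBucket(rate=10.0, burst=1.0, tokens=1.0)

result = simulate(packets, limiter)
icmp_loss_pct = 100.0 * (100 - result["icmp"]) / 100
print(result)
print(f"ICMP loss: {icmp_loss_pct:.0f}%, TCP loss: 0%, UDP loss: 0%")
```

A PING run through this device would report roughly 90% loss on a path that is delivering all of its real traffic.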
- LATENCY/LOSS: BASELINE DEPENDENCIES
PING round-trip times have no meaning unless there is prior performance data to compare them to. Most network administrators fail to do a 24-hour baseline performance evaluation of their own network before they buy bandwidth and after it is installed. For example: the customer's LAN administrator buys a T3, but fails to check whether the core router that receives the T3 can handle another 45 Mbps. Ethernet (100 Mbps) in a shared environment (hubs) typically maxes out at 66 Mbps (up to 88 in switched environments). If the Ethernet is already carrying 50 Mbps, the link is actually at about 76% utilization (50 of a usable 66 Mbps). The Ethernet simply cannot absorb another 45 Mbps, so the Internet appears to choke, but it's really the LAN that's the cause of the problem. If the administrator had performed baselining and flooded the Ethernet links to maximum (or read any documentation on Ethernet and its limits) he would have known that he had only 16 Mbps to work with on the LAN, and would have upgraded the LAN before complaining to his ISP. Note that it's also possible to swamp an older or underpowered router with too much traffic.
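The arithmetic in the T3 example can be checked in a few lines. This is only a worked sketch using the figures quoted above (66 Mbps practical ceiling, 50 Mbps existing load, 45 Mbps T3); the function names are made up for illustration.

```python
def utilization_pct(practical_max_mbps: float, current_load_mbps: float) -> float:
    """Existing load as a percentage of what the LAN can really carry."""
    return 100.0 * current_load_mbps / practical_max_mbps


def headroom_mbps(practical_max_mbps: float, current_load_mbps: float) -> float:
    """Capacity left over for new traffic."""
    return practical_max_mbps - current_load_mbps


practical_max = 66.0  # practical ceiling of hub-based 100 Mbps Ethernet (from the text)
current_load = 50.0   # traffic already on the LAN (from the text)
new_link = 45.0       # the T3 being added (from the text)

util = utilization_pct(practical_max, current_load)
room = headroom_mbps(practical_max, current_load)
shortfall = new_link - room

print(f"utilization: {util:.1f}%")      # utilization: 75.8%
print(f"headroom:    {room:.0f} Mbps")  # headroom:    16 Mbps
print(f"shortfall:   {shortfall:.0f} Mbps")  # shortfall:   29 Mbps
```

The LAN is 29 Mbps short of being able to carry a full T3, and no amount of PINGing the ISP will reveal that.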
- LATENCY/LOSS: Local Network Issues
Momentary 'glitches' in performance are normal occurrences on every network. This is yet another reason for performing extensive baselining and performing extensive testing before reporting a problem. In networks running OSPF, the entire network experiences latency every time the update timer ticks down to zero and the network is flooded with a large number of OSPF updates. Good baselining and network planning will help to avoid this, but keep in mind that PING can do nothing to identify the source of the OSPF problem because the traffic is coming from all routers on the network and the pause is also caused by the routers updating their routing tables, not solely the high traffic loads. Any PING run in that situation will get totally random, unpredictable and therefore useless results. Again, PING really shouldn't be used for latency.
- LATENCY/LOSS: Bottlenecks (the politics of bad network design)
It is very common for links between Internet providers to be overutilized and cause a bottleneck. This is caused by the political problem of ISPs wrangling over who is the bigger ISP and who pays whom for the peering connection. When each provider's engineers use PING, the link appears congested on the other guy's side, so they both point the finger at the other company, leaving the poor customer in the middle. There may be no physical failure and nothing to be repaired. All equipment is functioning normally; there simply isn't enough capacity. A bad PING result is useless in this case and won't get anything fixed until the politics get resolved.
SUMMARY: You can never tell whether any of the items above apply to your situation, or what their effects might be. Any results you get from ICMP PING are therefore always suspect and cannot be trusted.
You cannot rely on a single run of 4 PINGs as an absolute indicator of real-time packet loss or latency. FURTHERMORE, if you have not done extensive baselining under NORMAL conditions, you have no basis of comparison, rendering any results you might obtain useless.
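Here is a rough sketch of what "a basis of comparison" means in practice. The round-trip numbers are invented, and the three-standard-deviation threshold is an arbitrary assumption, not a recommendation. The point is only that a sample can be judged against a baseline, and cannot be judged at all without one.

```python
import statistics


def summarize(rtts_ms):
    """Return (mean, sample standard deviation) of round-trip times in ms."""
    return statistics.mean(rtts_ms), statistics.stdev(rtts_ms)


def deviates_from_baseline(sample_ms, baseline_ms, threshold_sigmas=3.0):
    """True if the sample mean lies more than `threshold_sigmas` baseline
    standard deviations away from the baseline mean (assumed rule of thumb)."""
    base_mean, base_sd = summarize(baseline_ms)
    sample_mean = statistics.mean(sample_ms)
    return abs(sample_mean - base_mean) > threshold_sigmas * base_sd


# Invented baseline collected under NORMAL conditions; note the occasional
# spikes that are part of this network's ordinary behavior.
baseline = [20, 22, 21, 35, 19, 23, 40, 21, 20, 22, 38, 21]

# Two invented runs of 4 samples each.
run_a = [30, 28, 33, 29]      # looks "slow" in isolation
run_b = [120, 130, 110, 125]  # far outside anything in the baseline

print(deviates_from_baseline(run_a, baseline))  # False: within normal variation
print(deviates_from_baseline(run_b, baseline))  # True: worth further testing
```

Without the baseline list, neither run tells you anything. Even run_b only says "test further"; it does not say where the problem is.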
Unfortunately, PING is one of the few tools that is available on all platforms. It's the old hammer and nail problem. Since all computers come with PING, most of the less knowledgeable computer folks resort to using PING because it's the only tool they have or, in most cases, the only one they know how to use.
Also, just knowing how to run a PING does not guarantee that you will understand the results it reports back. The feedback from PING is deceptively simple.
You want good performance data? Want reliable information on uptime and availability for services and devices? Get a decent network monitoring package that monitors services, utilizes SNMP and doesn't rely on pings. Make sure it includes a UDP and TCP performance and throughput tool. Use the throughput test to determine your ACTUAL loss or latency.
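As a small taste of measuring with TCP instead of ICMP, the sketch below times a TCP connection setup using only the Python standard library. It is not a throughput tool and no substitute for a monitoring package; the function name tcp_connect_time_ms is made up, and the demo connects to a throwaway listener on 127.0.0.1 so that it runs without touching a real server. If you pass a host name rather than an address, the time will also include the DNS lookup.

```python
import socket
import time


def tcp_connect_time_ms(host: str, port: int, timeout: float = 3.0) -> float:
    """Time how long it takes to open a TCP connection to host:port, in ms."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        elapsed = time.perf_counter() - start
    return elapsed * 1000.0


# Demo against a local listener (assumption: loopback networking is available).
# In real use you would point this at the service you actually care about.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    demo_port = server.getsockname()[1]
    elapsed_ms = tcp_connect_time_ms("127.0.0.1", demo_port)

print(f"TCP connect took {elapsed_ms:.3f} ms")
```

Because this uses the same protocol as the real traffic, it is subject to the same queuing and filtering as the real traffic, which is exactly what PING cannot promise. Run it many times and compare against a baseline, the same as any other measurement.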
--InetDaemon