
Choosing to Collect Statistics

Successful network operation involves more than configuring devices; it also requires constant monitoring of the status of the links and nodes that make up the network to detect faults, congestion, and network "hot spots." For ISPs to achieve contracted levels of service, they must be continuously aware of the load within their network and must discover node and link failures as quickly as possible.

SNMP provides notifications through trap messages to alert the management station when key events occur, although network failures may, by their nature, prevent the delivery of the very notification messages that report them. SNMP also gives access to counters that provide basic statistical information about the traffic flows through a specific interface or device, and a management station may read these counters repeatedly to get a view of the change in network usage.
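Reading a counter repeatedly only becomes useful once successive readings are turned into a rate. The following is a minimal sketch of that step, assuming a fixed-width counter that wraps back to zero (as the 32-bit ifInOctets counter in the standard interfaces MIB does) and assuming it wraps at most once between two reads. The function names and parameters are illustrative only and are not part of SNMP.

```python
def counter_delta(previous, current, bits=32):
    """Return how far a wrapping counter advanced between two reads.

    Assumes the counter wrapped at most once between the reads.
    """
    modulus = 1 << bits
    return (current - previous) % modulus


def utilization(prev_octets, curr_octets, interval_s, link_bps, bits=32):
    """Fraction of link capacity used over one polling interval."""
    octets = counter_delta(prev_octets, curr_octets, bits)
    return (octets * 8) / (interval_s * link_bps)


if __name__ == "__main__":
    # Two readings of an octet counter taken 300 seconds apart
    # on a hypothetical 10 Mbps link.
    print(utilization(0, 37_500_000, 300, 10_000_000))
```

On fast links a 32-bit counter can wrap more than once within a long polling interval, in which case the delta is undercounted; shorter intervals or wider counters avoid this.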

Bear in mind that collecting network statistics in real time may itself degrade the operation of the network. This is not quite Heisenberg's Uncertainty Principle, but repeated requests to read data at many nodes can generate a lot of additional traffic and may congest the network around the central location at which the data are accumulated. For this reason, statistics for day-to-day operation should be collected in a very structured way, focusing on the entry and exit points of a network rather than on every link and node within it. This has the benefit of policing internetwork agreements as well as showing which external links are close to their limits.
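The cost of polling can be estimated with simple arithmetic. The sketch below compares polling every node with polling only the edge nodes; all of the figures (node counts, reads per node, bytes per request-response exchange, polling interval) are assumed values chosen for illustration, not measurements.

```python
def polling_load_bps(nodes, reads_per_node, bytes_per_exchange, interval_s):
    """Average traffic arriving at the collection point, in bits per second."""
    total_bytes = nodes * reads_per_node * bytes_per_exchange
    return total_bytes * 8 / interval_s


if __name__ == "__main__":
    # Assumed figures: 50 counter reads per node, 200 bytes per
    # request-response exchange, one polling cycle every 60 seconds.
    every_node = polling_load_bps(1000, 50, 200, 60)
    edge_only = polling_load_bps(20, 50, 200, 60)
    print(f"all nodes: {every_node:.0f} bps, edge only: {edge_only:.0f} bps")
```

Because the load scales linearly with the number of polled nodes, restricting collection to entry and exit points reduces the traffic converging on the collection point in direct proportion.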

At the same time, multiple collection points can be used within the network to share the load of statistics collection. These intermediate collection points serve to coalesce the data sets into a single useful group of statistics before forwarding the information to the central collection point. In particular, since some statistics are used for billing, some for fault detection, some for long-term planning, and some for service maintenance, the intermediate collection points can filter the statistics and send information to the appropriate consumer while still providing just a single point of contact for each device.
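The role of an intermediate collection point described above can be sketched in two steps: coalesce the raw samples, then decide which consumer receives which statistics. This is only an illustration; the statistic names, the consumer names, and the routing table are hypothetical.

```python
from collections import defaultdict

# Hypothetical mapping from statistic name to the consumers interested in it.
ROUTES = {
    "octets": ["billing", "planning"],
    "errors": ["fault"],
    "drops": ["fault", "service"],
}


def coalesce(samples):
    """Sum raw (device, statistic, value) samples into one total per pair."""
    totals = defaultdict(int)
    for device, stat, value in samples:
        totals[(device, stat)] += value
    return dict(totals)


def route(totals, routes=ROUTES):
    """Group the coalesced totals by the consumer that should receive them."""
    outbound = defaultdict(dict)
    for (device, stat), value in totals.items():
        for consumer in routes.get(stat, []):
            outbound[consumer][(device, stat)] = value
    return dict(outbound)


if __name__ == "__main__":
    samples = [("r1", "octets", 100), ("r1", "octets", 50), ("r2", "drops", 7)]
    for consumer, stats in route(coalesce(samples)).items():
        print(consumer, stats)
```

Each device reports to only one collection point, while each consumer receives only the subset of statistics it needs.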

Although SNMP may provide access to the necessary statistical information, it is not the best choice for network monitoring because it is request-response based. The client (or collection point) must issue read requests to the server (the device being monitored) in order to read the information. Further, the MIB modules are structured for wide configuration reporting rather than pure statistics gathering. These two factors mean that SNMP introduces a considerable overhead if it is used for this purpose.

As an alternative, the NetFlow architecture was devised by Cisco and is now being considered for standardization by the IETF. NetFlow is based on a series of record formats specifically designed to contain statistical information and to allow devices to report bulk data to their collection points. An important consideration is that the maintenance and dispatch of the NetFlow records should have the smallest possible impact on the ability of the device to forward data.

The NetFlow records can be collected by the device and sent periodically (based on a timer or a threshold) to the collection point, generally using a transfer protocol such as FTP. An intermediate collection point can operate on the data and then send them onward. Since NetFlow is not an IP protocol, we will not discuss it further.
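The timer-or-threshold dispatch mentioned above is a general batching pattern, and a minimal sketch of it follows. It does not reproduce any NetFlow record format or transfer mechanism; the class name, its parameters, and the `send` callback are hypothetical.

```python
import time


class BatchExporter:
    """Buffer records and hand them to `send` when full or when too old."""

    def __init__(self, send, max_records=100, max_age_s=60.0,
                 clock=time.monotonic):
        self._send = send              # callback that receives a list of records
        self._max_records = max_records
        self._max_age_s = max_age_s
        self._clock = clock            # injectable so the timer can be tested
        self._records = []
        self._oldest = None            # time the oldest buffered record arrived

    def add(self, record):
        if not self._records:
            self._oldest = self._clock()
        self._records.append(record)
        self.maybe_flush()

    def maybe_flush(self):
        """Flush if the threshold or the timer has been reached."""
        if not self._records:
            return False
        full = len(self._records) >= self._max_records
        stale = self._clock() - self._oldest >= self._max_age_s
        if full or stale:
            self.flush()
            return True
        return False

    def flush(self):
        if self._records:
            batch, self._records = self._records, []
            self._oldest = None
            self._send(batch)
```

A device would call `add` as records are produced and call `maybe_flush` periodically, so that a quiet period still results in the buffered records being dispatched once they reach the age limit.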
