Collecting Data for Analysis

As we discuss the various offerings for data analysis, a key consideration is how these products collect their data. The collection method has a significant impact not only on which metrics are available for analysis, but also on the analysis host's placement within the network and its resource requirements. What follows is a brief explanation of the most common data-collection methods, along with some of their strengths and weaknesses.

Sniffing is one of the simplest methods of collecting data. Without any special configuration, sniffing means listening to all network traffic as it passes through the segment the host system is connected to. This technique is typically the most robust, because a host that sniffs the traffic can see every single packet on that segment. What is done with all this data is up to the analysis engine, but the point here is that you are not grabbing select pieces of information; you are collecting all the data and then sorting through it. This method is more processor intensive than most other methods, especially when there is a high volume of traffic.
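To make "collecting all the data and then sorting through it" more concrete, the sketch below decodes the fixed 20-byte portion of an IPv4 header from raw captured bytes. It assumes the capture step has already happened and that the buffer starts at the IP header; the function name parse_ipv4_header and the sample packet are made up for illustration and are not taken from any particular analysis product.

```python
import socket
import struct


def parse_ipv4_header(packet: bytes) -> dict:
    """Decode the fixed 20-byte part of an IPv4 header (options are ignored)."""
    if len(packet) < 20:
        raise ValueError("packet too short for an IPv4 header")
    # Network byte order: version/IHL, ToS, total length, identification,
    # flags/fragment offset, TTL, protocol, checksum, source, destination.
    (ver_ihl, tos, total_len, _ident, _flags_frag,
     ttl, proto, _checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", packet[:20])
    return {
        "version": ver_ihl >> 4,
        "header_len": (ver_ihl & 0x0F) * 4,  # IHL counts 32-bit words
        "tos": tos,
        "total_len": total_len,
        "ttl": ttl,
        "protocol": proto,  # 6 = TCP, 17 = UDP
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }


# Hypothetical sample: a hand-built header standing in for a sniffed packet.
SAMPLE = struct.pack(
    "!BBHHHBBH4s4s",
    0x45, 0, 40, 1, 0, 64, 6, 0,
    socket.inet_aton("192.168.1.10"),
    socket.inet_aton("10.0.0.5"),
)

if __name__ == "__main__":
    print(parse_ipv4_header(SAMPLE))
```

A real sniffer runs a step like this for every packet it sees, which is one reason the method becomes processor intensive as traffic volume grows.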

This method also requires precise placement of the host that will be collecting data, because it has no way to see the data unless it passes through the segment the host is on. Because of this, the physical location of the data collection system will likely be dictated by the network topology and location of traffic you wish to analyze. Besides resource requirements, the biggest drawback to this method is that it will collect data at the network level, with no regard for product-specific metrics. Although some analysis platforms can attempt to remedy this and perform analysis on some higher-level information contained in the packet, you will not be able to get the same level of upper-layer information as you will with the other methods.

SNMP is a protocol designed specifically to accommodate the management of network-enabled devices. Although this management can include making changes, in a data analysis context SNMP is really only used to retrieve information. When used this way, a network host requests certain information from the SNMP-enabled device, which then sends the desired metrics in response. Alternatively, the SNMP device can be configured to send the metrics as a sort of alarm when they surpass a configured threshold. The information collected is limited in that it is very focused. You can only ask a device for the specific statistics that it supports. While sniffing collects network-layer data, SNMP can collect higher-layer, product-specific data that sniffing would not easily be able to gather. Examples of product-specific counters are the currentAnonymousUsers and currentNonAnonymousUsers values from an IIS 6 server. Building the logic for a sniffer to track each connection to the IIS server and determine whether that connection used authentication would be very burdensome. Instead, SNMP can provide these metrics directly from the IIS server, which is already tracking them.
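The two retrieval styles described above, asking for a metric and being alerted when a threshold is crossed, can be modeled in a few lines. The sketch below is a toy stand-in, not an SNMP implementation: the ToyAgent class, its method names, and the threshold value are all hypothetical, and no SNMP packets are sent. Only the counter name comes from the text.

```python
class ToyAgent:
    """Stand-in for an SNMP-enabled device; it does not speak real SNMP."""

    def __init__(self, metrics, thresholds=None):
        self.metrics = dict(metrics)
        self.thresholds = dict(thresholds or {})

    def get(self, name):
        """Poll model: the collector asks for one named metric."""
        if name not in self.metrics:
            # A device can only answer for the statistics it supports.
            raise KeyError(f"metric not supported by this device: {name}")
        return self.metrics[name]

    def update(self, name, value):
        """Alarm model: return an alert when the new value surpasses its threshold."""
        self.metrics[name] = value
        limit = self.thresholds.get(name)
        if limit is not None and value > limit:
            return {"metric": name, "value": value, "threshold": limit}
        return None


if __name__ == "__main__":
    # Hypothetical values; the threshold of 100 is an assumption.
    agent = ToyAgent({"currentAnonymousUsers": 12}, {"currentAnonymousUsers": 100})
    print(agent.get("currentAnonymousUsers"))
    print(agent.update("currentAnonymousUsers", 150))
```

In the polling case the collector decides when to ask; in the alarm case the device decides when to speak, which is why the two styles generate very different traffic patterns.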

SNMP can also be a chatty protocol in a large environment, contributing to network congestion. In a small environment this may not be an issue, but it's something to be aware of. The primary benefit of SNMP is that you do not need to place your data collector in the path of the data. You can place the system anywhere, and it will reach out and poll the devices (using a Get) for the desired data points. You can also configure an SNMP-enabled device to send the metrics to a collector when they reach a preconfigured threshold (via a Trap). SNMP and sniffing provide different information, which enables the two to complement each other's capabilities. Both require forethought and planning to implement, mainly because each collects its statistics differently.
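Polled values are often running counters rather than rates, so a collector usually has to compare two successive polls. The sketch below assumes the polled object is a 32-bit octet counter (an interface byte counter, for example) that wraps back to zero after reaching 2^32 - 1, and that at most one wrap occurs between polls. The function names and the sample numbers are illustrative only.

```python
COUNTER32_MODULUS = 2 ** 32  # a 32-bit counter wraps after 4294967295


def counter_delta(previous: int, current: int,
                  modulus: int = COUNTER32_MODULUS) -> int:
    """Difference between two polls, tolerating a single counter wrap."""
    return (current - previous) % modulus


def bits_per_second(previous: int, current: int, interval_seconds: float) -> float:
    """Average throughput between two polls of an octet (byte) counter."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return counter_delta(previous, current) * 8 / interval_seconds


if __name__ == "__main__":
    # Hypothetical polls taken 60 seconds apart.
    print(bits_per_second(0, 1500, 60))
    # The counter wrapped between these two polls.
    print(counter_delta(4294967290, 10))
```

Shortening the polling interval gives finer-grained rates but adds to the chattiness mentioned above, so the interval is a trade-off rather than a fixed rule.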

NetFlow is a protocol designed specifically for collecting network traffic statistics. NetFlow is primarily supported on Cisco devices, but some other manufacturers implement similar technologies, which exhibit varying levels of interoperability. NetFlow is similar in behavior to SNMP traps in that once a NetFlow-enabled device has been configured, it sends traffic statistics back to the data collector. The difference is that while SNMP targets very specific metrics that must be supported by the SNMP device, NetFlow targets a very small subset of network traffic data. NetFlow gathers information based on source and destination IP address, source and destination port number, the protocol being used, the type of service settings, and the device interface. These metrics lend themselves to gathering data on bandwidth utilization and network top talkers. This may sound like just what the doctor ordered; however, NetFlow is not supported on all devices, particularly the more economical models. This may make NetFlow a less viable option for data collection in a small networking environment. If you do have network devices that can support it, a little research would be advisable to see if you can take advantage of NetFlow data. You can read more about NetFlow here: www.cisco.com/en/US/products/ps6601/products_ios_protocol_group_home.html.
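As a rough illustration of how those fields lead to a "top talkers" report, the sketch below sums bytes per source address over a handful of flow records that have already been received and decoded. The FlowRecord type, its field names, and the sample values are hypothetical and do not reflect the NetFlow export format itself; the byte_count field is an assumption added so there is something to total.

```python
from collections import Counter
from typing import Iterable, List, NamedTuple, Tuple


class FlowRecord(NamedTuple):
    """Hypothetical, already-decoded flow record."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int   # 6 = TCP, 17 = UDP
    tos: int        # type of service
    interface: int  # device interface index
    byte_count: int  # assumed per-flow byte total


def top_talkers(records: Iterable[FlowRecord], n: int = 3) -> List[Tuple[str, int]]:
    """Return the n source addresses that sent the most bytes."""
    totals = Counter()
    for record in records:
        totals[record.src_ip] += record.byte_count
    return totals.most_common(n)


# Made-up sample data for illustration.
SAMPLE_FLOWS = [
    FlowRecord("10.0.0.1", "10.0.0.9", 51000, 80, 6, 0, 1, 5000),
    FlowRecord("10.0.0.2", "10.0.0.9", 51001, 443, 6, 0, 1, 1200),
    FlowRecord("10.0.0.1", "10.0.0.8", 51002, 53, 17, 0, 2, 300),
]

if __name__ == "__main__":
    for address, total in top_talkers(SAMPLE_FLOWS, n=2):
        print(address, total)
```

Grouping by interface or destination port instead of source address would give a bandwidth-per-link or per-service view from the same records.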

NOTE: RMON stands for remote monitoring, which is yet another network management protocol. RMON is a standard, described in RFC 2819, which uses SNMP for its underlying functionality and an extensive set of new MIB objects for its data collection. Because it uses SNMP, RMON is vendor neutral, and RMON also takes steps to reduce network traffic where possible. While RMON support on enterprise-class network analysis devices is good, it is virtually non-existent on free network analysis solutions.
