NextGig Systems, Inc. - Network Connectivity & Test Solutions

How to Build a Huge Virtualized SpyNet

Gigamon - Intelligent Data Access Networking

Called by some the ultimate networking challenge of their careers, building InteropNet from scratch in four days is an awesome feat. Here's how a large temporary enterprise network is constructed. Using a virtualized SpyNet, multiple tools are added one at a time so that no critical packet is ever lost and no tool is ever oversubscribed.

As much as CES symbolizes consumer electronics, today Interop (previously NetWorld+Interop) represents state-of-the-art enterprise data communication and networking. Interop, however, is much more than a mega-show to display promising technology. For the past 20 years, Interop has been the event-of-choice for network engineers and CIOs who travel from around the world to witness what works and what doesn't, and a place for equipment vendors to show off.

The show was originally named "Interop" precisely to portray the "interop"erability demanded of the latest generation of technology--non-trivial in the beginning, since equipment from different vendors often didn't play well together. Over the years, companies big and small have succeeded or failed for the most part based on their performance at Interop. Enduring technology emerged and large sums of money were gained or lost. In retrospect, Interop uniquely and single-handedly contributed to the successful realization of the multi-vendor enterprise networks that we take for granted today.

Visitors traveling to Interop expect to see newly released products working under fire. The vehicle at Interop for competing vendors to show off performance and reliability is the mission-critical InteropNet (once called ShowNet, Event Network or eNet), touted as the world's largest temporary enterprise network. For years, hundreds of handpicked volunteers (the NOC team) have come together to take on the ultimate networking challenge of their careers: to build InteropNet from scratch in four days. As one veteran of the process said, "So cutting edge, we are still bleeding years later."

InteropNet and SpyNet
InteropNet is actually three networks in one. The primary network is the production network that supports hundreds of vendor booths and classrooms, starting with two redundant links to the Internet, two routers, two firewalls and two 10G core switches. Connectivity to the rest of the show is provided by a fault-tolerant 10G fiber ring that stitches together eight wiring racks distributed across the show floor. Each rack contains additional switching gear to provide connectivity to the individual booths.

The other two networks are less visible but no less important. They are "out-of-band" in the sense that they provide alternative data paths unobtrusive to the production (in-band) traffic. They are also completely sealed off from the outside world for absolute security.

The first out-of-band network is the "Management" network, otherwise known as Access Ether, which allows network engineers to quickly communicate with the various pieces of networking gear even if the production network is under severe attack and performance is highly compromised. The second out-of-band network, which is the focus of this "How-to" article, is the "Monitoring" network, otherwise known as the Spy Network or SpyNet.

This secondary network allows the NOC team to backhaul "replica" traffic through a completely separate overlay infrastructure, giving them great flexibility and fidelity in non-intrusive network monitoring. SpyNet lets the team change and customize monitoring traffic on the fly, without altering the configuration of the production network during the three-day show, and it eliminates any possibility of overloading the already well-utilized production network.

In the early years of InteropNet, SpyNet was just another physical network: a parallel cable plant that gave engineers a separate media-level link to any part of the network. SpyNet was deemed necessary because time-to-resolution must be near immediate and because InteropNet is physically spread out. With SpyNet, the team could see MAC layer errors and traffic from anywhere without leaving the NOC.

SpyNet was a revolutionary concept when it was introduced and today it is accepted as an industry Best Practice. Although with modern switches, one can create VLANs to accomplish similar functions, SpyNet still provides far more flexibility in unobtrusive data access and moreover, since the monitoring traffic does not need to go through the same trunk links as the user traffic, it is considered highly fault tolerant (an important feature for mission-critical networks such as the InteropNet).

SpyNet and virtualization
Today's SpyNet is no longer just a physical network. In fact, SpyNet has evolved together with advances made in the in-band network. Technology deployed in SpyNet has become every bit as complex as the switches/routers being monitored. In short, SpyNet has been completely virtualized.

There are two reasons why virtualization is important for SpyNet. One is simply that the production network itself has been virtualized. In the industry's early days, enterprise network topology consisted of large broadcast domains separated by a few routed connections. Such a flat architecture was very easy to diagnose with a traditional protocol analyzer. All one needed to do was plug a sniffer into a hub. Today's network is entirely switched and is a converged medium, meaning that multiple applications share the same physical conduit, which makes it exceedingly difficult to monitor and troubleshoot.

SpyNet evolved so that today's SpyNet is no longer just an automated patch panel. While any-to-any cross connect is still an important function, SpyNet must now provide many packet-aware functionalities (such as many-to-any aggregation, any-to-many multicasting, packet filtering, flow-mapping and load-sharing) so that instead of simple physical connections, tools are now connected virtually to the SpyNet.

Instead of receiving packets from a single access point such as a SPAN port or a tap, tools now receive packets that belong to a multi-link trunk (i.e., to get a Big Pipe view), a VLAN or a "flow" (e.g., all packets that possess the same IP address pairs and application port number so that they belong to a single end-to-end transaction or VoIP conversation).
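The notion of a "flow" described above can be sketched in a few lines of Python. This is only an illustration of the grouping rule (same IP address pair and application port); the `Packet` record and the function names are assumptions made for the example and do not describe how any data-access switch is actually implemented.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    """Hypothetical, simplified packet record (not a real capture format)."""
    src_ip: str
    dst_ip: str
    app_port: int
    size: int


def flow_key(pkt: Packet) -> tuple:
    """Return a direction-independent key: the IP address pair plus the application port."""
    # Sorting the two addresses puts both directions of a conversation in the same flow.
    return (tuple(sorted((pkt.src_ip, pkt.dst_ip))), pkt.app_port)


def group_by_flow(packets):
    """Group packets into flows keyed by flow_key()."""
    flows = defaultdict(list)
    for pkt in packets:
        flows[flow_key(pkt)].append(pkt)
    return dict(flows)
```

A tool that asks for a single VoIP conversation would then be handed only the packets stored under that conversation's key, regardless of which physical tap or SPAN port each packet arrived on.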

SpyNet must deliver any desired payloads by collecting traffic from different physical parts of a redundant, asymmetrically routed and load-balanced network. In summary, SpyNet is now a "virtualization" layer between the production network and the out-of-band monitoring tools such that not just any packets, but any logical slice of arbitrarily aggregated traffic can be delivered from anywhere to any tool, at any time.

The second reason for the need to virtualize SpyNet is the fact that there are now many monitoring tools, each requiring custom data access, each competing for scarce resources such as SPAN ports and taps, and each having a potential mismatch between available processing capability and bandwidth requirement.

This year at Interop Las Vegas 2006, five vendors provided fourteen pieces of monitoring equipment for the SpyNet and one vendor provided the data-access switch, which is the infrastructure building block for the virtualized SpyNet.

Figure 1 shows the actual equipment deployed as part of the SpyNet at Interop Las Vegas 2006, which included the data-access switch from Gigamon, troubleshooting tools from Fluke, security tools from Extreme and Juniper, application tools from Network Physics, forensics tools from Network General and an optimization tool from Internap.

Figure 1. Equipment at Interop 2006

As with the original SpyNet, troubleshooting is still the most important reason for out-of-band monitoring. Multiple protocol analyzers are deployed. The data-access switch is configured on the fly to quickly deploy any given tool to any given spot on the network whenever trouble arises during the show (including equipment deployed in remote locations off the show floor). In addition, traffic from four different tap points (eight streams total, using internal tap modules of the data-access switch) is aggregated so that tools can monitor traffic both before and after the redundant firewalls. Finally, hardware filters must be deployed if there is a need to inspect only VoIP traffic or to drill down on a particular traffic type (e.g., HTTP) or a particular VLAN.

There are two kinds of security tools. One kind, such as an IDS (Intrusion Detection System), guards against external attacks; the other guards against internal abuses (e.g., worms and viruses inadvertently delivered from portable computers carried by traveling salesmen). One IDS is connected so that it receives traffic from the two taps inside the firewall (as a last line of defense behind the firewall), and the other IDS receives traffic from the 10G mirroring port of one of the two core switches (using the data-access switch to downshift from 10G to 1G).

The internal security tools (there are four) are more interested in internal abuses and receive traffic from the 1G mirroring ports of the eight access switches inside the eight racks. Using the data-access switch to aggregate and to load-share the aggregated traffic, multiple 1G tools can each receive a logical slice of the multi-Gigabit traffic such that each tool receives nominally 25% of the total traffic originating from any two racks (each of which is assigned its own subnet).
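The load-sharing arrangement above (eight rack subnets spread over four tools, two racks per tool) can be sketched as a simple subnet-to-tool map. The subnets and tool names below are placeholders, since the article does not list the real ones, and the round-robin assignment is just one assumed way of pairing racks with tools.

```python
import ipaddress

# Hypothetical addressing: one /24 per rack; the article does not give the real subnets.
RACK_SUBNETS = [ipaddress.ip_network(f"10.0.{rack}.0/24") for rack in range(1, 9)]
TOOLS = ["tool-1", "tool-2", "tool-3", "tool-4"]  # placeholder names


def build_share_map(subnets, tools):
    """Assign rack subnets to tools round-robin, so each tool gets an equal number of racks."""
    return {subnet: tools[i % len(tools)] for i, subnet in enumerate(subnets)}


def tool_for(src_ip, share_map):
    """Return the tool that should receive a packet with this source address, or None."""
    addr = ipaddress.ip_address(src_ip)
    for subnet, tool in share_map.items():
        if addr in subnet:
            return tool
    return None


SHARE_MAP = build_share_map(RACK_SUBNETS, TOOLS)
```

With eight subnets and four tools, each tool is mapped to exactly two racks. The 25% share is nominal because it assumes every rack carries roughly the same amount of traffic.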

Figure 2 shows the NOC, which houses the equipment. The two identical racks on the left and center contain the routers (blue), Gigamon data-access switches (orange), firewalls (blue), and 10G core switches (purple). The rack on the right is the SpyNet rack, which contains fourteen monitoring tools: Juniper (blue), Internap (blue), Network Physics (black), Extreme (purple), Fluke (blue) and Network General (black and green).

Figure 2. Housing the equipment at Interop 2006

Application response time monitoring is surprisingly important. Inappropriate use of BitTorrent or similar P2P applications consumes unreasonable amounts of bandwidth, leading to unjustified complaints that the network is slow. A number of application probes are deployed at the show, using the data-access switch to aggregate and logically map traffic from critical points throughout the network.

A new class of troubleshooting tools with off-line data storage capability is deployed at the show, allowing the NOC engineers to replay past events and attacks for forensic analysis (much like a TiVo). The data-access switch is used to customize connectivity for each of the three data recording tools (including downshifting from 10G to 1G and packet filtering).

With InteropNet connected to the Internet using two redundant high-speed links from two different providers, there is a need for an optimization tool whose primary function is to balance traffic between the two links. To avoid unacceptable and costly downtime, the tool ensures that InteropNet is available even when one provider is completely down or performing poorly. Since this is a 10G tool, the data-access switch aggregates from multiple 1G links and up-shifts to 10G to provide custom connectivity.

Figure 3 shows the port assignment and the diverse connectivity between the various network access points (taps and SPAN ports) and the connecting tools, which together completely consume the 40 ports available on the two data-access switches (interconnected using a 10G GigaLINK to provide a contiguous switch fabric). For simplicity, the bit-mask packet filters used to customize traffic for each tool are not shown.
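For readers unfamiliar with bit-mask filtering, the general idea is to compare a header field against a pattern only on the bits selected by a mask. The sketch below is a generic illustration in Python; it is an assumption for explanatory purposes and does not reflect the actual filter syntax or hardware of the data-access switch used at the show.

```python
def mask_match(field: int, pattern: int, mask: int) -> bool:
    """Return True when field equals pattern on every bit that is set in mask."""
    return (field & mask) == (pattern & mask)


# In an 802.1Q tag, the low 12 bits of the 16-bit tag control field carry the VLAN ID.
VLAN_ID_MASK = 0x0FFF


def in_vlan(tag_control: int, vlan_id: int) -> bool:
    """Check the VLAN ID while ignoring the priority bits in the upper part of the field."""
    return mask_match(tag_control, vlan_id, VLAN_ID_MASK)
```

The same `mask_match` helper covers subnet-style matches as well: comparing a 32-bit IPv4 address against a pattern under the mask `0xFFFFFF00` selects every address in one /24.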

Figure 3. Port assignment and connectivity

In summary, at Interop Las Vegas 2006, a number of sponsors provided a collection of Best Practice, Best-of-Breed monitoring solutions to SpyNet, each delivering a specific and complementary function to protect, analyze and optimize the mission-critical InteropNet. What's different about this year is that SpyNet has evolved into a virtualized network, providing a virtualized "data-socket" for multiple monitoring tools and allowing each tool to receive a customized logical slice of the total traffic that suits its monitoring needs.

As with any virtualized network, SpyNet can accommodate moves, adds and changes without requiring a truck roll or any physical change or impact to the mission-critical production network. Moreover, with a virtualized SpyNet, multiple monitoring tools performing the same or dissimilar functions can be added one at a time to match the growing bandwidth requirement, each time getting a finer slice of the total traffic, such that no critical packet is ever lost and no tool is ever oversubscribed.
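The sizing logic behind "adding tools one at a time" can be put as simple arithmetic: keep adding tools until each tool's even share of the aggregate fits within its capacity. The Python sketch below is a back-of-the-envelope illustration only; the function names and the bandwidth figures in the usage note are assumptions, not measurements from the show.

```python
import math


def tools_needed(total_gbps: float, tool_capacity_gbps: float) -> int:
    """Smallest number of tools such that an even split does not exceed any tool's capacity."""
    if total_gbps < 0 or tool_capacity_gbps <= 0:
        raise ValueError("traffic must be non-negative and capacity must be positive")
    return max(1, math.ceil(total_gbps / tool_capacity_gbps))


def share_per_tool(n_tools: int) -> float:
    """Nominal fraction of the aggregate each tool receives under even load-sharing."""
    if n_tools < 1:
        raise ValueError("at least one tool is required")
    return 1.0 / n_tools
```

For example, a hypothetical 3.5 Gbps aggregate monitored by 1 Gbps tools would call for four tools, each nominally receiving 25% of the traffic. If the aggregate later grew to 4.5 Gbps, adding a fifth tool would drop each share to 20%.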

About Gigamon Systems

Founded in 2003 by six veterans of network monitoring and telecommunications equipment companies, Gigamon Systems is the inventor and leading provider of Data-Access Switches. Its flagship product, GigaVUE®, can multicast packets from one span or tap to many tools to solve the span port sharing problem. It also can aggregate and intelligently filter packets from many spans or taps to one or multiple tools to solve the problem of monitoring flows across complex mesh topologies and virtual networks. GigaVUE® facilitates unobtrusive parallel tool deployment with network-wide coverage, significantly reducing customers’ capital budgets and yielding immediate ROI benefits.

For more information about Gigamon Data Access Switches please contact us here.

