UW-Madison Napster Traffic Measurement

This document contains information on how institutions can monitor Napster traffic, based upon what we have done at the University of Wisconsin - Madison. It is available here: http://net.doit.wisc.edu/data/Napster/

Dave Plonka

Introduction

Let's start with a bit of eye candy. Below is an example graph of our campus traffic. The sample was taken on Thursday, March 9, 2000, from about 5 PM through midnight.

As you can see, during that time period, our Campus Napster users were responsible for an amount of traffic which rivals both general web traffic (HTTP) and file transfer traffic (FTP).

Table of Contents

UW-Madison Napster Traffic Measurement
    Introduction
    Table of Contents
    Background
    To Block or not to Block?
    Implementation
    Summary
    References

Background

Institutions such as ours have struggled to characterize how our network connections to the outside world are being used. Recently, a new kind of "Killer Application" has arrived: collaborative or sharing applications that turn our internet users, who have traditionally been primarily consumers of information, into providers. In other words, applications such as Napster [ http://www.napster.com/ ] effectively convert those users' machines into internet servers, doling out lots of content, such as mp3 music, to other Napster users.

Napster, as you may well know, has recently gotten lots of attention. Some institutions, such as ISPs, universities, and colleges, have attempted to restrict Napster use, claiming that it was consuming as much as 40, 50, or even 60 percent of their available bandwidth to the rest of the world.

Our network engineers, who are responsible for the Campus backbone network and work closely with the state-wide network WiscNet, have wondered how Napster is affecting our bandwidth utilization. How do we know what load this sort of application imposes when all we have is hearsay? Where's the real data that shows the impact of this application? That is what we've sought to investigate.

To Block or not to Block?

If you're curious about our current policy regarding Napster, please see a news item about Napster, http://www.doit.wisc.edu/news99/newsitem.cfm?filename=292, or our Appropriate Use Policy, http://www.doit.wisc.edu/appro.htm.

I've enhanced our existing real-time traffic monitoring system, FlowScan, in an effort to determine Napster's impact without interfering with it. That is, I didn't want to introduce a heisenbug [ http://www.tuxedo.org/~esr/jargon/html/entry/heisenbug.html ] into the measurement and analysis by interrupting the users' ability to use a given application and then jumping to the conclusion that the bandwidth saved during the experiment must represent the load induced by that application. That method of measurement is questionable, since interfering with the service changes the users' behavior, and therefore the application's.

Regardless of the impact of applications such as Napster, blocking them is likely to be only a very short-term solution. Interfering with Napster will immediately cause (at least) two things:

  1. A subset of the users to scramble to work around the blockade. There is a lot of info available on the web about how to work around blocking Napster using proxies and alternative servers. There are alternative, free implementations of the server-side.

    Basically, by irritating our users, we could accelerate the forthcoming bandwidth utilization problems that these sorts of applications will cause. Our campus, with its big pipes to the outside world, would be a prime candidate for hosting these "rogue" alternative servers.
     

  2. Our ability to monitor and track Napster will be stymied, because users will change from the default configuration, which helps us to measure the application's effect more confidently.

Quality-of-Service and differentiated services features in the networking infrastructure are probably how network administrators and engineers will be dealing with such impact in the near future. Another option is to charge for the bandwidth used, but until we provide users with a way to visualize their impact, misunderstandings are likely to make this option poorly received. That is, it's too easy to use lots of bandwidth "accidentally".

Implementation

Identifying Napster traffic is a bit difficult. It is something of a moving target because the identity of the servers can change, and the user is able to reconfigure the application from its default ports. We hypothesize that similar peer-to-peer sharing applications will become just as popular in the near future, and Napster provides us with lots of data to examine in order to understand how to prepare for this change in internet usage. As an example, I will give a simplified overview of how our FlowScan tool identifies Napster traffic and estimates its impact on our network.

FlowScan watches for traffic from Campus machines to the Napster.com servers. When it sees this, it remembers the identities (IP addresses) of the server in the outside world and the client on our campus, and also remembers the time at which the traffic was observed. We call the identified server a "NapServer" and the campus machine with which it interacted a "NapUser". Subsequently, when FlowScan sees traffic between a machine in the outside world and a NapUser, it decides, based on some rules about protocols, ports, and packet sizes, whether or not that machine is a remote NapUser, and therefore whether this traffic represents the passing of data between Napster application users. Byte and packet counts for both the traffic between NapServer and NapUser and the traffic between a NapUser and a remote NapUser are maintained, graphed in near-real-time, and presented on our NetStats web site. After a period of time (e.g. 30 minutes), NapUsers are "retired" if they have not since talked with a NapServer. This reduces the likelihood of FlowScan misidentifying unrelated traffic as Napster traffic.
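The bookkeeping described above can be sketched roughly as follows. This is a minimal illustration in Python, not FlowScan's actual code: the names Flow, NapsterTracker, and classify are hypothetical, the set of server addresses is assumed to be known in advance, and the peer rule (TCP with a large average packet size) is only a placeholder for the real rules about protocols, ports, and packet sizes, which are not given here. The addresses in the usage lines are documentation-only examples.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Flow:
    """One flow record, reduced to the fields this sketch needs."""
    time: float    # observation time, in seconds
    src: str       # source IP address
    dst: str       # destination IP address
    proto: int     # IP protocol number (6 = TCP)
    packets: int
    octets: int


class NapsterTracker:
    def __init__(self, campus_net, server_addrs, retire_after=30 * 60,
                 min_avg_packet=500):
        self.campus = ipaddress.ip_network(campus_net)
        self.server_addrs = set(server_addrs)  # assumption: known in advance
        self.retire_after = retire_after       # seconds (30 minutes, as in the text)
        self.min_avg_packet = min_avg_packet   # placeholder threshold, in bytes
        self.nap_servers = set()               # servers actually seen in use
        self.nap_users = {}                    # campus IP -> time of last NapServer contact
        self.totals = {"server": [0, 0], "peer": [0, 0]}  # [octets, packets]

    def _on_campus(self, addr):
        return ipaddress.ip_address(addr) in self.campus

    def _looks_like_transfer(self, flow):
        # Placeholder rule (an assumption, not the real FlowScan rules):
        # TCP traffic whose packets are large on average.
        return (flow.proto == 6 and flow.packets > 0
                and flow.octets / flow.packets >= self.min_avg_packet)

    def classify(self, flow):
        """Return "server", "peer", or None, and update the running totals."""
        src_in, dst_in = self._on_campus(flow.src), self._on_campus(flow.dst)
        if src_in == dst_in:
            return None  # not a campus <-> outside-world flow
        local, remote = (flow.src, flow.dst) if src_in else (flow.dst, flow.src)

        # Retire a NapUser that has not talked to a NapServer recently.
        last = self.nap_users.get(local)
        if last is not None and flow.time - last > self.retire_after:
            del self.nap_users[local]

        if remote in self.server_addrs:
            if src_in:  # campus machine contacting a server: remember both
                self.nap_servers.add(remote)
                self.nap_users[local] = flow.time
            kind = "server"
        elif local in self.nap_users and self._looks_like_transfer(flow):
            kind = "peer"  # remote machine is presumed to be a remote NapUser
        else:
            return None

        self.totals[kind][0] += flow.octets
        self.totals[kind][1] += flow.packets
        return kind


tracker = NapsterTracker("198.51.100.0/24", {"192.0.2.10"})
for f in [
    Flow(0, "198.51.100.7", "192.0.2.10", 6, 10, 600),          # login to a server
    Flow(60, "203.0.113.5", "198.51.100.7", 6, 100, 140000),    # transfer with a peer
    Flow(4000, "203.0.113.5", "198.51.100.7", 6, 100, 140000),  # after retirement
]:
    print(f.time, tracker.classify(f))
```

Keeping only the time of the last server contact per campus address makes retirement a single comparison per flow, which is why the third flow in the usage lines is no longer counted once the timeout has passed.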

If you or your institution would like to monitor your network traffic similarly, you can implement it using our tools. The software components are all Free Software and can be obtained using the reference URLs below. Other than obtaining the software components, this method requires you to be using a Cisco router at your network border. Of course you'll need someone with software and network engineering talent, but what else is new.

Summary

In the near future, dealing with applications like Napster that have a dramatically different profile of bandwidth utilization will be a challenge for network engineers. New network Quality-of-Service features and wire-rate layer 3 and layer 4 switching have the potential to be a big help as we move to prioritizing and differentiating amongst traffic types. Options utilizing legacy routers may also exist, such as intentionally introducing an asymmetric route so that outbound traffic travels over a band-limited path. This would mimic the behavior of asymmetric consumer offerings such as ADSL and Cable modems, which discourage, but do not wholly prevent, end-users from introducing unexpected traffic patterns by adding servers to the network. For the time being, the advantage would be that the user population will remain information consumers rather than providers, which is likely to be the model for which the network infrastructure was designed.

References

  1. Napster, Inc.
    http://www.napster.com
  2. Unofficial Napster FAQ
    http://napster.cjb.net/
  3. OpenNap -- Open Source Napster Server (a Napster application alternative)
    http://opennap.sourceforge.net/
  4. Napster protocol information
    http://david.weekly.org/code/napster.php3
  5. How To Access Napster When It's Blocked
    http://david.weekly.org/code/napster-proxy.php3
  6. A description of Mirc: another "sharing" application
    http://david.weekly.org/writings/mirc.php3
  7. NetStats - UW-Madison's NetStats web page
    http://wwwstats.net.wisc.edu/
  8. FlowScan - UW-Madison's network monitoring tool
    http://net.doit.wisc.edu/~plonka/FlowScan/
  9. CAIDA's network monitoring tool used by FlowScan
    http://www.caida.org/Tools/Cflowd/
  10. RRDTOOL, Round Robin Database used by FlowScan
    http://ee-staff.ethz.ch/~oetiker/webtools/rrdtool/
  11. Cisco's NetFlow feature on which Cflowd relies
    http://www.cisco.com/warp/public/732/netflow/
  12. Free Software Philosophy - The Free Software Foundation / GNU Project
    http://www.gnu.org/

Copyright 2000