Show Me the Data

One of my friends recently pointed me to this post about network data. The author states that one of the things he will miss the most about working at Google is the access to the tremendous amount of data that the company collects.

Although I have not worked at Google and can only imagine the treasure trove their employees must have, I spent time with lots of sensitive data during a summer at AT&T Research Labs.  At AT&T, we had (and researchers presumably still have) access to a fount of data, ranging from router configurations to routing table dumps to traffic statistics of all kinds.  I found having direct access to this kind of data tremendously valuable: it allowed me to “get my hands dirty” and play with data as I explored interesting questions that might be hiding in the data itself.  During that summer, I developed a taste for working on real, operational problems.

Unfortunately, when one retreats to the ivory towers, one cannot bring the data along for the ride.  Sitting back at my desk at MIT, I realized there were a lot of problems with network configuration management and wanted to build tools to help network operators run their networks better.  One of these tools was the “router configuration checker” (rcc), which has been downloaded and used by hundreds of ISPs to check their routing configurations for various kinds of errors.  The road to developing this tool was tricky: it required knowing a lot about how network operators configure their networks and, more importantly, having direct access to network configurations on which to debug the tool.  I found myself in a catch-22 situation: I wanted to develop a tool that was useful for operators, but I needed operators to give me data to develop the tool in the first place.

My most useful mentor at this juncture was Randy Bush, a research-friendly operator who told me something along the following lines: Everyone wants data, but nobody knows what they’re going to do with it once they get it.  Help the operators solve a useful problem, and they will give you data.

This advice could not have been more sage.

I went to meetings of the North American Network Operators Group (NANOG) and talked about the basic checks I had managed to bootstrap into some scripts using data I had from MIT and a couple of other smaller networks (basically, enough to test that the tool worked on Cisco and Juniper configurations).  At NANOG, I met a lot of operators who seemed interested in the tool and were willing to help.  Often they would not provide me with their configurations, but they would run the tool for me and tell me the output (and whether or not the output made sense).  Guy Tal was another person to whom I owe a lot of gratitude for his patience in this regard.  Sometimes, I got lucky and even got hold of some configurations to stare at.

Before I knew it, I had a tool that could run on large Internet Service Provider (ISP) configurations and give operators meaningful information about their networks, and hundreds of ISPs were using the tool.  And, I think that when I gave my job talk, people from other areas may not have understood the details of “BGP”, or “route oscillations”, or “route hijacks”, but they certainly understood that ISPs were actually using the tool.

We applied the same approach when we started working on spam filtering.  We wrote an initial paper that studied the network-level behavior of spammers with some data we were able to collect at a local “spam trap” on the MIT campus (more on that project in a later post).  The visibility of that work (and its unique approach, which spawned a lot of follow-on work) allowed us to connect with people in industry who were working on spam filtering, had real problems that needed solving, and had data (and, equally importantly, expertise) to help us think about the problems and solutions more clearly.

In these projects (as well as other more recent ones), I see a pattern in how one can get access to “real data”, even in academia.  Roughly, here is some advice:

  • Have a clear, practical problem or question in mind. Do not simply ask for data.  Everyone asks for data.  A much more select set is actually capable of doing something useful with it.  Demonstrate that you have given some thought to questions you want to answer, and think about whether anyone else might be interested in those questions.  Importantly, think about whether the person you are asking for data might be interested in what you have to offer.
  • Be prepared to work with imperfect data. You may not get exactly the data you would like.  For example, the router configurations or traffic traces might be partially anonymized.  You may only get metadata about email messages, as opposed to full payloads.  (And so on.)  Your initial reaction might be to think that all is lost without the “perfect dataset”.  This is rarely the case!  Think about how you can adjust your model or adapt your approach (or even the question itself) to work with imperfect data.
  • Be prepared to operate blindly. In many cases, operators (or other researchers) cannot give you raw data that they have access to; often, data may be sensitive, or protected by non-disclosure agreements.  However, these people can sometimes run analysis on the data for you, if you are nice to them, and if you write the analysis code in a way that they can easily run your scripts.
  • Bring something to the table. This goes back to Randy Bush’s point. If you make yourself useful to operators (or others with data), they will want to work with you—if you are asking an interesting question or providing something useful, they might be just as interested in the answers as you are.
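To make the "operate blindly" advice above concrete, here is a minimal sketch of an analysis script written so that an operator can run it locally and hand back only aggregates.  Everything in it is an assumption made for illustration: the record fields (`src_ip`, `bytes`), the function names, and the choice of a keyed hash to pseudonymize addresses are hypothetical, not a description of any particular tool or dataset.

```python
import hashlib
import hmac
from collections import Counter


def pseudonym(ip: str, key: bytes) -> str:
    """Map an address to a short keyed hash so raw IPs never leave the operator."""
    return hmac.new(key, ip.encode(), hashlib.sha256).hexdigest()[:12]


def summarize(records, key: bytes, top_k: int = 3) -> dict:
    """Reduce per-flow records to aggregate statistics that are safer to share.

    `records` is any iterable of dicts with hypothetical fields
    'src_ip' (str) and 'bytes' (int or numeric string).
    """
    bytes_by_src = Counter()
    count = 0
    for rec in records:
        bytes_by_src[rec["src_ip"]] += int(rec["bytes"])
        count += 1
    return {
        "records": count,
        "total_bytes": sum(bytes_by_src.values()),
        "distinct_sources": len(bytes_by_src),
        # Heaviest sources, identified only by pseudonym.
        "top_sources": [
            (pseudonym(ip, key), total)
            for ip, total in bytes_by_src.most_common(top_k)
        ],
    }


if __name__ == "__main__":
    # Made-up sample data; the operator would feed in real records instead.
    sample = [
        {"src_ip": "192.0.2.1", "bytes": 1200},
        {"src_ip": "192.0.2.7", "bytes": 300},
        {"src_ip": "192.0.2.1", "bytes": 500},
    ]
    print(summarize(sample, key=b"operator-chosen-secret"))
```

The operator picks the key and keeps it private, so the researcher sees consistent pseudonyms but cannot trivially reverse them.  A script like this is also short enough for the operator to read before running it, which matters when you are asking someone to run your code on sensitive data.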

There is much more to say about networking research and data.  Sometimes it is simply not possible to get the data one needs to solve interesting research problems (e.g., pricing data is very difficult to obtain).  Still, I think as networking researchers, we should be first looking for interesting problems and then looking for data that can help us solve those problems; too often, we operate in reverse, like the drunk who looks for his keys under the lamppost because it is brighter where the light is shining.  I’ll say more about this in a later post.

Networking Meets Cloud Computing (Or, “How I Learned to Stop Worrying and Love GENI”)

If you build it, will they come? In Field of Dreams, Ray Kinsella is confronted in his cornfield by a whisper that says, “If you build it, he will come,” which Ray believes refers to building a baseball field in the middle of a cornfield that will play host to Shoeless Joe and members of the 1919 Black Sox.  Only Ray can see the players initially, leading others to tell him that he should simply rip out the baseball field and replant his corn crop.  Eventually, other people see the players, too, and decide that keeping the baseball field might not be such a bad idea after all.

I can’t help but wonder if this scenario might be analogous to the Global Environment for Network Innovations (GENI) effort, sponsored by the National Science Foundation.  The GENI project seeks to build a worldwide network testbed to allow Internet researchers to design and test new network architectures and protocols.  The project has many moving parts, and I won’t survey all of those here.  A salient feature of GENI, though, is that it funds infrastructure prototyping and development, but does not directly fund research on that infrastructure.  One of the most interesting challenges for me has been, and still is, how to couple projects that build infrastructure with projects that directly use that infrastructure to develop interesting new technologies and perform cutting-edge research.

Can prototyping spawn new research? This is, in its essence, the bet that I think GENI is placing: If we build a new experimental environment for networking innovation, the hope is that researchers will come use it.  Can this work? I think the answer is probably “yes”, but it is too soon to know the answer to this question in this context.  Instead, I would like to talk about how our GENI projects have spawned new research—and new educational material—here at Georgia Tech.

The Prototype: Connectivity for Virtual Networks. One of the GENI-funded projects is called the “BGP Multiplexer” or, simply, the “BGP Mux”.  If that sounds obscure, then perhaps you can already begin to understand the challenges we face. Simply put, the BGP Mux is like a proxy for Internet connectivity for virtual networks (BGP is the protocol that connects Internet Service Providers to one another).  The basic idea is that a developer or network researcher might build a virtual network (e.g., on the GENI testbed) and want to connect that network to the rest of the Internet, so that his or her experiment could attract real users.  You can read more about it on the GENI project Web page.

Some people are probably familiar with the concept of virtualization, or creating “virtual” resources (memory, servers, hardware, etc.) based on some shared physical substrate.  Virtual machines are now commonplace; virtual networks, however, are less so.  We started building a Virtual Network Infrastructure (VINI) in 2006.  The main motivation for VINI was to allow experimenters to build virtual networks on a shared physical testbed.  One of the big challenges was connecting these virtual networks to the rest of the Internet.  This is the problem that the BGP Mux solves.

Providing Internet connectivity to virtual networks is perhaps an interesting problem within the context of building a research testbed, but, in my view, it lacked broader research impact.  Effectively, we were building a “hammer” that was useful for building a testbed, but I wanted to find a “nail”: a real problem whose solution could be published and could also be used in the classroom.  This was not easy.

The Research: Networking for Cloud Computing.  To broaden the applicability of what we had built, essentially we had to find a “nail” that might need a fast, flexible way of setting up and tearing down Internet connections.  Cloud computing applications seemed like a natural fit: services on Amazon’s EC2, for example, might want to control inbound and outbound traffic with their customers.  They might want to do this for cost or performance reasons, for example.  Today, this is difficult.  When you rent servers in EC2, you have no control over how traffic comes over the Internet to reach those servers; if you want paths with less delay or otherwise better performance, you are out of luck.  With the hammer that we had built in the BGP Mux, however, this became much easier: instead of framing the problem as “virtual networks for researchers” (something only a small community might care about), we framed the same problem in terms of users of EC2.  Essentially, the BGP Mux offers EC2 “tenants” the ability to control their own network routing.  This capability is now deployed in five locations and we are planning to expand its footprint.  A paper on this technology will appear at the USENIX Annual Technical Conference in June. We welcome any other networks that would like to help us out with this deployment (i.e., if you can offer us upstream connectivity at another location, we would like to talk to you!).
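To give a feel for what "tenants controlling their own routing" could mean, here is a toy sketch in which two tenants see the same set of upstream routes but pick among them using their own preferences (one cares about delay, the other about cost).  This is not the BGP Mux implementation and it does not speak BGP; the `Route` fields, the upstream names, the prefixes (taken from the documentation address ranges), and all the numbers are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str         # destination prefix the route covers
    upstream: str       # hypothetical upstream ISP offering the route
    delay_ms: float     # illustrative path delay
    cost_per_gb: float  # illustrative transit price


def select_routes(routes, prefer):
    """Pick one route per prefix using the tenant's own scoring function.

    `prefer` maps a Route to a number; lower scores win.
    """
    best = {}
    for route in routes:
        current = best.get(route.prefix)
        if current is None or prefer(route) < prefer(current):
            best[route.prefix] = route
    return best


# The same candidate routes are visible to every tenant.
ROUTES = [
    Route("203.0.113.0/24", "upstream-A", 40.0, 0.02),
    Route("203.0.113.0/24", "upstream-B", 15.0, 0.05),
    Route("198.51.100.0/24", "upstream-A", 30.0, 0.02),
    Route("198.51.100.0/24", "upstream-B", 55.0, 0.01),
]

if __name__ == "__main__":
    latency_tenant = select_routes(ROUTES, prefer=lambda r: r.delay_ms)
    budget_tenant = select_routes(ROUTES, prefer=lambda r: r.cost_per_gb)
    for name, table in [("latency", latency_tenant), ("budget", budget_tenant)]:
        for prefix, route in sorted(table.items()):
            print(name, prefix, "via", route.upstream)
```

The point of the sketch is only the separation of concerns: the shared infrastructure collects routes from several upstreams once, and each tenant applies its own policy on top, rather than everyone inheriting a single choice made by the hosting provider.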

Education: Transit Portal in the Classroom. I’ve been teaching a course called “Next-Generation Networking”, a course on Future Internet Architectures that I plan to discuss at more length on this blog at some point.  Typical networking courses are not as “hands on” as I would prefer: I, for one, graduated from college without ever even seeing a router in person, let alone configuring one.  I wanted networking students to have more “street cred”—they should be able to say, for example, that they’ve configured routers on a real, running network that’s connected to the Internet and routing real traffic.  This sounds like lunacy.  Who would think that students could play “network operator for a day”?  It just sounds too dangerous to have students play around on live networks with real equipment.   But with virtual networking and the BGP Mux, it’s possible.  I recently assigned a project in this course that had students build virtual networks, connect them to the Internet, and control inbound and outbound traffic using real routing protocols.  Seeing students configure networks and “speak BGP with the rest of the Internet” was one of my proudest days in the classroom.  You can see the assignment and videos of these demos if you’d like to learn more.

Prototyping and research.  Will the researchers come? Our own GENI prototyping efforts have been an exercise in “working backwards” from solution to networking research problem.  I have found that exercise rewarding, if somewhat counter to my usual way of thinking about research (i.e., seek out the important problems first, then find the right hammer).  I think now the larger community will face this challenge, on a much broader scale: Once we have GENI, what will we do with it?  Some areas that seem promising include deployment of secure network protocols and services (our current protocols are known to be insecure), better support for mobility (the current Internet does not support mobility very well), new network configuration paradigms (networks of all kinds, from the transit backbone to the home, are much too hard to configure), and new ways of pricing and provisioning networks (today’s markets for Internet connectivity are far too rigid).  We have just finished work on a large NSF proposal on Future Internet Architectures that I think will be able to make use of the infrastructure that we and others are building; in the coming months, I think we’ll have much more to say (and much more to see) on this topic.

A New Window for Networking

It’s an exciting time to be working in communications networks.  Opportunities abound for innovation and impact, in areas ranging from applications, to network operations and management, to network security, and even to the infrastructure and protocols themselves.

When I was interviewing for jobs as networking faculty about five years ago, one of the most common questions I heard was, “How do you hope to have any impact as a researcher when the major router vendors and standards bodies effectively hold the cards to innovation?”  I have always had a taste for solving practical problems with an eye towards fundamentals.  My dissertation work, for example, was on deriving correctness properties for Internet routing, and developing a tool, router configuration checker (rcc), to help network operators check that their routing configurations actually satisfied those properties.  The theoretical aspects of the work were fun, but the real impact was that people could actually use the tool; I still get regular requests for rcc today from both operators and various networking companies who want to perform route prediction.

This question about impact cut right to the core of what I think was a crisis of confidence for the field.  Much of the research seemed to be focused on performance tuning and protocol tweaks.  Big architectural ideas were confined to paper design, because there was simply no way to evaluate them.  Short of interacting directly with operators and developing tools that they could use, it seemed to me that truly bringing about innovation was rather difficult.

Much has happened in five years, however.  There are now more exciting problems in networking than there is time to work on them.  Innovation is happening in many areas, and it is becoming feasible to effect fundamental change to the network’s architecture and protocols.  I think several trends are responsible for this wealth of new opportunities:

  • Network security has come to the forefront.  The rise of spam, botnets, phishing, and cybercrime over the past few years cannot be ignored.  By some estimates, as much as 95% of all email is spam.  In a Global Survey by Deloitte, nearly half of the companies surveyed reported an internal security breach, a third of which resulted from viruses or malware.
  • Enterprise, campus, and data-center networks are facing a wealth of new problems, ranging from access control to rate limiting and prioritization to performance troubleshooting.  I interact regularly with the Georgia Tech campus network operators, as a source of inspiration for problems to study.  One of my main takeaways from that interaction is that today’s network configuration is complex, baroque, and low-level—far too much so for the high-level tasks that they wish to perform.  This makes these networks difficult to evolve and debug.
  • Network infrastructure is becoming increasingly flexible, agile, and programmable.  It used to be the case that network devices were closed, and difficult to modify aside from the configuration parameters they exposed.  Recent developments, however, are changing the game.  The OpenFlow project at Stanford University makes it much more tenable to write software programs to control the entire network at a higher level of abstraction, and provides more direct control over network behavior, thus potentially providing operators easier ways to control and debug their network.
  • Networking is increasingly coming to blows with policy.  The collision of networking and policy is certainly not new, but it is increasingly coming to the forefront, with front-page items such as network neutrality and Internet censorship.  As the two areas continue on this collision course, it is certainly worth thinking about the respective roles that policy and technology play with respect to each of these problems.
  • Networking increasingly entails direct interaction with people of varied technical backgrounds.  It used to be that a “home network” consisted of a computer and a modem.  Now, home networks comprise a wide range of devices, including media servers, game consoles, music streaming appliances, and so forth.  The increasing complexity of these networks makes each and every one of us a network operator, whether we like it or not.  The need to make networks simpler, more secure, and easier to manage has never been more acute.
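As a rough illustration of the programmability mentioned in the OpenFlow bullet above, the sketch below models a switch as a prioritized table of match-action rules that software can install.  It is a deliberate simplification and an assumption on my part about how to present the idea: real OpenFlow defines its own match fields, message formats, and table semantics, and the class names, field names, and the "send to controller on a miss" default used here are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class FlowRule:
    priority: int  # higher priority rules are checked first
    match: dict    # header field -> required value; absent fields are wildcards
    action: str    # what to do with a matching packet


class FlowTable:
    """A toy, single-table model of a programmable switch."""

    def __init__(self):
        self.rules = []

    def install(self, rule: FlowRule) -> None:
        """Add a rule and keep the table ordered by descending priority."""
        self.rules.append(rule)
        self.rules.sort(key=lambda r: r.priority, reverse=True)

    def lookup(self, packet: dict) -> str:
        """Return the action of the highest-priority rule the packet matches."""
        for rule in self.rules:
            if all(packet.get(k) == v for k, v in rule.match.items()):
                return rule.action
        # Assumed default for this sketch: ask the control program what to do.
        return "send_to_controller"


if __name__ == "__main__":
    table = FlowTable()
    table.install(FlowRule(5, {"tcp_dst": 25}, "drop"))
    table.install(FlowRule(10, {"ip_dst": "192.0.2.10", "tcp_dst": 80}, "output:2"))
    print(table.lookup({"ip_dst": "192.0.2.10", "tcp_dst": 80}))
    print(table.lookup({"ip_dst": "192.0.2.99", "tcp_dst": 25}))
    print(table.lookup({"ip_dst": "192.0.2.99", "tcp_dst": 443}))
```

The appeal for operators is that policy lives in an ordinary program that installs and removes rules, rather than in per-device configuration parameters.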

The networking field continues to face new problems, which also opens the field to “hammers” from a variety of different areas, ranging from economics to machine learning to human-computer interaction.  One of my colleagues often says that networking is a domain that draws on many disciplines.  One of the fun things about the field is that it allows one to learn a little about a lot of other disciplines as well.  I have had a lot of fun—and learned a lot—working at many of these boundaries: machine learning, economics, architecture, security, and signal processing, to name a few.

The theme of my blog will be problems and topics that relate to network management, operations, security, and architecture.  I plan to write about my own (and my students’) research, current events as they relate to networking, and interesting problem areas and solutions that draw on multiple disciplines.  I will start in the next few posts by touching on each of the bullets above.