Laboratory of Networked Systems

Computer Science Department, Yale University

The Internet has become a critical infrastructure of our information-based society. Our dependency on this infrastructure has become much like our dependency on other basic infrastructures such as the world's power grids and the global transportation systems. Failures of the Internet infrastructure or major applications running on top of it can have an enormous cost with potentially serious consequences. As an example, during one recent event on January 9, 2006, the Sprint backbone network was partitioned with just two link failures. This partition made news headlines and led to serious consequences for Sprint, millions of Sprint/Nextel users, and many corporate networks that relied on the carrier for their critical operations. However, our study showed that there was enough redundancy in the Internet to have avoided this partition, if all resources had been used efficiently.

The central question driving much of the research at LANS is the following: What are the guiding principles and practical techniques for achieving robust and efficient computer networks?

Our research methodology is to integrate rigorous analysis with careful system design, practical implementation, and whenever possible large-scale field tests with real users. The objective of rigorous analysis is to reveal and derive the most fundamental guiding principles. As an example, instead of an ad-hoc system design, our P4P framework derives the interfaces between networks and network applications through rigorous primal-dual optimization decomposition. The objective of system implementation and large-scale field tests is to ground the principles in the real world. Some of our systems have undergone extremely large-scale field tests (e.g., P4P has been test-deployed with millions of real users at five of the largest Internet service providers in the world); some of our tools have been used by multiple other groups around the world (e.g., TORTE and Network Localization).

We study network robustness, efficiency, and fairness from multiple perspectives, including peer-to-peer (P2P) networks, traffic engineering, network management, wireless and mobile networks, network measurements, sensor networks, congestion control, and network security. By investigating a variety of networked systems but with a central question, we seek to determine unifying themes that rise above the specifics of particular systems.

Some major on-going projects at LANS are

P4P: Provider Portal for Network Applications

Current Members

  • Richard Alimi
  • Ye Wang

The P4P project is motivated by P2P applications, whose network-oblivious, application-oriented routing leads to significant network inefficiency and interacts poorly with network adaptations (see our SIGCOMM 2003 paper). Recent conflicts between P2P applications and Internet service providers (ISPs) led to news headlines and FCC hearings. Technically, the emergence of P2P applications reveals a fundamental deficiency in the Internet architecture: traditional Internet assumes that it is mainly the networks who are responsible for efficiency, but such emerging applications can have tremendous flexibility in how data are communicated; thus, they should be an integral part of Internet resource optimization as well.

The objective of the P4P project is to address this deficiency in the Internet architecture. It is a framework that enables ISPs and network applications to work jointly and cooperatively to optimize network resource consumption and to accelerate network applications. It introduces a set of novel concepts (e.g., "my-Internet" view, p-Distance, and application optimization service), a set of APIs between networks and applications, and a set of algorithms to achieve the aforementioned objective. Its design addresses multiple challenges including heterogeneity of the Internet (i.e., there are diverse network objectives and diverse application objectives), privacy concerns by networks and applications on revealing their information, and scalability for an Internet-scale system. It handles both intradomain and interdomain (e.g., multihoming optimization; SIGCOMM 2004) network optimization.

The P4P system is fully implemented, and test-deployed at five of the largest Internet service providers in the world. From February 2008 to August 2008, multiple rounds of field tests were conducted, involving multiple millions of real users and the five aforementioned Internet service providers. Results from these field tests demonstrated the effectiveness of P4P: it can significantly improve both application performance and network efficiency. For example, on the Verizon network using 1.25 million real Pando users, P4P reduced network cost measured by bandwidth-distance product by a factor of 6, when we conducted our first large-scale field test in February 2008. In our field test in July/August 2008 involving four large ISPs, P4P again achieved excellent results. For example, on the Comcast network, P4P improved application speed by more than 80%, and reduced incoming Internet traffic by an average of 80% at peering points. These results were presented by Comcast at IETF 73, to motivate the IETF effort of Application-Layer Traffic Optimization (ALTO).

We are deploying P4P more broadly in the global Internet. Any extension to the Internet infrastructure faces substantial deployment challenges. So far we have made significant progress on the ISP side and we are focusing more on the P2P application side. If you are interested, please drop us a note.

For more technical details on P4P, please check out our SIGCOMM 2008 paper available at the publication page.

NCS: Network Configuration Studio

Current Members

  • Richard Alimi
  • Ye Wang

A major contributing factor to the costs and lack of robustness in the Internet is errors in network management. A recent survey found that more than 80% of IT budgets are allocated towards maintaining the status quo, meaning that four times as many resources are spent at maintaining existing networks as are spent at developing and deploying new services. Another survey found that more than 60% of network downtime is due to human error, where configuration errors are the largest contributor to failures and repair time. Thus, a major step towards achieving robust computer networks is significant reduction of configuration management errors and easing of configuration management tasks. This is the objective of the Network Configuration Studio project.

The intellectual foundation of NCS is a simple, yet powerful concept we call shadow configurations (please see our SIGCOMM 2008): a network operator specifies two configurations in its network: one real (current) and one shadow (alternate). The shadow configuration provides a virtual "world" in which potentially disruptive network activities (e.g., routing re-convergence, testing a new configuration, and debugging) take place. The shadow configuration runs on the actual network in parallel with the real configuration so that reality can be introduced into it in an efficient and controlled manner. An additional transaction capability is introduced to allow non-disruptive switching between these two configurations (e.g., for deployment of the validated shadow configuration or online debugging of a failed real configuration). Additional new capabilities that NCS can provide include configuration analysis and diagnosis based on multiple configurations, automated testing, and online, interactive network management debugging.

A research prototype of NCS based on Linux is fully implemented and available to others under the BSD license. Extensive evaluations of our implementation show that shadow configurations can be implemented efficiently, with only 12 additional lines of code on the kernel's forwarding fast path for packets not using packet cancellation, and no code changes to routing processes. The FIB memory increase to support both real and shadow configurations is less than 35% for the worst-case router under a variety of shadow configurations for a large US tier-1 ISP; the average is much smaller, less than 7%. At run time, our shadow-enabled forwarding engine under heavy traffic has low CPU overhead with a shadow configuration installed.

For more technical details on shadow configurations, please check out our SIGCOMM 2008 paper available at the publication page.

TORTE: Towards Optimized and Robust Traffic Engineering

Member

  • Hao Wang
The objective of this project is to develop a set of techniques to handle both traffic variability and topology variability in Internet traffic engineering. Internet traffic engineering is a major tool used by Internet networks to compute how traffic should be distributed in an Internet network.

In particular, the TORTE system implements COPE, the principle of common-case optimization with penalty envelop (see our SIGCOMM 2006 paper). This principle argues that it is robust but inefficient to design a network for the worst cases; on the other hand, designing for the common cases may pose significant robustness risk, when burst or unexpected events happen. COPE proposes the framework of designing for the common cases but providing a penalty envelope to protect against unexpected events.

The TORTE system also supports REIN, the novel idea of reliability as an interdomain service (see our SIGCOMM 2007 paper). REIN is based on the observation that a key challenge in achieving a high-level of reliability is the high cost of obtaining redundancy, which includes both the cost of obtaining diversity of physical connectivity and the cost of over-provisioning of bandwidth to carry traffic originally passing through any failed equipment. Thus, it generally requires significant investments by a network operator to tolerate failures well. On the other hand, large IP networks in the US cover the same geographic regions and place their routers at similar sites. Thus, there is the possibility of multiple IP networks pooling resources together, just as airlines or insurance pool resources together. REIN propose techniques that can be incrementally deployed in the Internet and may improve the redundancy of IP networks at low cost.

All of the techniques in TORTE are fully implemented and a toolset distribution is available. The toolset can directly output Cisco and Juniper configuration files.

For more technical details on TORTE, please check out our SIGCOMM 2006 and SIGCOMM 2007 papers available at the publication page.

FITE: Foundation for Internet Traffic Engineering

Members

  • Hao Wang
  • Haiyong Xie
The objective of the FITE project is to further develop our fundamental understanding on interdomain traffic engineering. It started with the realization that previous studies on interdomain traffic engineering were largely ad hoc and we lack fundamental understanding on this crucially important subject. Our first progress (IBC 2006) is the recognition that interdomain traffic engineering is essentially a social choice problem (i.e. aggregating individual domain preferences to form a collective preference). This novel perspective immediately reveals fundamental tradeoffs that must be made on achievable properties of interdomain traffic engineering. Integrating the descriptive, protocol-independent social-choice model with the notion of rational route selection (IEEE Network Magazine, ICNP 2005 ingress) to model distributed traffic engineering protocols, this research not only demonstrates the potential instability of the prevailing interdomain traffic engineering practices but also establishes provably-stable interdomain traffic engineering guidelines (ICNP 2005 egress).

For more technical details on FITE, please check out our IBC 2006 and the two ICNP 2005 papers available at the publication page.

Incentive-Compatible Wireless Networks

Members

  • Sheng Zhong

The objective of the project is to introduce incentive in the forwarding and routing of wireless networks. The first design of the project is Sprite (INFOCOM 2003), which provides a rigorous design on considering incentive for data forwarding in wireless networks. Being among the earliest and being the first to provide provable incentive-compatibility properties, this design is widely cited, with 403 citations as of October 2008. The follow-up to Sprite is Corsac (MOBICOM 2005), which addresses both forwarding and routing. The design of Corsac is covered as a full chapter in a popular graduate textbook ``Security and Cooperation in Wireless Networks,'' co-authored by Levente Buttyan and Jean-Pierre Hubaux, and published by Cambridge University Press. In our most recent work (MOBICOM 2008), we extended our design to handle opportunistic routing. A Linux implementation of the most-recent design is available, using which we conducted extensive evaluations.

For more technical details on this project, please check out our INFOCOM 2003, MOBICOM 2003, and MOBICOM 2008 papers available at the publication page.

muNet: Towards High Capacity and Incentive-compatible Wireless Networks

Members

  • Richard Alimi
This is a joint project with Erran Li, Harish Viswanathan, and Ramachandran Ramjee. The objective of the muNet project is to improve the throughput of wireless networks. The muNet project develops techniques for in-network generation of composite packets to integrate coding at two different layers of the protocol stack (INFOCOM 2008): XOR-based network coding and physical layer superposition coding (MOBICOM 2007). In a typical wireless mesh network when more traffic is between the clients and access points, the average throughput improvement of our technique can be 324%, while there can be little gain (less than 10%) if network coding alone is used. Our technique is implemented using GNU Radio and available.

For more technical details on this project, please check out our INFOCOM 2008 and MOBICOM 2007 papers available at the publication page.

Network Localization

Members

  • David Goldenberg
This is a joint project with the groups of James Aspnes, Brian Anderson, A. Stephen Morse, and Walter Whiteley.

Network localization studies how to determine the physical positions of the nodes in a network. Knowing the positions of the network nodes is a foundational service in sensor networks and ubiquitous computing, since the nodes need to know where they are in order to carry out their tasks. Although the importance of localization is very well recognized in the sensor and ubiquitous computing communities, the fundamental questions in localization were not addressed before our work.

The first fundamental question is the conditions under which each node can determine its position uniquely (INFOCOM 2004). Also, even if the position of a node can be uniquely determined, what is the computational complexity (ALGOSENSORS 2004, INFOCOM 2004)? Furthermore, what is the performance of localization in a typical network environment, for typical applications (INFOCOM 2005, MOBICOM 2006)? Through unexpected applications of rigidity theory, random graph theory, and complexity theory, this project provided elegant answers to all of these foundational questions. A toolset containing Java implementation of all of our algorithms is available.

For more technical details on this project, please check out our INFOCOM 2004, MOBICOM 2006, and WINET 2007 papers available at the publication page.

End Host Fairness and Rate Control

For related papers, please see here.

Security and Privacy

For related papers, please see here.

LANS at Yale University. Initial CSS template from free css templates. Last modified at $Id$