As the demand on content delivery systems grew, p2p-based architectures emerged as viable and
cost-effective approaches to satisfying such systems' requirements. Given the significant interest
from researchers and Internet users alike, improvements in p2p systems' performance are likely to
have an impact on a large community. Since such architectures need to be highly efficient, highly
scalable, and highly distributed, appropriate design and evaluation of p2p-type systems is fundamental
to the future of distributed systems in general.
This part of my research has focused on the design and evaluation of highly distributed p2p
systems as well as overlay systems. To focus this research further, I have concentrated my efforts
on the following content delivery applications: (a) file downloads, i.e., applications where the entire
content is delivered before it is consumed (e.g., software upgrade downloads); (b) live streaming,
i.e., applications where continuous media (CM) is being viewed as it is being generated (e.g., live
sporting events); and (c) video-on-demand, i.e., applications where CM was recorded at an earlier
time, and it can either be viewed as it is being delivered, or large portions of it can be downloaded
before viewing begins. Many such applications require high QoS in order to be commercially viable;
the inability to control the resulting QoS is an important barrier to their widespread deployment and
use (and is the reason commercial p2p systems include dedicated servers).
Many challenges in such systems can be viewed in light of two fundamental issues: (1) incentives,
i.e., provision of appropriate incentives for peers to contribute their resources such that the
entire system performs well according to metrics appropriate for each application;
and (2) quality-of-service,
i.e., designing protocols and architectures that are amenable to provision of QoS and
performance prediction. My work focuses on gaining insight into these fundamental issues, which
I believe will lead to the development and proliferation of applications that are able to provide high
QoS that is currently lacking. Challenges to being able to address these issues include: (1) dynamic
and heterogeneous nature of p2p and overlay environments; (2) ability to use resources efficiently
with little or no centralized control; (3) design and analysis of models that are able to capture sufficient
aspects of such systems to be able to understand their ``interesting'' behavior and to facilitate
better understanding of how such systems work and where improvements are needed; and (4) design
and analysis of protocols and architectures that are robust to the dynamic and heterogeneous nature
of such environments and that are subsequently able to provide QoS.
In the context of large-scale overlays, I have also continued my efforts on scalable upload applications
over wide-area networks, where we have focused on design of an application layer infrastructure
that improves scalability, security, fault tolerance, and performance of current and future
Internet-based services. To the best of my knowledge, our work was the first to focus on scalability
and efficiency of upload applications.
P2P Systems
Incentives.
We explore incentive provision in the context of BitTorrent (BT) [D12]. Although
BT ``forces'' peers to share their resources while attempting to complete their downloads (through
the tit-for-tat (TFT) mechanism), there are still opportunities for free-riders (i.e., peers who do not
contribute resources) to download files through: (1) capacity provided by leechers (nodes who have
not completed their download yet) through the optimistic unchoking mechanism (used by leechers to
probe for new downloading opportunities), and (2) capacity provided by seeds (nodes that have completed
their download but continue to contribute their uploading capacity to the system). Sufficient
evidence exists that: (1) opportunities for free-riding hurt the system's performance, and (2) a nonnegligible
number of nodes in BT participate as seeds, which significantly contributes to providing
reasonable performance for the free-riders. Moreover, a leecher's progress through the downloading
process is slower at the beginning (when having few chunks prevents a node from participating in
TFT) and at the end (when it is more difficult to find peers with the few remaining pieces of needed
data). In both cases, seeds can contribute significantly, as they have complete data and are no longer
playing the TFT game. Since the seeds' behavior is an important factor in facilitating free-riding
and in aiding contributing leechers, our main idea was to explore alternative approaches for seeds to
contribute their upload capacity to the system, with the goal of degrading free-riders' performance
while (if possible) improving other leechers' performance. To the best of our knowledge, no other
study before ours explored this.
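The free-riding opportunities described above can be illustrated with a toy Monte Carlo sketch. The model below is a deliberate simplification (a single unchoke round, uniformly random optimistic and seed unchoking, and illustrative slot counts rather than any client's actual defaults); it shows how seed capacity, unlike TFT capacity, flows to free-riders and contributors alike.

```python
import random

def free_rider_share(n_leechers, n_seeds, n_free_riders,
                     regular_slots=3, optimistic_slots=1,
                     trials=10000, seed=0):
    """Toy one-round model: TFT (regular) unchokes go only to contributing
    leechers; each leecher's optimistic unchoke and every seed slot go to a
    uniformly random peer.  Returns the fraction of all upload slots that
    free-riders capture."""
    rng = random.Random(seed)
    peers = n_leechers + n_free_riders
    captured = total = 0
    for _ in range(trials):
        # Regular (TFT) unchokes: free-riders get nothing here.
        total += n_leechers * regular_slots
        # Optimistic unchokes: each leecher probes one random other peer.
        for _ in range(n_leechers * optimistic_slots):
            total += 1
            if rng.random() < n_free_riders / (peers - 1):
                captured += 1
        # Seeds are not playing TFT: all of their slots are up for grabs.
        for _ in range(n_seeds * (regular_slots + optimistic_slots)):
            total += 1
            if rng.random() < n_free_riders / peers:
                captured += 1
    return captured / total
```

In this toy setting, adding seeds raises the share of upload capacity that free-riders receive, consistent with the observation that seeds significantly aid free-riders; the alternative seeding strategies of [D12] aim to redirect exactly this capacity.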
Although empirical evidence suggests that many nodes participate in multiple torrents, little
research literature exists on this topic. Multi-torrent participation can be exploited to increase the number of seeds in
a given torrent, which in turn contributes to (a) helping newly joined nodes ramp up so that they
can become contributing peers faster, (b) helping nodes nearing the end of their downloads find
the last few file chunks faster, and (c) keeping a torrent ``alive'' longer (i.e., making sure that all
file chunks are available in the system). Thus, our work [D8,A4]
focused on (i) what incentives
could be provided for nodes to contribute resources as seeds in a multi-torrent environment, and (ii)
what are the resulting performance consequences, both for the nodes willing to be seeds and for the
overall system. We found that the current system lacks incentives for nodes to stay around as seeds
in a multi-torrent environment. Motivated by that, we proposed a cross-torrent-based method and
illustrated, through an extensive performance study, the benefits of such an approach.
QoS/performance prediction.
We have explored the different aspects of QoS and performance
prediction through the use of analytical models and protocol design. For instance, we have developed
a simple, yet extensive analytical model of BT [D7]. To the best of our knowledge, it is the first
(steady state) analytical performance model of a heterogeneous BT system that includes the behavior
of seeds and free-riders (as noted earlier, both are important). The accuracy of our model can be
attributed to including important BT characteristics, namely (i) imperfect clustering in regular (TFT-based)
unchokes and (ii) bias in optimistic unchokes. This accuracy is validated through simulations
and PlanetLab experiments, which also demonstrate the importance of including imperfect clustering
and biased optimistic unchoking in accurately predicting nodes' download times. Our model is
quite extensible and can be used to evaluate future protocol modifications; e.g., we demonstrated
our model's extensibility by explicitly modeling two variations of BT: (a) the well-known large view
exploit, and (b) our proposed fix for the large view exploit (the novel seeding approach described
above [D12]).
In a p2p-based streaming system, packets can arrive at a node in any order, but they must be
played back in a specific order (and at a specific rate), corresponding to the original recording rate
of the stream. We can view the subset of the edges (between nodes in a p2p overlay) used for
packet delivery as forming a mesh. The system's performance is largely a function of the algorithms
used to construct and maintain this mesh as well as the algorithms used for scheduling packet delivery
over the mesh. Our work [D6] focused on approaches to constructing such meshes, where the
primary goal was to develop an understanding of two related quantities: playback delay and buffering
requirements. To this end, we explored two approaches to mesh construction and subsequent
streaming, one based on multi-trees and the other based on hypercubes. We used these approaches
to explore the resulting playback delay, buffer space, and communication requirements. In our
work we focused on a ``structured'' approach to mesh construction (i.e., the set of edges used for
delivery of packets was fixed by our algorithms) as compared to an ``unstructured'' approach (i.e.,
the edges used for delivery are determined on a per packet basis, essentially on the fly when the
data is needed). Although unstructured techniques allow the system to more easily adapt to node
churn, existing unstructured approaches to streaming (rather than file downloading) are mostly ``best
effort'', and little exists in the way of formal analysis of resulting QoS prediction. In contrast, an advantage
of our approach is that it has provable quality-of-service (QoS), where we provide analysis
of corresponding performance characteristics.
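To see why a structured mesh admits provable delay bounds, consider a hypercube schedule in which, at round i, every node that holds a packet forwards it across dimension i; the packet then reaches all 2^d nodes in exactly d rounds, i.e., playback delay grows logarithmically in the number of peers. The sketch below is a simplified illustration of this idea, not the exact algorithm of [D6].

```python
def hypercube_dissemination_rounds(d, source=0):
    """Simulate one packet spreading over a d-dimensional hypercube overlay:
    in round i, every node that already holds the packet forwards it to its
    neighbor across dimension i.  Returns the number of rounds until all
    2**d nodes hold the packet."""
    n = 1 << d
    holders = {source}
    rounds = 0
    for dim in range(d):
        # Each holder forwards to the neighbor whose id differs in bit `dim`.
        holders |= {node ^ (1 << dim) for node in holders}
        rounds += 1
        if len(holders) == n:
            break
    assert len(holders) == n  # this schedule always covers the cube
    return rounds
```

The deterministic bound of d = log2(N) rounds per packet is the kind of provable quantity that the structured construction makes analyzable, in contrast to best-effort unstructured approaches.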
While many efforts focused on the common challenges faced by live and pre-recorded (Video-on-Demand)
streaming applications, our efforts focused on some of the open questions in designing
p2p-based Video-on-Demand (VoD) systems specifically [D4]. We considered a BT-like VoD system
and focused on: (i) how the lack of load balance, which typically exists in a p2p-based VoD
system, affects performance and what steps can be taken to remedy that, and (ii) whether a FCFS approach
to serving requests is sufficient or whether a Deadline-Aware Scheduling (DAS) approach can lead to
performance improvements. Motivated by the lack of high QoS in streaming systems, we proposed
several practical schemes aimed at addressing the above stated questions and illustrated the benefits
of our techniques through an extensive performance study.
Multi-path Streaming
We have also been carrying over our efforts on QoS-oriented design from storage systems
[A5,A8,D23,D17]
to delivery of CM data over the Internet. The poor QoS in streaming over the Internet
is partly due to variations in delays, bandwidth limitations, and packet losses. Although streaming
applications can tolerate some missing data, non-recoverable information loss degrades these applications'
QoS. Consequently, a number of application areas have backed away from streaming of
their content over the Internet. Inability to control the resulting visual and auditory quality of the
streamed presentation is an important reason for such a trend.
We believe that this trend can be reversed. To this end, our work focused on exploring high
quality streaming through the exploitation of multiple paths existing in the network (between a set
of senders and a receiver). By high quality, we mean streams with significant bandwidth requirements, of
relatively long duration, and without information loss or hiccups in data delivery.
Our goal was to design an application-level approach, i.e., one that can be easily deployed over
the Internet, without requirements for support from/modifications to the lower layers of the network.
Our focus was largely on providing a fundamental understanding of the benefits of using multiple paths
to deliver CM data destined for a particular receiver. (This data is fragmented into packets
and the different packets take alternate routes to the receiver, e.g., by streaming the data through an
overlay of relays distributed over a wide area network. As each packet is sent on only one of the
paths, our approach does not increase the overall workload on the network.)
The existence of multiple paths (with disjoint bottlenecks) offers the following potential benefits:
(a) reduction in correlations between consecutive packet losses; (b) increased throughput; (c) improved
robustness and ability to adjust to variations in congestion patterns and failures. Our work
focused on the loss characteristics aspects. Using a Gilbert model (GM), we gave
[A14] an analytical
characterization of when a multi-path (MP) approach is beneficial, as compared to a single
path (SP) approach. Our results indicated that: (1) in general, MP streaming exhibits better loss
characteristics than SP streaming, (2) use of an erasure code may not necessarily improve data loss
characteristics in the case of SP streaming, while MP streaming (with or without use of an erasure
code) can improve data loss characteristics, and (3) the lag-1 autocorrelation of MP streaming is usually
closer to zero than that of SP streaming, which should also result in a higher viewing quality of the
received data. Our results also indicated that the average error burst length of MP streaming is statistically
shorter than that of SP streaming.
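The SP-versus-MP comparison can be reproduced in miniature with a two-state Gilbert model, in which a path alternates between a Good state (no loss) and a Bad state (every packet lost). The sketch below uses illustrative transition probabilities, and round-robin packet assignment is just one simple MP policy; it shows how interleaving packets over two independent paths shortens loss bursts without changing the overall loss rate.

```python
import random

def gilbert_trace(n, p_gb, p_bg, rng):
    """Two-state Gilbert model: p_gb = P(Good -> Bad), p_bg = P(Bad -> Good).
    Every packet sent while the path is in the Bad state is lost."""
    bad = False
    trace = []
    for _ in range(n):
        bad = (rng.random() < p_gb) if not bad else (rng.random() >= p_bg)
        trace.append(bad)          # True means the packet was lost
    return trace

def round_robin_mp_trace(n, p_gb, p_bg, rng):
    """Alternate consecutive packets over two independent Gilbert paths."""
    a = gilbert_trace((n + 1) // 2, p_gb, p_bg, rng)
    b = gilbert_trace(n // 2, p_gb, p_bg, rng)
    return [a[i // 2] if i % 2 == 0 else b[i // 2] for i in range(n)]

def mean_burst_length(trace):
    """Average length of a run of consecutive losses, as seen by the receiver."""
    bursts, run = [], 0
    for lost in trace:
        if lost:
            run += 1
        elif run:
            bursts.append(run)
            run = 0
    if run:
        bursts.append(run)
    return sum(bursts) / len(bursts) if bursts else 0.0
```

With these parameters the SP mean burst length is near 1/p_bg packets, while interleaving cuts it sharply at the same loss rate, mirroring the analytical finding that MP streaming yields shorter error bursts.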
Given these benefits, another important question is how to distribute the load among multiple
paths so as to optimize viewing quality. (The setting and possible adaptation of what fraction of
data is sent on each path can be done by the receiver, based on its perceived QoS.) We studied
[D20]
the problem of assigning CM data to the multiple paths, according to the path characteristics,
such that a certain specified performance metric is optimized. This was done under the GM, while
considering appropriate optimization objectives, with the goal of improving the perceptual quality
of the streamed media. Determining and gaining insight into suitable optimization objectives is
also non-trivial and important (e.g., optimizing the lag-1 autocorrelation may result in higher loss
rates observed at the receiver, when the paths are not homogeneous). Since there is a fundamental
tradeoff between the frequency of losses and the corresponding loss correlations, it is natural to
consider optimization objectives that encompass both metrics.
A great deal of our work was done under a ``conventional Gilbert model''. A limitation of using
such a model is that the loss process of a path is independent of the bandwidth requirements of the
streaming application. Our later effort [A9]
proposed the use of a functional Gilbert model (FGM) as
a more general approach to characterizing the bursty loss nature of a path as well as its dependency
on an application's bandwidth requirements.
We verified our analytical and simulation results under real-world conditions, by building a p2p-based
multi-path streaming prototype [A11] and using it for streaming video experiments between
UMD, CUHK, and USC. The choice of a p2p-based architecture was motivated by the parallelism
between simultaneous downloads in p2p systems and multi-path streaming.
Scalable Infrastructure for Wide-area Uploads
Hotspots are a major obstacle to achieving scalability in the Internet; they are usually caused by
either high demand for some data or high demand for a certain service. At the application layer,
hotspot problems have traditionally been dealt with using some combination of increasing capacity,
spreading the load over time and/or space, and changing the workload. These classes of solutions
have been studied in the context of one-to-many, many-to-many, and one-to-one type communication.
However, to the best of our knowledge, ours was the first effort
[D37,D33,B1,A13]
on making
applications using many-to-one communication scalable and efficient. This corresponds to an important
class of upload applications (e.g., submission of income tax forms to the IRS, submission of
papers to conferences, Internet-based storage, certain aspects of cloud computing, and many more).
The main focus of our early work was on scalable infrastructure design for wide-area upload
applications, where we proposed Bistro [D37],
an architecture which employs an overlay of (untrusted)
hosts for improving the efficiency and scalability of uploads. Our basic idea was based on
the observations that (a) the existence of hotspots in many upload applications is due to approaching
deadlines and long transfer times, and (b) what many upload applications actually require
is an assurance that specific data was submitted before a specific time; the transfer of the
data must be done in a timely fashion, but need not complete by that deadline. Thus, our
approach was to break the original deadline-driven upload problem into: (1) a real-time timestamp
subproblem, where we ensure that the data is timestamped and that the data cannot be subsequently
tampered with; (2) a low latency commit subproblem, where the data goes ``somewhere'' (to an intermediary)
and the user is assured that the data is safely and securely ``on its way'' to its destination;
and (3) a timely data transfer subproblem, which can be carefully planned (and coordinated with
other uploads) and results in data delivery to the original destination.
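A minimal sketch of subproblems (1) and (3) is given below, assuming the destination holds a signing key; an HMAC stands in for a real digital signature, and all names and parameters are illustrative assumptions, not the Bistro protocol itself.

```python
import hashlib
import hmac
import time

DEST_KEY = b"demo-only-signing-key"   # stands in for the destination's private key

def timestamp_commit(data: bytes, now=None):
    """Subproblem (1): before the deadline, only the small hash travels to the
    destination, which returns a receipt binding (hash, submission time)."""
    digest = hashlib.sha256(data).hexdigest()
    ts = time.time() if now is None else now
    receipt = hmac.new(DEST_KEY, f"{digest}|{ts}".encode(), hashlib.sha256).hexdigest()
    return digest, ts, receipt

def verify_delivery(data: bytes, digest, ts, receipt, deadline):
    """Subproblem (3): when the full data later arrives via an untrusted
    bistro, check that it matches the receipt issued before the deadline."""
    if hashlib.sha256(data).hexdigest() != digest:
        return False     # data was tampered with at an intermediary
    expected = hmac.new(DEST_KEY, f"{digest}|{ts}".encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, receipt):
        return False     # forged or altered receipt
    return ts <= deadline
```

The point of the decomposition is visible here: only the hash must beat the deadline, while the bulk transfer through untrusted intermediaries can be scheduled later and still be verified end to end.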
Bistro's ability to share an infrastructure, such as an infrastructure of proxies, between a variety
of wide-area applications has clear advantages over the more traditional solutions.
In [B1] we
conducted a performance study which demonstrated the potential performance gains of the Bistro
framework as well as provided insight into the general upload problem. Since confidentiality of data
and other security issues are especially important in upload applications, and since our solution
introduces untrusted (public) intermediaries (i.e., bistros), we also developed
[D33] a
secure data transfer protocol within the Bistro framework, which not only ensures the privacy and
integrity of the data but also takes scalability considerations into account.
In later efforts we focused on performance and reliability improvements to the basic Bistro design
along the subproblems outlined above
[D19,D21,A10,D27,A1,D29,D30].
For instance, we obtained [A10,D27]
interesting results in the context of the timely data transfer subproblem (described
above), which can be viewed more generally as the problem of collecting a large amount of
data from a set of hosts to a single destination in a wide-area network. Often, due to congestion conditions,
the paths chosen by the network may have poor throughput. By choosing an alternate route
at the application level (e.g., by using relays in an overlay), we may be able to obtain substantially
faster completion time. This data collection problem is a non-trivial one because the issue is not
only to avoid congested link(s), but to devise a coordinated transfer schedule which would afford
maximum possible utilization of available network resources. In our work, we made no assumptions
about knowledge of the topology of the network or the capacity available on individual links
of the network, i.e., we only used end-to-end information. We used a network flow based solution,
utilizing time-expanded graphs, which gave us an optimal data transfer schedule with respect to the
makespan metric, given the constraints on knowledge of the topology and capacity of the network.
Experimentally, we established that the lack of knowledge of the paths the network uses
to send data is not a significant barrier. Our approach to computing coordinated data collection
schedules resulted in significant performance improvements. In general, our approach can be used
for solving arbitrary data movement problems over the Internet, and we used the Bistro platform to
illustrate one such application.
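The time-expanded-graph formulation can be sketched as follows: each node is copied once per time step, ``holdover'' edges let data wait at a node, a link of capacity c becomes an edge of capacity c between consecutive copies, and a max-flow computation then checks whether all data can reach the destination within a given horizon (the smallest feasible horizon is the makespan). This stdlib-only toy (Edmonds-Karp, unit-delay links, illustrative names) is a sketch of the idea, not the algorithm of [A10,D27].

```python
from collections import defaultdict, deque

def max_flow(cap, s, t):
    """Edmonds-Karp max flow on cap[u][v] = residual capacity (mutated)."""
    flow = 0
    while True:
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:     # BFS for a shortest augmenting path
            u = q.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return flow
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        aug = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= aug
            cap[v][u] = cap[v].get(u, 0) + aug   # residual edge
        flow += aug

def time_expanded_graph(links, supply, dest, horizon):
    """Copy every node once per step; data may wait (holdover edges) or cross
    a link, which takes one step and carries its per-step capacity."""
    INF = 10 ** 9
    nodes = {u for u, _, _ in links} | {v for _, v, _ in links} | {dest}
    cap = defaultdict(dict)
    for t in range(horizon):
        for node in nodes:
            cap[(node, t)][(node, t + 1)] = INF       # hold data at the node
        for u, v, c in links:
            cap[(u, t)][(v, t + 1)] = c               # traverse the link
    for host, amount in supply.items():
        cap["S"][(host, 0)] = amount                  # super-source injects data
    for t in range(horizon + 1):
        cap[(dest, t)]["T"] = INF                     # super-sink collects it
    return cap
```

Scanning horizons upward until the flow equals the total supply yields the makespan-optimal coordinated schedule under these assumptions, using only end-to-end capacity information.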
Another direction where we considered a general problem, motivated by the Bistro architecture,
is that of digital signature schemes' performance
[A1,D29]. Digital signature schemes are widely
used in many applications; however, the performance of an Internet server computing digital signatures
online is limited by the high cost of modular arithmetic. One way to improve the server's
performance is to reduce the number of computed digital signatures by combining a set of documents
into a batch (in a smart way) and signing each batch only once. This approach could reduce
the demand on the CPU but requires additional network bandwidth (for sending extra information
to clients). Our work provided a framework (using semi-Markov models) for studying as well as
analyzing performance of a variety of online digital signature batching schemes. We extensively
validated our framework's accuracy and effectiveness. Our results showed that substantial computational
benefits can be obtained from batching without significant increases in the amount of
additional information that needs to be sent to the clients.
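One well-known way to sign a batch only once is a Merkle tree: hash the documents into a binary tree, sign only the root, and give each client a logarithmic-size authentication path. The sketch below illustrates this particular scheme (the signature operation itself is omitted, and this is one example scheme, not necessarily those analyzed in [A1,D29]).

```python
import hashlib

def _h(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()

def merkle_batch(docs):
    """Build one tree over the batch; the server would sign only the root.
    Returns the root and, per document, its authentication path."""
    level = [_h(d) for d in docs]
    pos = list(range(len(docs)))       # each document's index at the current level
    paths = [[] for _ in docs]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])    # duplicate the odd node out
        for j in range(len(docs)):
            paths[j].append(level[pos[j] ^ 1])   # record the sibling hash
            pos[j] //= 2
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0], paths

def verify_doc(doc, path, root, index):
    """A client recomputes the root from its document and log-size path."""
    node = _h(doc)
    for sibling in path:
        node = _h(node + sibling) if index % 2 == 0 else _h(sibling + node)
        index //= 2
    return node == root
```

One expensive modular-arithmetic signature then covers the whole batch, at the cost of sending each client about log2(n) extra hashes: precisely the CPU-versus-bandwidth tradeoff our framework quantifies.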