Description
Title: Content caching, retrieval and dissemination in networks with storage
Date Created: 2011
Other Date: 2011-05 (degree)
Extent: xii, 103 p. : ill.
Description: The overwhelming use of today's networks is for an endpoint to acquire a named content file. As
a result, efficient content discovery and dissemination are becoming key challenges
in the design of future Internet protocols. With significant advances in data-storage technology, storage capacities have increased dramatically while prices have fallen rapidly. It is therefore reasonable to assume that each router on the Internet can cache content files that pass through it and reply to content requests with its local copies. In this thesis, we first introduce the In-Network Caching framework.
The content dissemination process consists of two phases. The first phase is content discovery, which is the service provided by the Content Name Resolution Service (CNRS). Through CNRS, a requester discovers the location(s) of the desired content files. We present a hybrid CNRS architecture in which CNRS servers form a hierarchy, with national, regional, and institutional CNRS servers from top to bottom. On each level, a CNRS server is responsible for monitoring the caching locations of a predesignated group of content files. When there is a miss at a lower level, the CNRS request is forwarded to the higher-level server, similar to hierarchical web caching. Following content discovery, the second phase is content retrieval, in which the endpoint sends a request towards the hosting server and the requested content file is returned to the requester. Cache-n-Capture (CC) is the baseline caching scheme,
in which each en-route router independently decides whether or not to cache passing content files. When a request is later routed through the router, the router can "capture" the request and reply with its cached copy of the content instead of forwarding the request to the original hosting site. We advocate two enhancement techniques: one is content broadcast (CB), which lets each router advertise its cached content files to its immediate neighbors so that a router can direct subsequent content requests to nearby caching nodes; the other is coordinated caching, which lets neighboring routers implicitly coordinate their caching decisions so that collective cache utilization improves. Through detailed simulations, we show that these two caching techniques significantly outperform other approaches. This performance gain is achieved with small communication overhead by limiting content broadcast and coordination to within one hop. We also demonstrate that the storage requirement of the discussed schemes is reasonable.

Next, we develop a mathematical model for the Cache-n-Capture scheme to optimize the average content retrieval latency
with limited storage on each router. We propose the Sequential Reassignment (SR) algorithm to solve the optimization problem. To implement the proposed scheme in a distributed fashion, we use an exponential-smoothing-based estimator. We compare the average content retrieval latencies and the average saved hops per en-route hit of the proposed distributed caching schemes against two other common cache replacement policies. We study the impact of cache size and the locality parameter, as well as the plateau factor for workload generation, and show that our proposed scheme consistently provides significant performance improvement under various settings.

Next, we reconsider the optimization problem with content broadcast enabled on each router. We formulate a different mathematical model to obtain the maximum benefit of content broadcast, and we propose the distributed Pseudo-Gradient (PG) algorithm. We compare the performance of the proposed caching scheme with the two replacement policies under the same simulation settings as
in CC. The results show that Distributed-PG achieves performance improvement while keeping the communication overhead of content broadcast much smaller than that of traditional replacement policies.

In the final part of this work, we investigate a gateway-centric method for efficient content
caching and routing. In this method, a content copy can be cached within an autonomous system (AS) while it is routed towards its destination, so subsequent requests can be satisfied faster. The caching location within an AS is determined by the gateway node through a hashing function. We discuss two alternatives of this method: one with a uniform caching level for every content file, and the other with varying caching levels. Through simulation studies, we show that the gateway-controlled caching method can greatly improve content retrieval latencies over traditional solutions, such as the three levels of caches placed by Hierarchical Caching and the mirror servers deployed by Content Distribution Networks (CDNs). The gateway-controlled caching method also outperforms the baseline caching method CC. Additionally, we study and compare the detailed performance of the two alternatives of this method. Finally, we develop a mathematical model whose objective is to guide the gateway to make optimal caching decisions within an AS and to achieve the minimum average content retrieval latency. We provide a greedy caching algorithm to solve the optimization problem and show its superiority over random caching.
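The hierarchical CNRS lookup described above can be illustrated with a minimal sketch: each server keeps a name-to-locations table and forwards misses to its parent, mirroring hierarchical web caching. The class and method names here are hypothetical, not taken from the thesis.

```python
class CNRSServer:
    """One level of the CNRS hierarchy (institutional, regional, or national)."""

    def __init__(self, parent=None):
        self.parent = parent   # higher-level CNRS server; None at the top
        self.table = {}        # content name -> list of caching locations

    def register(self, name, location):
        self.table.setdefault(name, []).append(location)

    def resolve(self, name):
        # Local hit: return the known caching locations.
        if name in self.table:
            return self.table[name]
        # Miss: forward the request to the higher-level server.
        if self.parent is not None:
            return self.parent.resolve(name)
        return []              # unknown at every level

# Build a three-level hierarchy: national <- regional <- institutional.
national = CNRSServer()
regional = CNRSServer(parent=national)
institutional = CNRSServer(parent=regional)

national.register("movie.mp4", "origin-server")
institutional.register("lecture.pdf", "campus-router-3")

print(institutional.resolve("lecture.pdf"))   # local hit
print(institutional.resolve("movie.mp4"))     # resolved at the national level
```

A request that misses at the institutional level climbs the hierarchy until some server knows a caching location for the name.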
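Per-router Cache-n-Capture behavior can be sketched as below: a router caches passing content files and "captures" later requests it can serve locally. LRU eviction stands in here for whatever replacement rule a deployment would use; the class name and capacity are illustrative assumptions.

```python
from collections import OrderedDict

class CCRouter:
    """A router running the baseline Cache-n-Capture scheme."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = OrderedDict()   # content name -> content bytes, in LRU order

    def on_content(self, name, data):
        """A content file passes by on the return path: cache a copy."""
        self.cache[name] = data
        self.cache.move_to_end(name)
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)   # evict the least recently used file

    def on_request(self, name):
        """Capture the request if we hold a copy; else signal 'forward it'."""
        if name in self.cache:
            self.cache.move_to_end(name)     # refresh recency
            return self.cache[name]          # captured: reply locally
        return None                          # miss: forward toward the origin

r = CCRouter(capacity=2)
r.on_content("a", b"A")
r.on_content("b", b"B")
print(r.on_request("a"))   # b'A' -> captured locally
r.on_content("c", b"C")    # evicts "b", the least recently used entry
print(r.on_request("b"))   # None -> forwarded upstream
```

Content broadcast and coordinated caching extend this baseline: the former would advertise `self.cache` keys to one-hop neighbors, the latter would consult neighbors before the caching decision in `on_content`.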
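The exponential-smoothing estimator used to drive the distributed caching decisions amounts to a one-line update that blends each new observation into a running estimate. The smoothing factor of 0.2 and the per-interval request counts below are made-up values for illustration.

```python
def smooth(prev_estimate, new_observation, alpha=0.2):
    """Exponential smoothing: weight the latest observation by alpha."""
    return alpha * new_observation + (1 - alpha) * prev_estimate

# Estimate a content file's request rate from per-interval request counts.
estimate = 0.0
for count in [10, 12, 8, 30, 11]:
    estimate = smooth(estimate, count)
print(round(estimate, 2))
```

Because old observations decay geometrically, the estimator tracks shifts in content popularity while damping short bursts (note how the outlier count of 30 moves the estimate only gradually), which is what makes it suitable for a fully distributed implementation.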
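The gateway-controlled placement can be sketched with a stable hash that maps a content name to one of the AS-internal routers, so every request for the same name entering through the gateway is steered to the same caching node. The use of MD5 is only an illustrative choice of stable hash function, not the thesis's specific design.

```python
import hashlib

def caching_node(content_name, as_routers):
    """Gateway picks the caching router for a content file via hashing."""
    digest = hashlib.md5(content_name.encode()).hexdigest()
    index = int(digest, 16) % len(as_routers)
    return as_routers[index]

routers = ["r1", "r2", "r3", "r4"]
# The same name always hashes to the same router, so subsequent requests
# find the cached copy without any per-file lookup state at the gateway.
assert caching_node("movie.mp4", routers) == caching_node("movie.mp4", routers)
print(caching_node("movie.mp4", routers))
```

A uniform-caching-level variant would apply this mapping identically to every file, while a varying-level variant could, for example, hash popular files to several routers.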
Note: Ph.D.
Note: Includes bibliographical references
Note: Includes vita
Note: by Lijun Dong
Genre: theses, ETD doctoral
Language: eng
Collection: Graduate School - New Brunswick Electronic Theses and Dissertations
Organization Name: Rutgers, The State University of New Jersey
Rights: The author owns the copyright to this work.