TY - THES
TI - Distributed placement and resource orchestration of real-time edge computing applications
DO - https://doi.org/10.7282/t3-ryxr-6g54
PY - 2021
AB - The recent emergence of a broad class of deep learning-based augmented and virtual reality applications motivates the need for real-time mobile cloud services. These real-time mobile applications involve intensive computation over large data sets and are generally required to provide low end-to-end latency for acceptable quality of experience at the end user. The limited battery life, computation, and storage capacity inherent to mobile devices mean that application execution must be offloaded to cloud servers, which then return processed results to the mobile devices over the Internet. When cloud servers reside in remote data centers, end-to-end communication may incur the long delays characteristic of multi-hop transmissions over the Internet. Moving cloud computing to the edge of the network helps lessen these otherwise unacceptable delays while preserving the benefits of a high-performance cloud. While this improvement is significant, several technical challenges must still be addressed to achieve low end-to-end latency. In this thesis, we address the following problems. First, how can a real-time application be efficiently distributed among the mobile device, edge servers, and data center to meet latency constraints? Second, since the edge cloud architecture is inherently distributed and heterogeneous, how should resource allocation and task orchestration be performed under a latency-constrained design? Finally, existing cloud computing solutions often assume there exists a dedicated, powerful server to which an entire job can be offloaded. In practice, such a server may not be available, which motivates an investigation of techniques that use multiple less powerful edge servers to achieve parallel job offloading.
In the first part of the thesis, we take virtual reality massively multiplayer online games (VR MMOGs) as a driving example and design a hybrid service architecture that distributes the workload among the mobile devices, edge clouds, and the core cloud for low latency and global user scalability. We also propose an efficient service placement algorithm based on a Markov decision process to dynamically place a user's gaming service on edge clouds; this dynamic service placement further reduces latency under user mobility. In the second part of the thesis, we present the design and implementation of a latency-aware edge computing platform that aims to minimize end-to-end latency for edge applications. The platform is built on Apache Storm, an open-source distributed computing framework, and consists of multiple edge servers with heterogeneous computation (including both GPUs and CPUs) and networking resources. Central to the platform is an orchestration framework that decomposes an edge application into Storm tasks defined by a directed acyclic graph (DAG) and then maps these tasks onto heterogeneous edge servers for efficient execution. In the last part of the thesis, we take a closer look at compute-intensive, deep learning-based computer vision jobs. We propose to partition each video frame and offload the partial inference tasks to multiple servers for parallel processing. This work presents the design and implementation of Elf, a framework that accelerates mobile deep vision applications under any server provisioning through parallel offloading. Elf employs a recurrent region proposal prediction algorithm, a region-proposal-centric frame partitioning, and a multi-offloading scheme.
KW - Edge computing
KW - Electrical and Computer Engineering
LA - English
ER -