Technical advances are leading to a pervasive computational ecosystem that integrates computing infrastructures with embedded sensors and actuators. This ecosystem is giving rise to a new, information- and data-driven paradigm for monitoring, understanding, and managing natural and engineered systems.
This research investigates a programming system that can support such end-to-end, sensor-based, dynamic data-driven applications. Specifically, it enables these applications at two levels. First, it provides programming abstractions for integrating sensor systems with computational models for scientific and engineering processes and with other application components in an end-to-end experiment. Second, it provides programming abstractions and system software support for developing in-network data processing mechanisms. The former supports complex querying of the sensor system, while the latter enables the development of in-network data processing mechanisms such as aggregation, adaptive interpolation, and assimilation, both via semantically meaningful abstractions. For the latter, we exploit the temporal and spatial correlation of sensor measurements in the targeted application domains to trade off the complexity of coordination among sensor clusters against the savings that result from involving fewer sensors in in-network processing, while keeping the error within an acceptable threshold. Experimental results show that the proposed in-network mechanisms can facilitate the efficient use of constrained resources and satisfy data requirements in the presence of dynamics and uncertainty.
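The trade-off between using fewer sensors and staying within an error threshold can be illustrated with a minimal sketch. It is not the thesis's actual mechanism: the interpolation scheme (inverse-distance weighting), the greedy selection strategy, the one-dimensional sensor positions, and the names `idw_estimate`, `max_error`, and `select_sensors` are all assumptions made for illustration only.

```python
def idw_estimate(x, active, readings, power=2.0):
    """Inverse-distance-weighted estimate at position x from the active sensors.

    Assumption: sensors lie on a line and `readings` maps position -> value.
    """
    num = 0.0
    den = 0.0
    for pos in active:
        d = abs(x - pos)
        if d == 0:
            return readings[pos]  # x coincides with an active sensor
        w = 1.0 / d ** power
        num += w * readings[pos]
        den += w
    return num / den


def max_error(active, readings):
    """Largest absolute error when inactive sensors are estimated from active ones."""
    inactive = [p for p in readings if p not in active]
    return max(
        (abs(idw_estimate(p, active, readings) - readings[p]) for p in inactive),
        default=0.0,
    )


def select_sensors(readings, threshold):
    """Greedily deactivate sensors while the interpolation error stays <= threshold."""
    active = set(readings)
    while len(active) > 1:
        best, best_err = None, None
        for cand in sorted(active):
            err = max_error(active - {cand}, readings)
            if err <= threshold and (best_err is None or err < best_err):
                best, best_err = cand, err
        if best is None:
            break  # removing any further sensor would violate the threshold
        active.remove(best)
    return active


# Hypothetical readings: a smooth (constant) field and a rapidly varying one.
smooth = {0: 10.0, 1: 10.0, 2: 10.0, 3: 10.0}
rough = {0: 0.0, 1: 100.0, 2: 0.0, 3: 100.0}
print(len(select_sensors(smooth, threshold=0.1)))
print(len(select_sensors(rough, threshold=1.0)))
```

In this sketch, strongly correlated measurements let most sensors be left out of in-network processing, whereas weakly correlated measurements force all of them to stay active. A real deployment would also have to account for the coordination cost among sensor clusters, which the sketch ignores.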
The research presented in this thesis is evaluated using two application scenarios: (1) the management and optimization of an instrumented oil field and (2) the management and optimization of an instrumented data center. In the first scenario, the programming abstractions and system software solutions enable end-to-end management processes for detecting and tracking reservoir changes, assimilating and inverting data to determine reservoir properties, and providing feedback to enhance temporal and spatial resolution and to track other specific processes in the subsurface. The overall goal is to ensure near-optimal operation of the reservoir in terms of profitability, safety, and/or environmental impact. In the second scenario, the autonomic instrumented data center management system addresses the power consumption, heat generation, and cooling requirements of the data center, which become critical concerns as the scale of these computing environments grows. Experimental results show that the proposed programming system reduces overheads while achieving near-optimal and timely management and control in both application scenarios.