Abstract
(type = abstract)
As more people move into cities, human mobility faces new challenges such as traffic congestion and energy consumption, which are caused by dense population distribution, unbalanced infrastructure deployment, or an insufficient understanding of travel demand. It is therefore essential to improve the daily mobility of urban residents, which can be achieved by accurately modeling human mobility with ubiquitous data from heterogeneous urban sensing systems, e.g., on-board GPS systems in taxis, buses, and personal vehicles, and portable device systems such as cellphones. Existing studies of human mobility modeling are mostly built upon a single system. However, people in cities use multiple transportation modalities on a daily basis, so a single sensing system limits a comprehensive understanding and modeling of human mobility.
In this dissertation, we aim to model human mobility at the metropolitan scale by utilizing spatio-temporal data from heterogeneous sensing systems, already collected for billing or management purposes. We design, implement, and evaluate a data-driven framework named \textit{urbanSense} with three modules for human mobility modeling (e.g., travel distance, travel time, travel speed): (i) a \textit{sensing} module to collect and preprocess human mobility sensing data from 8 urban sensing systems across 3 domains (i.e., transportation, communication, and payment); (ii) a \textit{measurement} module, in which we present a measurement study named SysRep to measure the data bias of urban sensing systems for human mobility modeling. In SysRep, we quantify the data bias of an urban sensing system as its representativeness. We analyze potential reasons for differences in representativeness and find that representativeness is highly correlated with contextual factors such as population, mobility, and Points of Interest. We further design a correction model to improve the representativeness of sensing systems. The evaluation results show that the proposed correction model improves the representativeness of single systems by 45\%. (iii) a \textit{prediction} module to model human mobility from heterogeneous urban sensing systems. In particular, we present two works: MultiCell for real-time population modeling and MAC for travel time prediction. In MultiCell, we design two techniques to model real-time population from multiple cellular networks: a spatial alignment technique that aligns different spatial partitions into a uniform spatial partition, and a co-training technique that simultaneously learns the relation between the active cellphone users of different networks and the population distribution. MultiCell is implemented with Call Detail Records (CDR) from three major cellular networks in China, all in the same city, covering 100\% of cellphone users.
The evaluation results demonstrate the effectiveness of MultiCell, which reduces the modeling error by 27\% compared with state-of-the-art models. In MAC, we decompose the travel time of multiple transportation systems (i.e., subway, taxi, bus, and personal vehicle) into fine-grained components based on travel stages (e.g., walking, riding, and waiting time). Moreover, we design a time-series model based on the Long Short-Term Memory (LSTM) architecture to predict travel delay under the impact of different anomalies. We implement and evaluate MAC with data collected from 37 thousand vehicles and 5 million smart cards. The results show that MAC reduces the prediction error by 31\% compared with state-of-the-art methods. Finally, we discuss lessons learned and potential applications of our framework.