Description
Advances in technology have provided ways to measure driving behavior. Recently, this technology has been applied to usage-based automotive insurance: policyholders may opt in to monitoring in the hope of reduced insurance premiums. Although some of these monitoring devices rely on GPS information and offer no location privacy protections, several companies are aware of the privacy concerns and therefore measure only speed data. But does collecting only speed data really preserve privacy? Our work investigates how much location information can actually be obtained from speed data and why speed data should also be protected against malicious third parties. In this thesis, we present an algorithm that tracks drivers’ locations when only the speed data and starting locations are known. The starting locations are mostly home addresses, which insurance companies already know. The algorithm fits the speed data to paths on a map and evaluates which path is most likely the actual driving route. To demonstrate the algorithm’s real-world applicability, we evaluated its performance on driving datasets from New Jersey and Seattle, Washington, representing suburban and urban areas.
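As a rough illustration of why speed data alone carries location information, integrating speed over time yields the distance traveled, which limits how far along any road a driver can be at a given moment. The sketch below is not the thesis's algorithm; the function name `distance_traveled`, the units (seconds, meters per second), and the trapezoidal rule are all assumptions made for illustration.

```python
def distance_traveled(timestamps_s, speeds_mps):
    """Return cumulative distance (meters) at each sample.

    Hypothetical helper: integrates speed over time with the
    trapezoidal rule. Units (seconds, m/s) are assumed.
    """
    if len(timestamps_s) != len(speeds_mps):
        raise ValueError("timestamps and speeds must have the same length")
    if not timestamps_s:
        return []

    cumulative = [0.0]
    for i in range(1, len(timestamps_s)):
        dt = timestamps_s[i] - timestamps_s[i - 1]
        # Average of the two neighboring speed samples over the interval.
        avg_speed = (speeds_mps[i] + speeds_mps[i - 1]) / 2.0
        cumulative.append(cumulative[-1] + avg_speed * dt)
    return cumulative


if __name__ == "__main__":
    # Accelerate from rest to 10 m/s in one second, then hold for one second.
    print(distance_traveled([0, 1, 2], [0.0, 10.0, 10.0]))
```

Given a known starting location, a cumulative distance profile like this can be compared against road segment lengths on a map to rule out paths that are too short or too long for the trace.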
We present the Elastic Pathing algorithm for tracking drivers, an enhanced version of the Elastic Pathing algorithm with several optimizations, and a final machine learning approach that learns how a speed pattern can indicate the driving direction. Our Elastic Pathing algorithm estimates destinations with an error within 250 meters for 17% of traces and within 500 meters for 24% of traces in the New Jersey dataset (254 traces). For the Seattle dataset (691 traces), we similarly estimated destinations with an error within 250 and 500 meters for 16% and 28% of the traces, respectively. Finally, motivated by the challenges encountered with the previous approach, we designed and implemented the machine learning approach for the New Jersey dataset in order to achieve higher accuracy. With machine learning, our algorithm estimated destinations with an error within 250 and 500 meters for 25% and 30% of traces, respectively, in our New Jersey dataset. This work shows that speed data enable a substantial breach of privacy.
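The accuracy figures above are fractions of traces whose estimated destination falls within a given distance of the true destination. A minimal sketch of that kind of metric follows, assuming destinations are given as latitude/longitude pairs and that error is measured as great-circle distance; the thesis may measure error differently, and the names `haversine_m` and `fraction_within` are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; an assumed constant


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def fraction_within(errors_m, threshold_m):
    """Fraction of traces whose destination error is at most threshold_m."""
    if not errors_m:
        raise ValueError("need at least one trace")
    hits = sum(1 for e in errors_m if e <= threshold_m)
    return hits / len(errors_m)


if __name__ == "__main__":
    # Made-up destination errors in meters, for illustration only.
    errors = [100.0, 300.0, 600.0, 1000.0]
    print(fraction_within(errors, 250.0))
    print(fraction_within(errors, 500.0))
```

Reporting the fraction at two thresholds, as the abstract does with 250 and 500 meters, shows how quickly accuracy improves as the tolerance is relaxed.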