Visual and wireless sensing, two popular sensing modalities in multi-modal systems, have complementary characteristics. While vision sensors provide rich and accurate spatial measurements through RGB-D information, they have drawbacks such as a limited field of view, vulnerability to occlusion, and poor performance in low-illumination conditions. Wireless sensing, on the other hand, suffers less from appearance variation and can operate non-line-of-sight, but its ranging performance is degraded by multipath and shadowing in complex environments.
Sensor fusion and association in vision-wireless systems create a "reality-aware network" in which the strengths of each sensing modality complement the other. In this thesis we propose to fuse wireless communication with visual sensing to improve system sensing range, tracking, and localization. We design and implement sensor fusion and association mechanisms for systems including Vehicle-to-Vehicle (V2V) and Vehicle-to-Everything (V2X) communication, as well as localization and pedestrian tracking.
Advanced driver assistance systems benefit from a complete understanding of the traffic scene around a vehicle. Existing systems gather data through cameras and other on-board sensors, but scene understanding can be limited by sensing range or by occlusion from other objects. To gather information beyond the view of a single vehicle, we propose a connected vehicle system that allows multiple moving vehicles to share perception data over vehicle-to-vehicle communication and collaboratively fuse the data into a more complete traffic scene.
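As a minimal illustration of this kind of collaborative fusion, the sketch below (with hypothetical function names and message contents, not the exact V2V protocol developed in the thesis) transforms detections received from a remote vehicle into the ego vehicle's coordinate frame using the sender's shared pose, then merges non-duplicate detections into one scene list.

    import numpy as np

    def to_ego_frame(detections_xy, sender_pose):
        """Transform 2D detections from the sender's local frame into the ego frame.
        sender_pose = (x, y, heading) of the sender expressed in the ego frame (assumed shared over V2V)."""
        x, y, theta = sender_pose
        R = np.array([[np.cos(theta), -np.sin(theta)],
                      [np.sin(theta),  np.cos(theta)]])
        return detections_xy @ R.T + np.array([x, y])

    def merge_scenes(ego_dets, remote_dets, radius=1.0):
        """Greedy merge: keep ego detections, add remote detections that do not
        duplicate an existing object within `radius` meters."""
        merged = list(ego_dets)
        for d in remote_dets:
            if all(np.linalg.norm(d - e) > radius for e in merged):
                merged.append(d)
        return np.array(merged)

    # Example: one remote detection duplicates an ego detection, one extends the scene.
    ego = np.array([[5.0, 1.0]])
    remote_local = np.array([[2.0, 0.0], [20.0, 3.0]])    # in the sender's frame
    remote_ego = to_ego_frame(remote_local, sender_pose=(3.0, 1.0, 0.0))
    scene = merge_scenes(ego, remote_ego)                 # -> 2 objects, not 3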
Beyond fusing vision data via wireless communication, associating vision data with wireless data is another fundamental need in multi-modal applications. Successful vision-wireless association enables use cases such as localization by fusing camera depth measurements with wireless ranging. It can also improve tracking and re-identification, since wireless transmitters provide a stable identifier. Existing approaches to visual-wireless data association rely on appearance-based fingerprinting, focus on controlled scenarios where participants are always visible and no passersby exist, or formulate optimization problems over long measurement sequences that require post-processing. To achieve robust association between vision and wireless data, we propose a multi-modal system that leverages users' depth measurements, smartphone WiFi Fine Timing Measurements (FTM), and inertial measurement unit (IMU) sensor data to associate users detected in camera footage with their corresponding smartphone identifiers.
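One simple way to frame such an association, shown in the hedged sketch below (it assumes both camera depth and FTM yield a range to each person over a short time window; names and the cost function are illustrative, not the thesis implementation), is to build a cost matrix between camera-derived and FTM-derived range sequences and solve a bipartite assignment:

    import numpy as np
    from scipy.optimize import linear_sum_assignment

    def associate(camera_ranges, ftm_ranges):
        """camera_ranges: (num_people, T) ranges from camera depth per detected person.
        ftm_ranges:      (num_phones, T) ranges from WiFi FTM per smartphone.
        Returns a list of (person_index, phone_index) pairs."""
        # Cost = mean absolute difference between the two range sequences.
        cost = np.abs(camera_ranges[:, None, :] - ftm_ranges[None, :, :]).mean(axis=2)
        rows, cols = linear_sum_assignment(cost)
        return list(zip(rows, cols))

    # Toy example: two people, two phones, four time steps (meters).
    cam = np.array([[3.0, 3.2, 3.5, 3.7],
                    [6.0, 5.8, 5.5, 5.2]])
    ftm = np.array([[6.1, 5.9, 5.6, 5.1],    # phone 0 moves with person 1
                    [3.1, 3.1, 3.6, 3.8]])   # phone 1 moves with person 0
    print(associate(cam, ftm))               # -> [(0, 1), (1, 0)]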
Furthermore, we propose a multi-modal localization approach that leverages pedestrians' visual and phone data to accurately estimate their positions. Existing localization approaches adopt filtering techniques to fuse multi-modal sensor data and produce location estimates. In our context, however, these algorithms become infeasible when a pedestrian's camera measurement is unavailable due to occlusion or the camera's limited field of view. To address this limitation, we propose a Generative Adversarial Network that leverages the available data correspondences from the vision and phone modalities to learn the underlying cross-modal linkage. With a pedestrian's phone measurements as input, the network generates coordinate estimates that are more accurate than the phone's original GPS readings. We further show that the proposed model supports self-learning: the generated coordinates can be associated with pedestrians' bounding-box coordinates to obtain additional camera-phone data correspondences during inference.
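A minimal sketch of this cross-modal idea is given below, assuming phone features (GPS plus a few IMU-derived values) as generator input and camera-derived coordinates as the real samples; the architecture, dimensions, and names are illustrative assumptions, not the exact network in the thesis.

    import torch
    import torch.nn as nn

    PHONE_DIM = 6   # assumed: GPS lat/lon plus a few IMU features
    COORD_DIM = 2   # refined (x, y) position

    class Generator(nn.Module):
        """Maps phone measurements (plus noise) to a coordinate estimate."""
        def __init__(self, noise_dim=8):
            super().__init__()
            self.noise_dim = noise_dim
            self.net = nn.Sequential(
                nn.Linear(PHONE_DIM + noise_dim, 64), nn.ReLU(),
                nn.Linear(64, 64), nn.ReLU(),
                nn.Linear(64, COORD_DIM),
            )

        def forward(self, phone):
            z = torch.randn(phone.size(0), self.noise_dim)
            return self.net(torch.cat([phone, z], dim=1))

    class Discriminator(nn.Module):
        """Scores whether a (phone, coordinate) pair looks like a real correspondence."""
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(PHONE_DIM + COORD_DIM, 64), nn.ReLU(),
                nn.Linear(64, 1), nn.Sigmoid(),
            )

        def forward(self, phone, coord):
            return self.net(torch.cat([phone, coord], dim=1))

    # During training, real pairs come from frames where the pedestrian is visible
    # (camera coordinates associated with phone data); at inference the generator
    # alone produces coordinates from phone measurements when the camera view is lost.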