Abavisani, Mahdi. Sparse and low-rank representation-based methods for multimodal clustering and recognition. Retrieved from https://doi.org/doi:10.7282/t3-ysvh-8d53
Recent advances in technology have provided massive amounts of complex, high-dimensional, and multimodal data for computer vision and machine learning applications. This thesis uses sparse and low-rank representation-based techniques to introduce several approaches for leveraging the complementary information in multimodal and high-dimensional data for clustering and recognition tasks.

We start with a focus on subspace clustering algorithms. We extend the popular sparse and low-rank subspace clustering methods to multimodal subspace clustering algorithms that can integrate multiple high-dimensional modalities and represent them in low-dimensional joint subspaces. We then use convolutional neural networks (CNNs) to improve the proposed multimodal subspace clustering methods and develop deep multimodal subspace clustering networks. Furthermore, we design a framework for incorporating data augmentation techniques into subspace clustering networks.

In the second part of the thesis, we focus on developing multimodal classification approaches. We begin by introducing deep sparse representation-based classification (DSRC) and extending it to a multimodal version. We then propose novel approaches for two real-world applications with high-dimensional and multimodal data. First, we introduce a method that leverages the knowledge of multiple video streams in dynamic hand gesture recognition tasks and embeds that knowledge in each unimodal network. As a result, we improve the accuracy of the unimodal networks at test time while they continue to run in real time. Our second applied approach is a fusion method for combining the information in the texts and images of social media posts. Both texts and images are high-dimensional data, and in social media posts they can sometimes be uninformative or even misleading.
We present a method that filters out uninformative parts of text-image pairs and leverages their complementary information to detect crisis events in social media posts. Finally, we discuss possible future research directions.
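The self-expressiveness idea underlying the sparse subspace clustering methods mentioned above — each data point is written as a sparse combination of the other points, and the magnitudes of the resulting coefficients define an affinity graph that is then partitioned — can be illustrated with a minimal sketch. This is a generic, pure-NumPy illustration of the classical single-modality formulation, not the thesis's multimodal or deep variants: a simple ISTA loop stands in for an off-the-shelf Lasso solver, and grouping by connected components stands in for the spectral clustering step; all function names here are ours.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of the L1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sparse_codes(X, lam=0.05, n_iter=500):
    """Self-expressive sparse coding: column i of X is approximated as
    X @ C[:, i] with diag(C) = 0, solved per column by ISTA
    (proximal gradient descent on the Lasso objective)."""
    d, n = X.shape
    C = np.zeros((n, n))
    for i in range(n):
        idx = np.arange(n) != i          # exclude the point itself
        A, y = X[:, idx], X[:, i]
        L = np.linalg.norm(A, 2) ** 2 + 1e-12  # Lipschitz constant of the gradient
        w = np.zeros(idx.sum())
        for _ in range(n_iter):
            w = soft_threshold(w - (A.T @ (A @ w - y)) / L, lam / L)
        C[idx, i] = w
    return C

def components(W, tol=1e-8):
    """Group points by connected components of the affinity graph
    (the full method runs spectral clustering on W instead)."""
    n = W.shape[0]
    labels = -np.ones(n, dtype=int)
    cur = 0
    for s in range(n):
        if labels[s] >= 0:
            continue
        stack = [s]
        while stack:
            u = stack.pop()
            if labels[u] >= 0:
                continue
            labels[u] = cur
            stack.extend(np.nonzero(W[u] > tol)[0].tolist())
        cur += 1
    return labels

# Toy data: six points drawn from two 1-D subspaces of R^3.
X = np.array([[1., 2., 3., 0., 0., 0.],
              [0., 0., 0., 1., 2., 3.],
              [0., 0., 0., 0., 0., 0.]])
C = sparse_codes(X)
W = np.abs(C) + np.abs(C).T              # symmetric affinity matrix
labels = components(W)
```

Because each point can be reconstructed sparsely using only points from its own subspace, the affinity matrix comes out block-diagonal and the two subspaces are recovered as the two graph components. The multimodal extensions in the thesis couple such representations across modalities so that all modalities share the segmentation.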