Staff View
Adaptive multimodal integration of speech and gaze

Descriptive

TypeOfResource
Text
TitleInfo (ID = T-1)
Title
Adaptive multimodal integration of speech and gaze
SubTitle
PartName
PartNumber
NonSort
Identifier (displayLabel = ); (invalid = )
ETD_1979
Identifier (type = hdl)
http://hdl.rutgers.edu/1782.2/rucore10001600001.ETD.000051872
Language (objectPart = )
LanguageTerm (authority = ISO639-2); (type = code)
eng
Genre (authority = marcgt)
theses
Subject (ID = SBJ-1); (authority = RUETD)
Topic
Electrical and Computer Engineering
Subject (ID = SBJ-2); (authority = ETD-LCSH)
Topic
Human-computer interaction
Subject (ID = SBJ-3); (authority = ETD-LCSH)
Topic
User interfaces (Computer systems)
Abstract
Speech has been used as the foundation for many human/machine interactive systems to convey the user’s intent to the system. However, other input mechanisms, commonly called modalities, such as gaze, touch, and hand gestures, have been explored as a means of providing a more robust interaction in environments where speech alone is not adequate. By combining the inputs from multiple, complementary modalities, none of which is perfectly reliable, a better understanding of the user’s true intent can be imparted to the system. In this dissertation, the effectiveness of using gaze (where someone is looking) to aid speech in providing the user’s intent to the machine is explored. Two human factors experiments were conducted to collect data for building a speech and gaze integration model. The first experiment had the user read a single word displayed on a screen, and the second experiment required the user to read a designated word from a menu of words. Speech onset time and the user’s gaze pattern data were captured and analyzed to understand the timing relations between the two modalities. A set of gaze/speech features was extracted from the data and used to predict the location of the word that the user read. The best features and the best model for predicting the location of the target word were found through an iterative trial-and-error process. A linear model was able to predict the gaze location of the target as well as any of the non-linear models considered. The linear system representation was then used to create an adaptive model using the Row Action Projection (RAP) technique. The RAP adaptation model was found to predict the user’s intent with higher probability for the majority of subjects than the non-adaptive approaches. The RAP model adapted to the speech/gaze patterns of each individual user as well as to the variation in a single user’s interaction behavior over time. It was also found that the feature set used for successfully identifying the target in Experiment 1, a simple isolated word task, was different from that used in Experiment 2, a more complex menu selection task, suggesting that task complexity is an important consideration in the design of a speech/gaze interface. In summary, this dissertation has shown that an adaptive gaze and speech integration model performs better than speech or gaze alone.
PhysicalDescription
Form (authority = gmd)
electronic resource
Extent
xiii, 167 p. : ill.
InternetMediaType
application/pdf
InternetMediaType
text/xml
Note (type = degree)
Ph.D.
Note (type = bibliography)
Includes bibliographical references (p. 155-166)
Note (type = statement of responsibility)
by Chandra Sekhar Mantravadi
Name (ID = NAME-1); (type = personal)
NamePart (type = family)
Mantravadi
NamePart (type = given)
Chandra Sekhar
NamePart (type = date)
1972-
Role
RoleTerm (authority = RULIB); (type = )
author
DisplayForm
Chandra Sekhar Mantravadi
Name (ID = NAME-2); (type = personal)
NamePart (type = family)
Wilder
NamePart (type = given)
Joseph
Role
RoleTerm (authority = RULIB); (type = )
chair
Affiliation
Advisory Committee
DisplayForm
Joseph Wilder
Name (ID = NAME-3); (type = personal)
NamePart (type = family)
Tremaine
NamePart (type = given)
Marilyn
Role
RoleTerm (authority = RULIB); (type = )
co-chair
Affiliation
Advisory Committee
DisplayForm
Marilyn Tremaine
Name (ID = NAME-4); (type = personal)
NamePart (type = family)
Mammone
NamePart (type = given)
Richard
Role
RoleTerm (authority = RULIB); (type = )
internal member
Affiliation
Advisory Committee
DisplayForm
Richard Mammone
Name (ID = NAME-5); (type = personal)
NamePart (type = family)
Rabiner
NamePart (type = given)
Lawrence
Role
RoleTerm (authority = RULIB); (type = )
internal member
Affiliation
Advisory Committee
DisplayForm
Lawrence Rabiner
Name (ID = NAME-6); (type = personal)
NamePart (type = family)
Marsic
NamePart (type = given)
Ivan
Role
RoleTerm (authority = RULIB); (type = )
internal member
Affiliation
Advisory Committee
DisplayForm
Ivan Marsic
Name (ID = NAME-7); (type = personal)
NamePart (type = family)
Gwizdka
NamePart (type = given)
Jacek
Role
RoleTerm (authority = RULIB); (type = )
outside member
Affiliation
Advisory Committee
DisplayForm
Jacek Gwizdka
Name (ID = NAME-1); (type = corporate)
NamePart
Rutgers University
Role
RoleTerm (authority = RULIB); (type = )
degree grantor
Name (ID = NAME-2); (type = corporate)
NamePart
Graduate School - New Brunswick
Role
RoleTerm (authority = RULIB); (type = )
school
OriginInfo
DateCreated (point = ); (qualifier = exact)
2009
DateOther (qualifier = exact); (type = degree)
2009-10
Place
PlaceTerm (type = code)
xx
RelatedItem (type = host)
TitleInfo
Title
Rutgers University Electronic Theses and Dissertations
Identifier (type = RULIB)
ETD
RelatedItem (type = host)
TitleInfo
Title
Graduate School - New Brunswick Electronic Theses and Dissertations
Identifier (type = local)
rucore19991600001
Location
PhysicalLocation (authority = marcorg); (displayLabel = Rutgers, The State University of New Jersey)
NjNbRU
Identifier (type = doi)
doi:10.7282/T3QC03PM
Genre (authority = ExL-Esploro)
ETD doctoral

Rights

RightsDeclaration (AUTHORITY = GS); (ID = rulibRdec0006)
The author owns the copyright to this work
Copyright
Status
Copyright protected
Notice
Note
Availability
Status
Open
Reason
Permission or license
Note
RightsHolder (ID = PRH-1); (type = personal)
Name
FamilyName
Mantravadi
GivenName
Chandra
Role
Copyright holder
RightsEvent (ID = RE-1); (AUTHORITY = rulib)
Type
Permission or license
Label
Place
DateTime
Detail
AssociatedEntity (ID = AE-1); (AUTHORITY = rulib)
Role
Copyright holder
Name
Chandra Mantravadi
Affiliation
Rutgers University. Graduate School - New Brunswick
AssociatedObject (ID = AO-1); (AUTHORITY = rulib)
Type
License
Name
Author Agreement License
Detail
I hereby grant to the Rutgers University Libraries and to my school the non-exclusive right to archive, reproduce and distribute my thesis or dissertation, in whole or in part, and/or my abstract, in whole or in part, in and from an electronic format, subject to the release date subsequently stipulated in this submittal form and approved by my school. I represent and stipulate that the thesis or dissertation and its abstract are my original work, that they do not infringe or violate any rights of others, and that I make these grants as the sole owner of the rights to my thesis or dissertation and its abstract. I represent that I have obtained written permissions, when necessary, from the owner(s) of each third party copyrighted matter to be included in my thesis or dissertation and will supply copies of such upon request by my school. I acknowledge that RU ETD and my school will not distribute my thesis or dissertation or its abstract if, in their reasonable judgment, they believe all such rights have not been secured. I acknowledge that I retain ownership rights to the copyright of my work. I also retain the right to use all or part of this thesis or dissertation in future works, such as articles or books.

Technical

ContentModel
ETD
MimeType (TYPE = file)
application/pdf
MimeType (TYPE = container)
application/x-tar
FileSize (UNIT = bytes)
2365440
Checksum (METHOD = SHA1)
ee65c21f3bb4ecd2ed2bc1970d269600e3c76e9d