Staff View
A unifying framework for computational reinforcement learning theory

Descriptive

TypeOfResource
Text
TitleInfo (ID = T-1)
Title
A unifying framework for computational reinforcement learning theory
SubTitle
PartName
PartNumber
NonSort
Identifier (displayLabel = ); (invalid = )
ETD_1766
Identifier (type = hdl)
http://hdl.rutgers.edu/1782.2/rucore10001600001.ETD.000051858
Language (objectPart = )
LanguageTerm (authority = ISO639-2); (type = code)
eng
Genre (authority = marcgt)
theses
Subject (ID = SBJ-1); (authority = RUETD)
Topic
Computer Science
Subject (ID = SBJ-2); (authority = ETD-LCSH)
Topic
Computational learning theory
Subject (ID = SBJ-3); (authority = ETD-LCSH)
Topic
Reinforcement learning
Subject (ID = SBJ-4); (authority = ETD-LCSH)
Topic
Machine learning
Abstract
Computational learning theory studies mathematical models that allow one to formally analyze and compare supervised-learning algorithms in terms of measures such as sample complexity. While existing models such as PAC (Probably Approximately Correct) have played an influential role in understanding the nature of supervised learning, they have not been as successful in reinforcement learning (RL). Here, the fundamental barrier is the need for active exploration in sequential decision problems.
An RL agent tries to maximize long-term utility by exploiting its knowledge about the problem, but this knowledge must be acquired by the agent itself through exploration, which may reduce short-term utility. The need for active exploration is common in many problems in daily life, engineering, and the sciences. For example, a Backgammon program strives to make good moves to maximize the probability of winning a game, but it may sometimes try novel and possibly harmful moves to discover how the opponent reacts, in the hope of finding a better game-playing strategy. It has been known since the early days of RL that a good tradeoff between exploration and exploitation is critical for the agent to learn fast (i.e., to reach near-optimal strategies with small sample complexity), but a general theoretical analysis of this tradeoff remained open until recently.
In this dissertation, we introduce a novel computational learning model called KWIK (Knows What It Knows), designed specifically for analyzing learning problems, such as RL, in which active exploration affects the training data the learner observes. My thesis is that the KWIK learning model provides a flexible, modularized, and unifying way to create and analyze reinforcement-learning algorithms with provably efficient exploration. In particular, we show how the KWIK perspective unifies the analysis of existing RL algorithms with polynomial sample complexity. It also facilitates the development of new algorithms with smaller sample complexity, which have empirically demonstrated faster learning in real-world problems. Furthermore, we provide an improved, matching sample-complexity lower bound, which suggests the optimality (in a certain sense) of one of the KWIK-based algorithms, known as Delayed Q-learning.
PhysicalDescription
Form (authority = gmd)
electronic resource
Extent
xvii, 264 p. : ill.
InternetMediaType
application/pdf
InternetMediaType
text/xml
Note (type = degree)
Ph.D.
Note (type = bibliography)
Includes bibliographical references (p. 238-261)
Note (type = statement of responsibility)
by Lihong Li
Name (ID = NAME-1); (type = personal)
NamePart (type = family)
Li
NamePart (type = given)
Lihong
NamePart (type = date)
1979-
Role
RoleTerm (authority = RULIB); (type = )
author
DisplayForm
Lihong Li
Name (ID = NAME-2); (type = personal)
NamePart (type = family)
Littman
NamePart (type = given)
Michael
Role
RoleTerm (authority = RULIB); (type = )
chair
Affiliation
Advisory Committee
DisplayForm
Michael L. Littman
Name (ID = NAME-3); (type = personal)
NamePart (type = family)
Pazzani
NamePart (type = given)
Michael
Role
RoleTerm (authority = RULIB); (type = )
internal member
Affiliation
Advisory Committee
DisplayForm
Michael J. Pazzani
Name (ID = NAME-4); (type = personal)
NamePart (type = family)
Szegedy
NamePart (type = given)
Mario
Role
RoleTerm (authority = RULIB); (type = )
internal member
Affiliation
Advisory Committee
DisplayForm
Mario Szegedy
Name (ID = NAME-5); (type = personal)
NamePart (type = family)
Schapire
NamePart (type = given)
Robert
Role
RoleTerm (authority = RULIB); (type = )
outside member
Affiliation
Advisory Committee
DisplayForm
Robert E. Schapire
Name (ID = NAME-1); (type = corporate)
NamePart
Rutgers University
Role
RoleTerm (authority = RULIB); (type = )
degree grantor
Name (ID = NAME-2); (type = corporate)
NamePart
Graduate School - New Brunswick
Role
RoleTerm (authority = RULIB); (type = )
school
OriginInfo
DateCreated (point = ); (qualifier = exact)
2009
DateOther (qualifier = exact); (type = degree)
2009-10
Place
PlaceTerm (type = code)
xx
RelatedItem (type = host)
TitleInfo
Title
Rutgers University Electronic Theses and Dissertations
Identifier (type = RULIB)
ETD
RelatedItem (type = host)
TitleInfo
Title
Graduate School - New Brunswick Electronic Theses and Dissertations
Identifier (type = local)
rucore19991600001
Location
PhysicalLocation (authority = marcorg); (displayLabel = Rutgers, The State University of New Jersey)
NjNbRU
Identifier (type = doi)
doi:10.7282/T3S46S46
Genre (authority = ExL-Esploro)
ETD doctoral

Rights

RightsDeclaration (AUTHORITY = GS); (ID = rulibRdec0006)
The author owns the copyright to this work
Copyright
Status
Copyright protected
Notice
Note
Availability
Status
Open
Reason
Permission or license
Note
RightsHolder (ID = PRH-1); (type = personal)
Name
FamilyName
Li
GivenName
Lihong
Role
Copyright holder
RightsEvent (ID = RE-1); (AUTHORITY = rulib)
Type
Permission or license
Label
Place
DateTime
Detail
AssociatedEntity (ID = AE-1); (AUTHORITY = rulib)
Role
Copyright holder
Name
Lihong Li
Affiliation
Rutgers University. Graduate School - New Brunswick
AssociatedObject (ID = AO-1); (AUTHORITY = rulib)
Type
License
Name
Author Agreement License
Detail
I hereby grant to the Rutgers University Libraries and to my school the non-exclusive right to archive, reproduce and distribute my thesis or dissertation, in whole or in part, and/or my abstract, in whole or in part, in and from an electronic format, subject to the release date subsequently stipulated in this submittal form and approved by my school. I represent and stipulate that the thesis or dissertation and its abstract are my original work, that they do not infringe or violate any rights of others, and that I make these grants as the sole owner of the rights to my thesis or dissertation and its abstract. I represent that I have obtained written permissions, when necessary, from the owner(s) of each third party copyrighted matter to be included in my thesis or dissertation and will supply copies of such upon request by my school. I acknowledge that RU ETD and my school will not distribute my thesis or dissertation or its abstract if, in their reasonable judgment, they believe all such rights have not been secured. I acknowledge that I retain ownership rights to the copyright of my work. I also retain the right to use all or part of this thesis or dissertation in future works, such as articles or books.

Technical

ContentModel
ETD
MimeType (TYPE = file)
application/pdf
MimeType (TYPE = container)
application/x-tar
FileSize (UNIT = bytes)
2826240
Checksum (METHOD = SHA1)
ec6186a968f05092840fa87acce903aaef59ecd6