Staff View
Probably Approximately Correct (PAC) exploration in reinforcement learning

Descriptive

TitleInfo (displayLabel = Citation Title); (type = uniform)
Title
Probably Approximately Correct (PAC) exploration in reinforcement learning
Name (ID = NAME001); (type = personal)
NamePart (type = family)
Strehl
NamePart (type = given)
Alexander L.
DisplayForm
Alexander L. Strehl
Role
RoleTerm (authority = RUETD)
author
Name (ID = NAME002); (type = personal)
NamePart (type = family)
Littman
NamePart (type = given)
Michael
Affiliation
Advisory Committee
DisplayForm
Michael L. Littman
Role
RoleTerm (authority = RULIB)
chair
Name (ID = NAME003); (type = personal)
NamePart (type = family)
Hirsh
NamePart (type = given)
Haym
Affiliation
Advisory Committee
DisplayForm
Haym Hirsh
Role
RoleTerm (authority = RULIB)
internal member
Name (ID = NAME004); (type = personal)
NamePart (type = family)
Szegedy
NamePart (type = given)
Mario
Affiliation
Advisory Committee
DisplayForm
Mario Szegedy
Role
RoleTerm (authority = RULIB)
internal member
Name (ID = NAME005); (type = personal)
NamePart (type = family)
Kearns
NamePart (type = given)
Michael
Affiliation
Advisory Committee
DisplayForm
Michael Kearns
Role
RoleTerm (authority = RULIB)
outside member
Name (ID = NAME006); (type = corporate)
NamePart
Rutgers University
Role
RoleTerm (authority = RULIB)
degree grantor
Name (ID = NAME007); (type = corporate)
NamePart
Graduate School - New Brunswick
Role
RoleTerm (authority = RULIB)
school
TypeOfResource
Text
Genre (authority = marcgt)
theses
OriginInfo
DateCreated (qualifier = exact)
2007
DateOther (qualifier = exact); (type = degree)
2007
Language
LanguageTerm
English
PhysicalDescription
Form (authority = marcform)
electronic
InternetMediaType
application/pdf
InternetMediaType
text/xml
Extent
viii, 137 pages
Abstract
Reinforcement Learning (RL) in finite state and action Markov Decision Processes (MDPs) is studied with an emphasis on the well-studied exploration problem. We provide a general RL framework that applies to all results in this thesis and to other results in RL that generalize the finite MDP assumption. We present two new versions of the Model-Based Interval Estimation (MBIE) algorithm and prove that they are both PAC-MDP. These algorithms are provably more efficient than any previously studied RL algorithms. We prove that many model-based algorithms (including R-MAX and MBIE) can be modified so that their worst-case per-step computational complexity is vastly improved without sacrificing their attractive theoretical guarantees. We show that it is possible to obtain PAC-MDP bounds with a model-free algorithm called Delayed Q-learning.
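As a gloss on the abstract, the following minimal Python sketch illustrates the delayed-update idea behind Delayed Q-learning: one-step targets are accumulated, and Q(s, a) is revised only after m samples, and only if the revision lowers it substantially. The class name, the parameter names (m, eps1, gamma), and the omission of the algorithm's LEARN-flag bookkeeping are simplifications made for illustration here; this is not the thesis's exact pseudocode.

from collections import defaultdict

class DelayedQLearner:
    def __init__(self, n_states, n_actions, gamma=0.95, m=20, eps1=0.01):
        self.gamma, self.m, self.eps1 = gamma, m, eps1
        # Optimistic initialization: with rewards in [0, 1], no value
        # exceeds 1 / (1 - gamma), so untried pairs look maximally good.
        self.q = [[1.0 / (1.0 - gamma)] * n_actions for _ in range(n_states)]
        self.acc = defaultdict(float)   # accumulated one-step targets per (s, a)
        self.count = defaultdict(int)   # samples gathered per (s, a)

    def act(self, s):
        # Greedy action; optimism in q drives exploration of untried pairs.
        return max(range(len(self.q[s])), key=lambda a: self.q[s][a])

    def observe(self, s, a, r, s_next):
        # Accumulate the one-step target instead of updating immediately.
        self.acc[(s, a)] += r + self.gamma * max(self.q[s_next])
        self.count[(s, a)] += 1
        if self.count[(s, a)] == self.m:
            target = self.acc[(s, a)] / self.m
            # An attempted update succeeds only if it lowers Q(s, a) by at
            # least 2 * eps1, which bounds the number of successful updates.
            if self.q[s][a] - target >= 2 * self.eps1:
                self.q[s][a] = target + self.eps1
            self.acc[(s, a)] = 0.0
            self.count[(s, a)] = 0

In use, an agent would call act(s) to pick an action and observe(s, a, r, s_next) after each transition; the delayed, batched update is what makes the analysis model-free yet still PAC-MDP.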
Note (type = degree)
Ph.D.
Note (type = bibliography)
Includes bibliographical references (p. 133-136).
Subject (ID = SUBJ1); (authority = RUETD)
Topic
Computer Science
Subject (ID = SUBJ2); (authority = ETD-LCSH)
Topic
Learning models (Stochastic processes)
Subject (ID = SUBJ3); (authority = ETD-LCSH)
Topic
Learning--Mathematical models
Subject (ID = SUBJ4); (authority = ETD-LCSH)
Topic
Machine learning
RelatedItem (type = host)
TitleInfo
Title
Graduate School - New Brunswick Electronic Theses and Dissertations
Identifier (type = local)
rucore19991600001
Identifier (type = hdl)
http://hdl.rutgers.edu/1782.2/rucore10001600001.ETD.16785
Identifier
ETD_462
Location
PhysicalLocation (authority = marcorg); (displayLabel = Rutgers, The State University of New Jersey)
NjNbRU
Identifier (type = doi)
doi:10.7282/T3Z3202G
Genre (authority = ExL-Esploro)
ETD doctoral

Rights

RightsDeclaration (AUTHORITY = GS); (ID = rulibRdec0006)
The author owns the copyright to this work.
Copyright
Status
Copyright protected
Availability
Status
Open
AssociatedEntity (AUTHORITY = rulib); (ID = 1)
Name
Alexander Strehl
Role
Copyright holder
Affiliation
Rutgers University. Graduate School - New Brunswick
RightsEvent (AUTHORITY = rulib); (ID = 1)
Type
Permission or license
Detail
Non-exclusive ETD license
AssociatedObject (AUTHORITY = rulib); (ID = 1)
Type
License
Name
Author Agreement License
Detail
I hereby grant to the Rutgers University Libraries and to my school the non-exclusive right to archive, reproduce and distribute my thesis or dissertation, in whole or in part, and/or my abstract, in whole or in part, in and from an electronic format, subject to the release date subsequently stipulated in this submittal form and approved by my school. I represent and stipulate that the thesis or dissertation and its abstract are my original work, that they do not infringe or violate any rights of others, and that I make these grants as the sole owner of the rights to my thesis or dissertation and its abstract. I represent that I have obtained written permissions, when necessary, from the owner(s) of each third party copyrighted matter to be included in my thesis or dissertation and will supply copies of such upon request by my school. I acknowledge that RU ETD and my school will not distribute my thesis or dissertation or its abstract if, in their reasonable judgment, they believe all such rights have not been secured. I acknowledge that I retain ownership rights to the copyright of my work. I also retain the right to use all or part of this thesis or dissertation in future works, such as articles or books.

Technical

Format (TYPE = mime); (VERSION = )
application/x-tar
FileSize (UNIT = bytes)
638976
Checksum (METHOD = SHA1)
b036fab831c6e98f58884ed770955724e5026b28
ContentModel
ETD
CompressionScheme
other
OperatingSystem (VERSION = 5.1)
Windows XP