From millions to one

Descriptive

TypeOfResource
Text
TitleInfo (ID = T-1)
Title
From millions to one
SubTitle
theoretical and concrete approaches to De Novo assembly using short read DNA sequences
Identifier
ETD_2907
Identifier (type = hdl)
http://hdl.rutgers.edu/1782.1/rucore10001600001.ETD.000056766
Language
LanguageTerm (authority = ISO639-2); (type = code)
eng
Genre (authority = marcgt)
theses
Subject (ID = SBJ-1); (authority = RUETD)
Topic
Computational Biology and Molecular Biophysics
Subject (ID = SBJ-2); (authority = ETD-LCSH)
Topic
DNA--Research--Technique
Abstract (type = abstract)
One of the most significant advances in biology has been the ability to sequence the DNA of organisms. Even in the shadow of the completion of the human genome, intractable regions of the genome remain incomplete. Next-generation high-throughput short read sequencing technologies are now available and can generate millions of short read DNA sequences per run. Although greater coverage depths are possible, de novo sequence assembly with these shorter sequences is significantly more complex than resequencing; handling them presents new computational problems and opportunities. Identifying repetitive regions, coping with sequencing errors, and manipulating millions of short reads simultaneously are some of the difficulties that must be overcome. Motivated by these complexities, and working with short read sequences from the Waksman SOLiD sequencing platform, this work explores the problem of de novo assembly. Initially, we develop tools for filtering short read sequence data based on quality scores and find that this procedure is critical for the success of the subsequent de novo assembly. Next, we analyze the key phenomena responsible for producing contigs that are much shorter than theoretical estimates predict. Finally, we explore two different routes to circumventing the difficulty imposed by short contigs. The first uses information from multiple orthologous genomes in a comparative assembly. In particular, we developed a pipeline that uses the reference genome of a close relative to improve genome assembly. The second approach uses paired read information to build scaffolds that are two orders of magnitude larger than the original contigs. For typical bacterial genomes, fewer than one hundred of these scaffolds are required to cover the entire genome. The combination of short reads from various platforms, assembly, and recovery pipelines brings mid-sized genomes close to completion. As a result, minimal additional work using conventional sequencing technologies is enough to close the remaining small gaps and return a finished single genome. Current advancements in sequencing technologies leave us hopeful that it will be possible to provide fairly complete assemblies for complex genomes via these technological approaches.
PhysicalDescription
Form (authority = gmd)
electronic resource
Extent
x, 112 p. : ill.
InternetMediaType
application/pdf
InternetMediaType
text/xml
Note (type = degree)
Ph.D.
Note (type = bibliography)
Includes bibliographical references
Note (type = vita)
Includes vita
Note (type = statement of responsibility)
by Ariella Syma Sasson
Name (ID = NAME-1); (type = personal)
NamePart (type = family)
Sasson
NamePart (type = given)
Ariella Syma
NamePart (type = date)
1978-
Role
RoleTerm (authority = RULIB)
author
DisplayForm
Ariella Sasson
Name (ID = NAME-2); (type = personal)
NamePart (type = family)
Sengupta
NamePart (type = given)
Anirvan
Role
RoleTerm (authority = RULIB)
chair
Affiliation
Advisory Committee
DisplayForm
Anirvan Sengupta
Name (ID = NAME-3); (type = personal)
NamePart (type = family)
Chen
NamePart (type = given)
Kevin
Role
RoleTerm (authority = RULIB)
internal member
Affiliation
Advisory Committee
DisplayForm
Kevin Chen
Name (ID = NAME-4); (type = personal)
NamePart (type = family)
Schliep
NamePart (type = given)
Alexander
Role
RoleTerm (authority = RULIB)
internal member
Affiliation
Advisory Committee
DisplayForm
Alexander Schliep
Name (ID = NAME-5); (type = personal)
NamePart (type = family)
Bhanot
NamePart (type = given)
Gyan
Role
RoleTerm (authority = RULIB)
internal member
Affiliation
Advisory Committee
DisplayForm
Gyan Bhanot
Name (ID = NAME-6); (type = personal)
NamePart (type = family)
Sidote
NamePart (type = given)
David
Role
RoleTerm (authority = RULIB)
outside member
Affiliation
Advisory Committee
DisplayForm
David Sidote
Name (ID = NAME-1); (type = corporate)
NamePart
Rutgers University
Role
RoleTerm (authority = RULIB)
degree grantor
Name (ID = NAME-2); (type = corporate)
NamePart
Graduate School - New Brunswick
Role
RoleTerm (authority = RULIB)
school
OriginInfo
DateCreated (qualifier = exact)
2010
DateOther (qualifier = exact); (type = degree)
2010-10
Place
PlaceTerm (type = code)
xx
RelatedItem (type = host)
TitleInfo
Title
Rutgers University Electronic Theses and Dissertations
Identifier (type = RULIB)
ETD
RelatedItem (type = host)
TitleInfo
Title
Graduate School - New Brunswick Electronic Theses and Dissertations
Identifier (type = local)
rucore19991600001
Location
PhysicalLocation (authority = marcorg); (displayLabel = Rutgers, The State University of New Jersey)
NjNbRU
Identifier (type = doi)
doi:10.7282/T3H70FJQ
Genre (authority = ExL-Esploro)
ETD doctoral

Rights

RightsDeclaration (AUTHORITY = GS); (ID = rulibRdec0006)
The author owns the copyright to this work.
Copyright
Status
Copyright protected
Availability
Status
Open
Reason
Permission or license
RightsHolder (ID = PRH-1); (type = personal)
Name
FamilyName
Sasson
GivenName
Ariella
Role
Copyright Holder
RightsEvent (ID = RE-1); (AUTHORITY = rulib)
Type
Permission or license
DateTime
2010-09-27 01:09:55
AssociatedEntity (ID = AE-1); (AUTHORITY = rulib)
Role
Copyright holder
Name
Ariella Sasson
Affiliation
Rutgers University. Graduate School - New Brunswick
AssociatedObject (ID = AO-1); (AUTHORITY = rulib)
Type
License
Name
Author Agreement License
Detail
I hereby grant to the Rutgers University Libraries and to my school the non-exclusive right to archive, reproduce and distribute my thesis or dissertation, in whole or in part, and/or my abstract, in whole or in part, in and from an electronic format, subject to the release date subsequently stipulated in this submittal form and approved by my school. I represent and stipulate that the thesis or dissertation and its abstract are my original work, that they do not infringe or violate any rights of others, and that I make these grants as the sole owner of the rights to my thesis or dissertation and its abstract. I represent that I have obtained written permissions, when necessary, from the owner(s) of each third party copyrighted matter to be included in my thesis or dissertation and will supply copies of such upon request by my school. I acknowledge that RU ETD and my school will not distribute my thesis or dissertation or its abstract if, in their reasonable judgment, they believe all such rights have not been secured. I acknowledge that I retain ownership rights to the copyright of my work. I also retain the right to use all or part of this thesis or dissertation in future works, such as articles or books.

Technical

ContentModel
ETD
MimeType (TYPE = file)
application/pdf
MimeType (TYPE = container)
application/x-tar
FileSize (UNIT = bytes)
6092800
Checksum (METHOD = SHA1)
bdcf1539cfb71950751250c28f619f568be958f4