TY - JOUR
TI - Insights Into evolution and adaptation using computational methods and next generation sequencing
DO - https://doi.org/doi:10.7282/T3M61NDC
PY - 2016
AB - Historically, much of the research in evolutionary biology and population genetics has involved analysis at the level of either a single locus or a small number of loci. However, next-generation sequencing technology has opened the floodgates, providing the sheer volume and quality of sequence data that researchers have long needed to address long-standing questions in their fields. Scientists are now, by and large, no longer hampered by technological hurdles to obtaining data, but instead face the problem of how best to use the vast amount of data that is accumulating at an ever-increasing rate. This is a good problem to have. The research described in this dissertation is an attempt to derive answers to questions in the fields of population genetics and evolutionary biology that, until recently, have been either intractable or, at best, extremely difficult to address. In the first chapter I provide an introduction and a brief historical look at the research efforts that have preceded my own. In the second chapter I describe how modern sequencing methods and computational analysis can be used to address evolutionary questions about the non-model organism Enallagma hageni, in order to 1) determine this organism's phylogenetic position within Arthropoda, 2) provide insight into the evolutionary history of the protein-encoding genes in the Enallagma transcriptome, and 3) give functional annotation to these expressed proteins. In the third chapter I examine how natural selection acts on the genome and derive a method that can accurately determine the evolutionary cause of nucleotide fixations, that is, whether they occurred through positive selection or through neutral processes.
I then apply the methodology to North American populations of Drosophila melanogaster, providing further evidence as to how adaptive evolution proceeds in a newly established population. This is an important question: although multiple approaches have been devised to determine the targets and modes of evolution in the genome, no definitive method has yet emerged that can determine both the location and type of a selective process, and as a result, the picture of how and where adaptive evolution proceeds in the genome has remained opaque. In the fourth chapter I examine how levels of natural selection within the genome have the potential to inhibit the ability to accurately learn population demographic history. Using a number of modern algorithms and extensive simulations, I first examine whether demographic histories that are learned under simple biological assumptions will yield accurate results when the actual data do not adhere to these assumptions. Further, I go on to examine more complicated models of demographic history, looking specifically at how positive selection biases inference, in which directions these biases occur, and at what levels of selection inference methods fail to be robust. Finally, I describe potential evolutionary scenarios where these inference methods may be more prone to fail, as well as methods that might mitigate the effects of positive selection, thus allowing for more accurate histories to be inferred. The work contained in this dissertation, at the broadest scale, is an effort to marry state-of-the-art techniques in statistics, computer science, and machine learning to the technological advances of next-generation sequencing; the potent combination of these technologies has provided a means with which to derive answers to multiple long-standing questions in population genetics and evolutionary biology.
KW - Computational Biology and Molecular Biophysics
KW - Evolution (Biology)--Mathematical models
KW - Genomes--Analysis
LA - eng
ER -