TY - JOUR
TI - Computational methods for the interpretation of forensic DNA samples
DO - https://doi.org/doi:10.7282/T3PK0J34
PY - 2015
AB - Interpretation of DNA profiles generated from STRs can be problematic because of dropout, allele overlap and artifacts like stutter. The goal of this research is to develop computational methods for the analysis of STR profiles that are robust to these phenomena and that utilize quantitative peak height information captured in profiles. These methods are expected to improve significantly on existing methods for analysis of STR profiles, particularly in cases of low amounts of template DNA or where there are many contributors. In the first part of our research, we characterized the distribution of signal, noise and stutter peak heights and studied their dependence on template DNA amount. For the second part of our project, we developed a method to identify the number of contributors to a DNA sample. Our method, NOCIt, calculates the a posteriori probability of the number of contributors to a forensic sample, taking into account signal peak heights, population allele frequencies, baseline noise, allele dropout and stutter. On the experimental samples tested, NOCIt had an accuracy of 83%, while the accuracy of the best pre-existing method was 72%. The accuracies of NOCIt and the best pre-existing method on the simulated profiles were 85% and 73%, respectively. We were able to reduce the running time of NOCIt by developing a faster method based on an importance sampling algorithm. In the third and final part of our research, we developed a computational tool (MatchIt) to directly compute a continuous Likelihood Ratio (LR) for a person of interest (POI), treating other contributors (if any) as interference. MatchIt also calculates the distribution of the LR along with the p-value, which is the probability that a randomly chosen individual results in an LR at least as large as the LR obtained from the POI.
We observed that the amount of template DNA from the contributor impacted the LR: small LRs resulted from contributors with low template masses. Moreover, p-values decreased as the LR increased. A p-value of 10^-9, the lowest possible in our testing, was achieved in all the cases where the LR was greater than 10^8.
KW - Computational and Integrative Biology
KW - Forensic genetics
KW - DNA--Analysis
LA - eng
ER -