TY - JOUR
TI - Essays on accounting data differences and audit learning
DO - https://doi.org/doi:10.7282/T3H70D21
PY - 2014
AB - The dissertation comprises three essays that 1) compare accounting numbers in Capital IQ's Compustat North America Fundamentals Annual, the most popular accounting database in accounting research, to the original numbers in corporate reports, 2) study the effects of Compustat's data standardization procedures on accounting-based bankruptcy prediction models, and 3) develop a framework to enhance the performance of analytical learning models in a multi-period auditing setting. In the first essay, we conduct the first large-scale comparison of Compustat and 10-K data. Specifically, we compare 30 accounting line items of approximately 5,000 companies for the period from October 1, 2011, to September 30, 2012. We find that the values reported in Compustat differ significantly from the values reported in 10-K filings. We also find that the number and magnitude of the alterations Compustat introduces to the original data depend on the type of accounting item and on company characteristics such as industry and size. Numbers that appear in Compustat are standardized, that is, adjusted to fit fixed variable definitions, to ensure "...consistent and comparable data across companies, industries and business cycles..." However, there has been no evidence in the academic literature that Compustat's standardized numbers provide more benefits than the original numbers in financial statements. In the second essay, we examine the effects of Compustat's data standardization using Altman's 1968 and Ohlson's 1980 bankruptcy prediction models as examples.
We find that Compustat's data standardization not only yields no improvement for bankruptcy prediction models, but also has a significant negative impact on the predictive accuracy of Altman's model (up to 8.56%). There are several challenges in applying analytical models to the auditing problem of identifying irregular transactions. We argue that, because of these challenges, standard statistical models may not be well suited to auditing and have to be modified to achieve better performance. In the third essay, we propose a framework to boost the performance of analytical learning models in auditing. Testing the framework on real data shows a significant increase in the performance of the tested models.
KW - Management
KW - Compustat information retrieval programs
KW - Accounting
KW - Auditing--Data processing
LA - eng
ER -