Ramesh, Prathiksha. Prediction of cost overruns using ensemble methods in data mining and text mining algorithms. Retrieved from https://doi.org/doi:10.7282/T3WW7FR1
Description: In competitive bidding in the United States, the lowest bid is more often than not selected to perform the project. However, the lowest bidder tends to undervalue costs in order to win the bid and, as a result, may incur significant cost increases during the construction life cycle due to change orders. For project owners to accurately estimate the actual project cost and to identify the bid that is closest to it, there is an urgent need for new decision aids that analyze bidding patterns. The goal of this research has been to select the predictive features in a bid package that help minimize cost overruns, using open source data mining software. The features were selected through correlation and regression analysis, by studying the p-values and R-squared values. The data set was then prepared with only the features that affected the output, which in this case was the cost overrun. The output is divided into four classes depending on the percentage of overrun. The learning algorithms used for prediction were neural networks, support vector machines, and decision trees, along with ensemble methods. The empirical study of the prediction models suggests an efficiency of up to 50% in predicting whether a project will have cost overruns and the approximate range of the percentage overrun.
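As a rough illustration of the data preparation described above, the following minimal sketch computes a percentage overrun, assigns it to one of four classes, and screens features by their correlation with the target. It is not the author's implementation. The class boundaries in `ASSUMED_BOUNDS`, the `min_abs_r` threshold, and all function names are hypothetical, since the abstract states neither the cut-offs nor the selection criteria in detail. The screening step uses only the Pearson correlation coefficient, a simplification of the p-value and R-squared analysis the abstract mentions.

```python
from math import sqrt

# Hypothetical class boundaries in percent overrun.
# The abstract says there are four classes but does not give the cut-offs.
ASSUMED_BOUNDS = (0.0, 5.0, 15.0)


def overrun_percent(bid_cost, final_cost):
    """Percentage by which the final cost exceeds the winning bid."""
    return 100.0 * (final_cost - bid_cost) / bid_cost


def overrun_class(pct, bounds=ASSUMED_BOUNDS):
    """Map a percentage overrun to a class index 0..3.

    The class index is the number of boundaries that pct strictly exceeds,
    so class 0 means pct <= bounds[0] and class 3 means pct > bounds[-1].
    """
    return sum(pct > b for b in bounds)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / (sd_x * sd_y)


def screen_features(columns, target, min_abs_r=0.3):
    """Keep the names of features whose |r| with the target is at least min_abs_r.

    columns maps a feature name to its list of values; min_abs_r is an
    assumed threshold chosen for illustration only.
    """
    return [
        name
        for name, values in columns.items()
        if abs(pearson_r(values, target)) >= min_abs_r
    ]
```

The retained features and class labels could then be passed to any classifier, such as the neural networks, support vector machines, decision trees, or ensemble methods named in the abstract.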