Pai, Dinesh R. Determining the efficacy of mathematical programming approaches for multi-group classification. Retrieved from https://doi.org/doi:10.7282/T3PR7W60
Description
Managers have long grappled with the problem of extracting patterns from the vast databases generated by their systems. The advent of powerful information systems in organizations, and the consequent accumulation of vast pools of data since the mid-1980s, has created renewed interest in the usefulness of discriminant analysis (DA). Expert systems have come to the aid of managers in their day-to-day decision making, with many successful applications in financial planning, sales management, and other areas of business operations (Erenguc and Koehler 1990).
Currently, no comprehensive research study exists that tests the robustness of multi-group classification analysis. Our research aims to bridge the gaps in existing work and goes a step further by extending the study to four-group classification problems. The main purpose of this research is to determine the efficacy of mathematical programming classification models, specifically linear programming (LP) methods, for four-group classification problems. These are compared against statistical approaches such as discriminant analysis (Mahalanobis) and logistic regression, an artificial intelligence (AI) technique such as a neural network, and a non-parametric technique such as k-nearest neighbor (k-NN). This research also proposes an integrated (hybrid) model that combines a non-parametric classification technique and an LP approach to enhance overall classification performance. Furthermore, the study extends an existing two-group LP model (Bal et al. 2006), based on the work of Lam and Moy (1996b), and applies it to four-group classification problems. These models are tested through robust computational experiments under varying data conditions using a financial product example. The characteristics of a real dataset are used to simulate, via the Monte Carlo method, multiple sample runs for four-group classification problems with three continuous independent variables.
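As a rough illustration of the experimental setup described above, the following minimal sketch simulates repeated four-group samples with three continuous variables and averages the misclassification rate of a plain k-NN classifier over the runs. It is not the study's actual design. The group centroids in `MEANS`, the independent normal data model, the sample sizes, and all function names are hypothetical choices made for illustration. Only the k-NN baseline is shown; the LP and integrated models are not reproduced here.

```python
import math
import random

def simulate_groups(n_per_group, means, sd, rng):
    """Draw n_per_group points per group from independent normals (an assumed data model)."""
    data = []
    for label, mean in enumerate(means):
        for _ in range(n_per_group):
            point = tuple(rng.gauss(m, sd) for m in mean)
            data.append((point, label))
    return data

def knn_predict(train, point, k):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], point))[:k]
    votes = {}
    for _, label in nearest:
        votes[label] = votes.get(label, 0) + 1
    # Ties are broken in favor of the smallest label, for determinism.
    return min(votes, key=lambda lab: (-votes[lab], lab))

def misclassification_rate(train, test, k):
    """Fraction of test points whose predicted group differs from the true group."""
    errors = sum(1 for point, label in test if knn_predict(train, point, k) != label)
    return errors / len(test)

def monte_carlo(runs, n_train, n_test, means, sd, k, seed=0):
    """Average the misclassification rate over independently simulated runs."""
    rng = random.Random(seed)
    rates = []
    for _ in range(runs):
        train = simulate_groups(n_train, means, sd, rng)
        test = simulate_groups(n_test, means, sd, rng)
        rates.append(misclassification_rate(train, test, k))
    return sum(rates) / len(rates)

# Hypothetical centroids: four groups, three continuous variables.
MEANS = [(0, 0, 0), (3, 0, 0), (0, 3, 0), (0, 0, 3)]
avg_rate = monte_carlo(runs=5, n_train=30, n_test=30, means=MEANS, sd=1.0, k=5)
print(f"average misclassification rate: {avg_rate:.3f}")
```

Varying `sd`, the centroid spacing, or the per-group sample sizes is one simple way to mimic the "varying data conditions" mentioned above; any other classifier could be dropped in place of `knn_predict` for comparison on the same simulated runs.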
The experimental results show that LP approaches in general, and the proposed integrated method in particular, consistently have lower misclassification rates for most data characteristics. Furthermore, the integrated method utilizes the strengths of both of its component methods, k-NN and linear programming, thereby considerably improving classification accuracy.