TY - JOUR
TI - The rectangular maximum agreement problem: applications and parallel solution
DO - https://doi.org/doi:10.7282/t3-g7x8-he22
PY - 2018
AB - A NP-hard rectangular maximum agreement (RMA) problem finds a “box” that best discriminates between two weighted datasets. We respectively describe a specialized parallel branch-and-bound method and a greedy heuristic to solve RMA exactly or approximately. Our computational results show that a new parallel branch-and-bound method can solve larger RMA problems exactly in less time than a previously implemented parallel branch-and-bound procedure and Gurobi, a mixed integer programming (MIP) solver.
We describe two applications of RMA: LPBoost-based two-class classification and a rule-enhanced penalized regression. The first classification application constructs a stronger classifier from a set of weighted voting classifiers by maximizing the margin between the two observation classes
and penalizing classification errors. The weighted voting classifiers are multidimensional ``box''-based rules. The second regression application formulates a L1-penalized regression model using multidimensional “box”-based rules as additional explanatory variables.
Due to exponentially large number of possible multidimensional rules, they are dynamically generated in both applications. In contrast to prior approaches to solve these problems, we draw heavily on
standard (but non-polynomial-time) mathematical programming techniques, enhanced by parallel computing. Our rule-adding procedure is based on the classical column generation method for high-dimensional linear programming. The pricing problem for our column generation procedure reduces to the RMA problem, and it is solved exactly or approximately. This method resembles boosting in machine learning. Furthermore, we propose a discretization method before solving RMA. It reduces the level of difficulty for RMA while still maintaining prediction or classification accuracy in the applications. The prediction accuracy of our models are tested by cross-validation. The resulting classification and regression methods are computation-intensive, but our computational tests suggest that they outperform prior methods at making accurate and stable predictions.
KW - Branch and bound algorithms
KW - Operations Research
LA - English
ER -