A machine learning-driven decision support system for startup investments using entropy-based adaptive loss with sample weighting
Abstract
Purpose
This study presents a new startup investment decision support model (SIDSM) designed to mitigate the inherent challenges of class imbalance and uncertainty in startup investment decisions.
Design/methodology/approach
The proposed model uses financial and structural indicators sourced from the Crunchbase database and follows a multi-stage methodology. First, a systematic feature selection process integrating SHAP, Boruta and the elbow method retains the informative features. Next, uncertainty estimates are calculated at the feature and observation levels using the Shannon entropy and DeepGini metrics, and they are incorporated into XGBoost's learning process through a user-defined loss function with label-distribution-aware margin (LDAM) integration. The Cuckoo Search metaheuristic algorithm is used to optimize the hyperparameters for model robustness, and class-based threshold optimization is used to refine the decision boundaries.
Findings
The experimental findings show that SIDSM outperforms the baseline models, achieving a macro F1-score of 89.47% and more stable minority-class detection. This indicates its potential to support startup investment decisions in a reliable, transparent and evidence-driven manner under class imbalance.
Originality/value
This study proposes a novel, data-conscious and holistic approach to the methodological limitations of traditional classification approaches. By integrating feature explanatory power and observation-based uncertainties into the learning process, rather than relying solely on error-driven optimization, the model becomes sensitive to uncertainty, better captures difficult-to-learn and minority-class patterns, and provides a highly robust and explainable framework for startup investment decision-making.
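To make the idea of an entropy-based adaptive loss with sample weighting more concrete, the following is a minimal sketch and not the SIDSM implementation. It computes the Shannon entropy and DeepGini score of a binary prediction and uses them to scale the gradient and Hessian of the logistic loss, which is the pair of quantities a user-defined XGBoost objective has to return. The function names, the `alpha` parameter and the rule for combining the two scores into a weight are assumptions made for illustration; the abstract does not specify them, and the LDAM margin is omitted.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


def deep_gini(probs):
    """DeepGini score: one minus the sum of squared class probabilities."""
    return 1.0 - sum(p * p for p in probs)


def sigmoid(z):
    """Map a raw margin to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def uncertainty_weight(p, alpha=1.0):
    """Hypothetical weighting rule (an assumption, not taken from the paper).

    Both scores are scaled to [0, 1] for the two-class case, averaged,
    and added to a base weight of 1, so confident predictions keep a
    weight near 1 and maximally uncertain ones get 1 + alpha.
    """
    dist = (p, 1.0 - p)
    entropy = shannon_entropy(dist)   # already in [0, 1] for two classes
    gini = deep_gini(dist) / 0.5      # two-class maximum is 0.5
    return 1.0 + alpha * 0.5 * (entropy + gini)


def weighted_logistic_grad_hess(margins, labels, alpha=1.0):
    """Gradient and Hessian of an uncertainty-weighted logistic loss.

    The weight is treated as a constant with respect to the margin,
    which is a simplification: it is not differentiated through.
    """
    grads, hessians = [], []
    for z, y in zip(margins, labels):
        p = sigmoid(z)
        w = uncertainty_weight(p, alpha)
        grads.append(w * (p - y))
        hessians.append(w * p * (1.0 - p))
    return grads, hessians
```

Under these assumptions, an observation predicted at probability 0.5 receives twice the gradient contribution of a confidently predicted one, which is one simple way for uncertain observations to influence training more strongly. In practice the two returned lists would be converted to arrays inside a custom objective function passed to XGBoost.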