JP Journal of Biostatistics

The JP Journal of Biostatistics is a highly regarded open-access international journal indexed in the Emerging Sources Citation Index (ESCI). It focuses on the application of statistical theory and methods in resolving problems in biological, biomedical, and agricultural sciences. The journal encourages the submission of experimental papers that employ relevant algorithms and also welcomes survey articles in the fields of biostatistics and epidemiology.

HYBRID TREE-ENSEMBLE MODELS INTEGRATING EXTREME VALUE THEORY FOR OBESITY ANALYSIS IN SAUDI ARABIA

Authors

  • Wadha Mohammed T. Alanazi
  • Mohd Aftar Abu Bakar

Keywords:

obesity, SHAP explainability, hybrid tree-ensemble models, extreme value theory, machine learning

DOI:

https://doi.org/10.17654/0973514325026

Abstract

Obesity represents a significant public health crisis in Saudi Arabia, exacerbated by rapid urbanization, dietary shifts, and sedentary lifestyles. Traditional statistical methods often fail to capture the complex, non-linear relationships among obesity determinants. This study proposes a novel machine learning framework integrating hybrid tree-ensemble models, namely Random Forest (RF) and Gradient Boosting Machine (GBM), with SHapley Additive exPlanations (SHAP) for interpretability and Extreme Value Theory (EVT) for outlier analysis. Using a nationally representative dataset, we trained and validated models to identify key predictors of obesity (BMI ≥ 30) and assess extreme-risk cases. The EVT-augmented hybrid model achieved superior performance (accuracy: 86.2%, MSE: 0.20) compared to baseline RF (82.5%) and GBM (84.3%) models. SHAP analysis revealed BMI, age, and physical inactivity as dominant predictors, while EVT quantified tail risks in severe obesity through the shape parameter. Our approach demonstrates that machine learning, combined with interpretability techniques, can effectively disentangle multifactorial obesity drivers and support targeted interventions. These findings provide a methodological advancement in obesity analytics and offer evidence-based insights for public health policy in Saudi Arabia.
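
The EVT step mentioned in the abstract can be illustrated with a minimal peaks-over-threshold sketch. This is not the authors' implementation: the abstract does not state the threshold, the estimator, or the data used, so everything below is an assumption made for illustration. The data are synthetic, the threshold of 40 is a hypothetical cut-off, the shape and scale are estimated by the method of moments (a simple stand-in for whatever fitting procedure the paper uses), and the function names `sample_gpd`, `fit_gpd_moments`, and `tail_probability` are invented here.

```python
import random
import statistics


def sample_gpd(n, shape, scale, rng):
    """Draw n generalized Pareto excesses by inverse-transform sampling (shape != 0)."""
    # 1 - rng.random() lies in (0, 1], which avoids raising 0.0 to a negative power.
    return [scale / shape * ((1.0 - rng.random()) ** (-shape) - 1.0) for _ in range(n)]


def fit_gpd_moments(values, threshold):
    """Method-of-moments GPD fit to the excesses of `values` over `threshold`.

    Uses mean = scale / (1 - shape) and variance = mean^2 / (1 - 2 * shape),
    so the estimates are only meaningful when the true shape is below 0.5.
    Returns (shape, scale, number_of_excesses).
    """
    excesses = [v - threshold for v in values if v > threshold]
    mean = statistics.fmean(excesses)
    var = statistics.variance(excesses)
    ratio = mean * mean / var
    shape = 0.5 * (1.0 - ratio)
    scale = 0.5 * mean * (ratio + 1.0)
    return shape, scale, len(excesses)


def tail_probability(x, threshold, shape, scale, exceed_rate):
    """Estimated P(X > x) for x above the threshold, using the fitted GPD tail."""
    return exceed_rate * (1.0 + shape * (x - threshold) / scale) ** (-1.0 / shape)


# Synthetic data only: these are NOT values from the study.
rng = random.Random(0)
THRESHOLD = 40.0                    # hypothetical cut-off, chosen for illustration
TRUE_SHAPE, TRUE_SCALE = 0.1, 3.0   # assumed tail parameters used to simulate

bulk = [rng.uniform(18.0, 40.0) for _ in range(80_000)]
tail = [THRESHOLD + y for y in sample_gpd(20_000, TRUE_SHAPE, TRUE_SCALE, rng)]
synthetic_bmi = bulk + tail

shape_hat, scale_hat, n_excesses = fit_gpd_moments(synthetic_bmi, THRESHOLD)
exceed_rate = n_excesses / len(synthetic_bmi)
p_above_50 = tail_probability(50.0, THRESHOLD, shape_hat, scale_hat, exceed_rate)

print(f"estimated shape: {shape_hat:.3f}, estimated scale: {scale_hat:.3f}")
print(f"estimated P(BMI > 50): {p_above_50:.4f}")
```

A positive estimated shape indicates a heavy tail, meaning very large exceedances are more likely than an exponential tail would suggest. In practice a maximum likelihood fit with a threshold-selection diagnostic is a common alternative to the method of moments shown here.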

Received: May 15, 2025
Revised: June 20, 2025
Accepted: July 19, 2025

Published

2025-08-27

Section

Articles

How to Cite

HYBRID TREE-ENSEMBLE MODELS INTEGRATING EXTREME VALUE THEORY FOR OBESITY ANALYSIS IN SAUDI ARABIA. (2025). JP Journal of Biostatistics, 25(3), 485-504. https://doi.org/10.17654/0973514325026
