Research Article | Peer-Reviewed

Analysis of Climatic Factors and Utilization of Machine Learning Techniques to Anticipate Humidity Levels in Northern Bangladesh

Received: 30 January 2025     Accepted: 19 February 2025     Published: 5 March 2025
Abstract

Analyzing meteorological data from the northern region of Bangladesh is crucial for understanding the many processes that humidity influences. This study applies six machine learning algorithms (k-nearest neighbor, Classification and Regression Trees, C5.0, Naive Bayes, Random Forest, and Support Vector Machine) to forecast humidity in northern Bangladesh, using data from 1981 to 2020 recorded at two meteorological stations, Rangpur and Dinajpur. Of the two stations, Rangpur had the higher average daily humidity (80.34%) and Dinajpur the lower (77.26%). Cloud amount correlates positively with humidity and inversely with temperature. The k-nearest neighbor, random forest, and support vector machine algorithms generally predicted better than the other algorithms. Overall, the Random Forest model performed best on the testing dataset at both stations, achieving 70% accuracy, an F1-score of 75.85%, and a kappa value of approximately 53.3% at Rangpur Station, and 74% accuracy, an F1-score of 78.4%, and a kappa value of approximately 60% at Dinajpur Station. The study then assesses the performance and accuracy of the random forest classification algorithm for predicting humidity through k-fold cross-validation. These findings underscore the value of random forest for predicting humidity and are expected to aid decision-makers concerned with water demand management, ecological balance, and health quality in the northern region of Bangladesh.
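The three scores reported in the abstract (accuracy, F1-score, and kappa) and the k-fold splitting scheme can be illustrated with a minimal Python sketch. It assumes a two-class humidity label, which the abstract does not confirm, and the confusion-matrix counts are invented for illustration rather than taken from the study. The function names `binary_metrics` and `k_fold_indices` are hypothetical, and the contiguous, unshuffled folds are one possible choice, not necessarily the authors' procedure.

```python
def binary_metrics(tp, fp, fn, tn):
    """Return (accuracy, F1-score, Cohen's kappa) for a 2x2 confusion matrix."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total

    # F1 is the harmonic mean of precision and recall, which simplifies to
    # 2*TP / (2*TP + FP + FN).
    f1_denominator = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denominator if f1_denominator else 0.0

    # Kappa compares observed agreement (accuracy) with the agreement
    # expected by chance from the row and column marginals.
    p_positive = ((tp + fp) / total) * ((tp + fn) / total)
    p_negative = ((fn + tn) / total) * ((fp + tn) / total)
    expected = p_positive + p_negative
    kappa = (accuracy - expected) / (1 - expected) if expected != 1 else 0.0
    return accuracy, f1, kappa


def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous folds of near-equal size."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Illustrative counts only (not results from the study).
accuracy, f1, kappa = binary_metrics(tp=40, fp=10, fn=20, tn=30)
folds = list(k_fold_indices(n_samples=10, k=3))
```

In a cross-validation run, a classifier would be fitted on each `train` index list and scored on the matching `test` list with `binary_metrics`, and the k scores averaged.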

Published in American Journal of Data Mining and Knowledge Discovery (Volume 10, Issue 1)
DOI 10.11648/j.ajdmkd.20251001.11
Page(s) 1-19
Creative Commons

This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.

Copyright

Copyright © The Author(s), 2025. Published by Science Publishing Group

Keywords

Machine Learning, Cross Validation, Classification, Climate, Humidity, Bangladesh

Cite This Article
  • APA Style

    Akter, M. R., Rahman, M. H. (2025). Analysis of Climatic Factors and Utilization of Machine Learning Techniques to Anticipate Humidity Levels in Northern Bangladesh. American Journal of Data Mining and Knowledge Discovery, 10(1), 1-19. https://doi.org/10.11648/j.ajdmkd.20251001.11

  • ACS Style

    Akter, M. R.; Rahman, M. H. Analysis of Climatic Factors and Utilization of Machine Learning Techniques to Anticipate Humidity Levels in Northern Bangladesh. Am. J. Data Min. Knowl. Discov. 2025, 10(1), 1-19. doi: 10.11648/j.ajdmkd.20251001.11

  • AMA Style

    Akter MR, Rahman MH. Analysis of Climatic Factors and Utilization of Machine Learning Techniques to Anticipate Humidity Levels in Northern Bangladesh. Am J Data Min Knowl Discov. 2025;10(1):1-19. doi: 10.11648/j.ajdmkd.20251001.11

  • BibTeX

    @article{10.11648/j.ajdmkd.20251001.11,
      author = {Most. Rubina Akter and Md. Habibur Rahman},
      title = {Analysis of Climatic Factors and Utilization of Machine Learning Techniques to Anticipate Humidity Levels in Northern Bangladesh},
      journal = {American Journal of Data Mining and Knowledge Discovery},
      volume = {10},
      number = {1},
      pages = {1-19},
      doi = {10.11648/j.ajdmkd.20251001.11},
      url = {https://doi.org/10.11648/j.ajdmkd.20251001.11},
      eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ajdmkd.20251001.11},
     year = {2025}
    }
    

  • RIS

    TY  - JOUR
    T1  - Analysis of Climatic Factors and Utilization of Machine Learning Techniques to Anticipate Humidity Levels in Northern Bangladesh
    AU  - Most. Rubina Akter
    AU  - Md. Habibur Rahman
    Y1  - 2025/03/05
    PY  - 2025
    N1  - https://doi.org/10.11648/j.ajdmkd.20251001.11
    DO  - 10.11648/j.ajdmkd.20251001.11
    T2  - American Journal of Data Mining and Knowledge Discovery
    JF  - American Journal of Data Mining and Knowledge Discovery
    JO  - American Journal of Data Mining and Knowledge Discovery
    SP  - 1
    EP  - 19
    PB  - Science Publishing Group
    SN  - 2578-7837
    UR  - https://doi.org/10.11648/j.ajdmkd.20251001.11
    VL  - 10
    IS  - 1
    ER  - 
