Analysis of Climatic Factors and Utilization of Machine Learning Techniques to Anticipate Humidity Levels in Northern Bangladesh

Most. Rubina Akter; Md. Habibur Rahman

doi:10.11648/j.ajdmkd.20251001.11

Research Article | Peer-Reviewed

Published in American Journal of Data Mining and Knowledge Discovery (Volume 10, Issue 1)

Received: 30 January 2025 Accepted: 19 February 2025 Published: 5 March 2025

Abstract

Analyzing meteorological data in the northern region of Bangladesh is crucial for understanding the many processes that humidity influences. This study employs six machine learning algorithms (k-nearest neighbor, Classification and Regression Trees, C5.0, Naive Bayes, Random Forest, and Support Vector Machine) to forecast humidity in northern Bangladesh. Data from 1981 to 2020 from two meteorological stations, Rangpur and Dinajpur, were used. Of the two stations, Rangpur had the higher average daily humidity (80.34%) and Dinajpur the lower (77.26%). Cloud amount correlates positively with humidity and inversely with temperature. The k-nearest neighbor, Random Forest, and Support Vector Machine algorithms generally predicted better than the other algorithms. Overall, the Random Forest model performed best on the testing dataset at both stations, achieving 70% accuracy, an F₁-score of 75.85%, and a kappa value of approximately 53.3% at Rangpur Station, and 74% accuracy, an F₁-score of 78.4%, and a kappa value of approximately 60% at Dinajpur Station. The study then assesses the performance and accuracy of the Random Forest classifier through k-fold cross-validation. These findings underscore the value of Random Forest for predicting humidity and for aiding decision-makers in water demand management, ecological balance, and health quality in the northern region of Bangladesh.
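
The abstract reports accuracy, F₁-score, and kappa for each classifier. As a minimal sketch of how these three metrics relate, the Python below computes them from a confusion matrix. The 3×3 matrix, its class labels (low, medium, high humidity), and the macro-averaged F₁ are illustrative assumptions: the abstract does not state the number of classes, the averaging scheme, or the software the authors used, so the values here are toy numbers and not the study's results.

```python
from typing import List

# cm[i][j] = number of samples whose true class is i and predicted class is j.
Matrix = List[List[int]]


def accuracy(cm: Matrix) -> float:
    """Share of samples on the diagonal (correct predictions)."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def cohen_kappa(cm: Matrix) -> float:
    """Agreement beyond chance: (p_o - p_e) / (1 - p_e)."""
    k = len(cm)
    n = sum(sum(row) for row in cm)
    row_totals = [sum(cm[i]) for i in range(k)]
    col_totals = [sum(cm[i][j] for i in range(k)) for j in range(k)]
    p_o = accuracy(cm)  # observed agreement
    # Agreement expected if predictions were independent of the true labels.
    p_e = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    return (p_o - p_e) / (1 - p_e)


def macro_f1(cm: Matrix) -> float:
    """Unweighted mean of the per-class F1 scores (an assumed averaging scheme)."""
    k = len(cm)
    scores = []
    for c in range(k):
        tp = cm[c][c]
        predicted = sum(cm[i][c] for i in range(k))  # column total
        actual = sum(cm[c])                          # row total
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / k


# Hypothetical confusion matrix for three humidity classes (low, medium, high).
toy_cm = [
    [30, 8, 2],
    [6, 25, 9],
    [4, 7, 29],
]

print(f"accuracy={accuracy(toy_cm):.3f}")
print(f"kappa={cohen_kappa(toy_cm):.3f}")
print(f"macro_f1={macro_f1(toy_cm):.3f}")
```

For this toy matrix the sketch gives an accuracy of 0.70, a kappa of 0.55, and a macro F₁ of 0.70. Kappa sits below accuracy because it discounts the agreement expected by chance, which is 1/3 for three balanced classes.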

Published in	American Journal of Data Mining and Knowledge Discovery (Volume 10, Issue 1)
DOI	10.11648/j.ajdmkd.20251001.11
Page(s)	1-19
Creative Commons	This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.
Copyright	Copyright © The Author(s), 2025. Published by Science Publishing Group

Keywords

Machine Learning, Cross Validation, Classification, Climate, Humidity, Bangladesh

Cite This Article

APA Style

Akter, M. R., Rahman, M. H. (2025). Analysis of Climatic Factors and Utilization of Machine Learning Techniques to Anticipate Humidity Levels in Northern Bangladesh. American Journal of Data Mining and Knowledge Discovery, 10(1), 1-19. https://doi.org/10.11648/j.ajdmkd.20251001.11

ACS Style

Akter, M. R.; Rahman, M. H. Analysis of Climatic Factors and Utilization of Machine Learning Techniques to Anticipate Humidity Levels in Northern Bangladesh. Am. J. Data Min. Knowl. Discov. 2025, 10(1), 1-19. doi: 10.11648/j.ajdmkd.20251001.11

AMA Style

Akter MR, Rahman MH. Analysis of Climatic Factors and Utilization of Machine Learning Techniques to Anticipate Humidity Levels in Northern Bangladesh. Am J Data Min Knowl Discov. 2025;10(1):1-19. doi: 10.11648/j.ajdmkd.20251001.11

@article{10.11648/j.ajdmkd.20251001.11,
  author = {Most. Rubina Akter and Md. Habibur Rahman},
  title = {Analysis of Climatic Factors and Utilization of Machine Learning Techniques to Anticipate Humidity Levels in Northern Bangladesh},
  journal = {American Journal of Data Mining and Knowledge Discovery},
  volume = {10},
  number = {1},
  pages = {1-19},
  doi = {10.11648/j.ajdmkd.20251001.11},
  url = {https://doi.org/10.11648/j.ajdmkd.20251001.11},
  eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ajdmkd.20251001.11},
  year = {2025}
}

TY  - JOUR
T1  - Analysis of Climatic Factors and Utilization of Machine Learning Techniques to Anticipate Humidity Levels in Northern Bangladesh
AU  - Most. Rubina Akter
AU  - Md. Habibur Rahman
Y1  - 2025/03/05
PY  - 2025
N1  - https://doi.org/10.11648/j.ajdmkd.20251001.11
DO  - 10.11648/j.ajdmkd.20251001.11
T2  - American Journal of Data Mining and Knowledge Discovery
JF  - American Journal of Data Mining and Knowledge Discovery
JO  - American Journal of Data Mining and Knowledge Discovery
SP  - 1
EP  - 19
PB  - Science Publishing Group
SN  - 2578-7837
UR  - https://doi.org/10.11648/j.ajdmkd.20251001.11
VL  - 10
IS  - 1
ER  -

Author Information

Most. Rubina Akter

Department of Statistics and Data Science, Jahangirnagar University, Dhaka, Bangladesh

http://orcid.org/0009-0006-5151-4388

Md. Habibur Rahman

Department of Statistics and Data Science, Jahangirnagar University, Dhaka, Bangladesh

http://orcid.org/0000-0002-3972-3711
