Water shortages are becoming increasingly severe and have a considerable negative effect on daily life. Building hydropower engineering projects is one way to alleviate the problem. It is therefore worthwhile to resolve the technical problems of hydropower engineering promptly, which helps people both make better use of water resources and eliminate various safety risks. To that end, this study predicts the technical problems that a hydropower engineering project is likely to encounter. To exploit the large amount of available data, data mining techniques are used to solve this multi-classification problem. First, the data are preprocessed; in particular, because text data are complex, text mining techniques are applied to transform the unstructured data into structured data. Then, eXtreme Gradient Boosting (XGBoost) is applied to perform the classification. To validate the effectiveness of the model, XGBoost is compared with Gradient Boosting Decision Tree, Random Forest, Decision Tree, k-Nearest Neighbor, and Bernoulli Naïve Bayes in terms of accuracy, precision, recall, and F-score. The experimental results show that XGBoost is better suited to this classification problem than the other methods. The study gives engineering inspectors helpful suggestions about the particular technical problems that need attention, enabling them to inspect projects more efficiently and effectively.
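The four evaluation measures named in the abstract can be illustrated with a small sketch. The abstract does not say how per-class scores are combined for the multi-class case, so macro averaging (an unweighted mean over classes) is an assumption here, and the function name `macro_scores` and the problem labels in the usage lines are hypothetical, not taken from the study's data.

```python
from collections import Counter


def macro_scores(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall and F-score.

    Macro averaging is an assumption; the abstract does not state
    which averaging scheme the study used.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class was wrong
            fn[truth] += 1   # true class was missed

    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, fscores = [], [], []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        fscore = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        fscores.append(fscore)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f_score": sum(fscores) / n,
    }


# Hypothetical problem categories, for illustration only.
y_true = ["leakage", "crack", "crack", "settlement", "leakage", "crack"]
y_pred = ["leakage", "crack", "leakage", "settlement", "leakage", "settlement"]
print(macro_scores(y_true, y_pred))
```

Computing the same dictionary for each candidate classifier on a shared held-out set gives a like-for-like comparison of the kind the abstract describes.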
Published in | Science Journal of Applied Mathematics and Statistics (Volume 6, Issue 4) |
DOI | 10.11648/j.sjams.20180604.13 |
Page(s) | 124-129 |
Creative Commons | This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited. |
Copyright | Copyright © The Author(s), 2018. Published by Science Publishing Group |
Data Mining, Hydropower Engineering, Multi-classification Problem, eXtreme Gradient Boosting
APA Style
Jing Zhu, Yi Chen, Liming Huang, Chunyong She, Yangfeng Wu, et al. (2018). Predicting Technical Problems of Hydropower Engineering Using eXtreme Gradient Boosting. Science Journal of Applied Mathematics and Statistics, 6(4), 124-129. https://doi.org/10.11648/j.sjams.20180604.13
@article{10.11648/j.sjams.20180604.13,
  author = {Jing Zhu and Yi Chen and Liming Huang and Chunyong She and Yangfeng Wu and Wenyu Zhang},
  title = {Predicting Technical Problems of Hydropower Engineering Using eXtreme Gradient Boosting},
  journal = {Science Journal of Applied Mathematics and Statistics},
  volume = {6},
  number = {4},
  pages = {124-129},
  doi = {10.11648/j.sjams.20180604.13},
  url = {https://doi.org/10.11648/j.sjams.20180604.13},
  eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.sjams.20180604.13},
  year = {2018}
}
TY - JOUR
T1 - Predicting Technical Problems of Hydropower Engineering Using eXtreme Gradient Boosting
AU - Jing Zhu
AU - Yi Chen
AU - Liming Huang
AU - Chunyong She
AU - Yangfeng Wu
AU - Wenyu Zhang
Y1 - 2018/10/18
PY - 2018
N1 - https://doi.org/10.11648/j.sjams.20180604.13
DO - 10.11648/j.sjams.20180604.13
T2 - Science Journal of Applied Mathematics and Statistics
JF - Science Journal of Applied Mathematics and Statistics
JO - Science Journal of Applied Mathematics and Statistics
SP - 124
EP - 129
PB - Science Publishing Group
SN - 2376-9513
UR - https://doi.org/10.11648/j.sjams.20180604.13
VL - 6
IS - 4
ER -