This paper presents SeaProSel (Search-Prompt-Select), an intelligent vehicle fault diagnostics system. SeaProSel takes a casual description of a vehicle problem as input and searches for the diagnostic code that accurately matches that description. The system combines automatic text classification and machine learning with a prompt-and-select technique based on the vehicle diagnostic engineering structure, so that the diagnostic code is classified robustly. Machine learning algorithms automatically learn the words and terms, and their variations, commonly used in verbal descriptions of vehicle problems, and build a TCW (Term-Code-Weight) matrix that is used to measure the similarity between a document vector and a diagnostic code class vector. When a direct search using the TCW matrix finds no exactly matching diagnostic code, SeaProSel searches the vehicle fault diagnostic structure for appropriate questions to pose to the user in order to obtain more details about the problem. An LSI (Latent Semantic Indexing) model is also presented and analyzed, and the performance of the LSI and TCW models is presented and discussed. An in-depth study of different term weight functions and their performance is presented. All experiments are conducted on real-world vehicle diagnostic data, and the results show that the proposed SeaProSel system generates accurate results efficiently for vehicle fault diagnostics.
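The search step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the training texts and diagnostic codes are invented, the tf-idf-style weight is only one assumed choice among the term weight functions the paper studies, and the `margin` threshold that decides between a direct match and a follow-up prompt is a hypothetical parameter. The prompt step here simply returns the candidate codes; in the paper the follow-up questions come from the vehicle fault diagnostic structure.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy training data: (problem description, diagnostic code).
# Both the texts and the codes are invented for illustration only.
TRAINING = [
    ("engine stalls at idle", "ENG01"),
    ("engine hard to start when cold", "ENG02"),
    ("brake pedal feels soft", "BRK01"),
    ("brake noise grinding when stopping", "BRK02"),
]


def tokenize(text):
    """Lowercase and split on whitespace (no stemming or typo correction)."""
    return text.lower().split()


def build_tcw(training):
    """Build a term-code-weight map: tcw[code][term] = tf * idf over codes.

    The tf-idf-style weight is an assumption made for this sketch.
    """
    tf = defaultdict(Counter)
    for text, code in training:
        tf[code].update(tokenize(text))
    n_codes = len(tf)
    # Number of codes whose descriptions contain each term.
    code_freq = Counter(term for counts in tf.values() for term in counts)
    return {
        code: {t: c * math.log(1 + n_codes / code_freq[t]) for t, c in counts.items()}
        for code, counts in tf.items()
    }


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def search(description, tcw, margin=0.2):
    """Return ("match", [code]) when one code clearly wins,
    otherwise ("prompt", candidate_codes) to signal that more detail is needed.
    Assumes at least two codes in tcw."""
    doc = dict(Counter(tokenize(description)))
    ranked = sorted(((cosine(doc, vec), code) for code, vec in tcw.items()), reverse=True)
    best, runner_up = ranked[0], ranked[1]
    if best[0] > 0 and best[0] - runner_up[0] >= margin:
        return ("match", [best[1]])
    return ("prompt", [code for score, code in ranked if score > 0])


if __name__ == "__main__":
    tcw = build_tcw(TRAINING)
    print(search("engine stalls at idle", tcw))
    print(search("brake", tcw))
```

With this toy data, a specific description resolves to a single code, while a vague one such as "brake" falls through to the prompt branch with both brake-related codes as candidates, which is the situation the prompt-and-select stage is meant to handle.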
Published in | International Journal of Intelligent Information Systems (Volume 4, Issue 3)
DOI | 10.11648/j.ijiis.20150403.12
Page(s) | 58-70
Creative Commons | This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.
Copyright | Copyright © The Author(s), 2015. Published by Science Publishing Group
Keywords | Vehicle Fault Diagnostics, Text Data Mining, Machine Learning, Vehicle Diagnostic Engineering Structure, TCW, LSI
APA Style
Yi Lu Murphey, Liping Huang, Hao Xing Wang, Yinghao Huang. (2015). Vehicle Fault Diagnostics Using Text Mining, Vehicle Engineering Structure and Machine Learning. International Journal of Intelligent Information Systems, 4(3), 58-70. https://doi.org/10.11648/j.ijiis.20150403.12
@article{10.11648/j.ijiis.20150403.12,
  author = {Yi Lu Murphey and Liping Huang and Hao Xing Wang and Yinghao Huang},
  title = {Vehicle Fault Diagnostics Using Text Mining, Vehicle Engineering Structure and Machine Learning},
  journal = {International Journal of Intelligent Information Systems},
  volume = {4},
  number = {3},
  pages = {58-70},
  doi = {10.11648/j.ijiis.20150403.12},
  url = {https://doi.org/10.11648/j.ijiis.20150403.12},
  eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ijiis.20150403.12},
  year = {2015}
}
TY  - JOUR
T1  - Vehicle Fault Diagnostics Using Text Mining, Vehicle Engineering Structure and Machine Learning
AU  - Yi Lu Murphey
AU  - Liping Huang
AU  - Hao Xing Wang
AU  - Yinghao Huang
Y1  - 2015/07/09
PY  - 2015
N1  - https://doi.org/10.11648/j.ijiis.20150403.12
DO  - 10.11648/j.ijiis.20150403.12
T2  - International Journal of Intelligent Information Systems
JF  - International Journal of Intelligent Information Systems
JO  - International Journal of Intelligent Information Systems
SP  - 58
EP  - 70
PB  - Science Publishing Group
SN  - 2328-7683
UR  - https://doi.org/10.11648/j.ijiis.20150403.12
VL  - 4
IS  - 3
ER  -