Research Article | Peer-Reviewed

A Comparative Study on the Translation Quality of Chinese Diplomatic Discourse by NMT and LLMs Based on Multidimensional Quality Metrics

Received: 26 September 2025     Accepted: 10 October 2025     Published: 27 October 2025
Abstract

Chinese diplomatic discourse plays a crucial role in articulating China’s position and enhancing its influence in global forums. However, machine translation (MT) often struggles with culturally nuanced and abstract expressions, which motivates a comparison of current MT tools. This study assesses and compares the translation quality of Neural Machine Translation (NMT) systems and Large Language Models (LLMs) on Chinese diplomatic texts, focusing on the 2025 China-US tariff statements by China’s Foreign Ministry Spokesperson Lin Jian, with China Daily’s official English versions serving as references. Four NMT tools (Niutrans, Youdao, Google, DeepL) and four LLMs (DeepSeek, Ernie-4.5, ChatGPT-4.0, Gemini) were examined. Using the Multidimensional Quality Metrics (MQM) framework, the study evaluated the translations, with particular attention to phrases such as “奉陪到底” (fight to the end) and “得道多助,失道寡助” (a just cause enjoys abundant support while an unjust one finds little). Results show that LLMs outperform NMT systems: two of the four LLMs (50%; DeepSeek and Ernie-4.5) accurately translated both phrases, whereas only one of the four NMT tools (25%; Google) did so for “奉陪到底,” and none did for “得道多助,失道寡助.” Both types of system exhibited undertranslation, omission, and a lack of diplomatic formality. The findings suggest that LLMs have greater potential to handle cultural nuances and abstract content in diplomatic texts, and they offer insights for improving domain-specific MT training and for balancing accuracy and acceptability in conveying Chinese diplomatic messages.
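
The percentages in the abstract follow from simple counts over four systems per group. The sketch below reproduces that arithmetic as an illustration only. The variable and function names are hypothetical, and treating every system not named as accurate in the abstract as "not reported accurate" is an assumption of this sketch, not a result taken from the study.

```python
# Systems named in the abstract, grouped by type.
NMT_SYSTEMS = ["Niutrans", "Youdao", "Google", "DeepL"]
LLM_SYSTEMS = ["DeepSeek", "Ernie-4.5", "ChatGPT-4.0", "Gemini"]

# Systems the abstract reports as accurate. Any system not listed here is
# treated as "not reported accurate" (an assumption of this sketch).
LLM_ACCURATE_ON_BOTH = {"DeepSeek", "Ernie-4.5"}  # both phrases
NMT_ACCURATE_ON_FIRST = {"Google"}                # 奉陪到底
NMT_ACCURATE_ON_SECOND = set()                    # 得道多助,失道寡助


def share(accurate, systems):
    """Return the percentage of `systems` that appear in `accurate`."""
    hits = sum(1 for name in systems if name in accurate)
    return 100.0 * hits / len(systems)


llm_both = share(LLM_ACCURATE_ON_BOTH, LLM_SYSTEMS)
nmt_first = share(NMT_ACCURATE_ON_FIRST, NMT_SYSTEMS)
nmt_second = share(NMT_ACCURATE_ON_SECOND, NMT_SYSTEMS)

print(f"LLMs accurate on both phrases: {llm_both:.0f}%")
print(f"NMT tools accurate on the first phrase: {nmt_first:.0f}%")
print(f"NMT tools accurate on the second phrase: {nmt_second:.0f}%")
```

With only four systems per group, each system moves a share by 25 percentage points, so these figures are best read as counts (2 of 4, 1 of 4, 0 of 4) rather than as stable rates.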

Published in International Journal of Applied Linguistics and Translation (Volume 11, Issue 4)
DOI 10.11648/j.ijalt.20251104.12
Page(s) 107-115
Creative Commons

This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.

Copyright

Copyright © The Author(s), 2025. Published by Science Publishing Group

Keywords

Neural Machine Translation (NMT), Large Language Models (LLMs), Translation of Chinese Diplomatic Discourse, Translation Quality Assessment

Cite This Article
  • APA Style

    Lu, D. (2025). A Comparative Study on the Translation Quality of Chinese Diplomatic Discourse by NMT and LLMs Based on Multidimensional Quality Metrics. International Journal of Applied Linguistics and Translation, 11(4), 107-115. https://doi.org/10.11648/j.ijalt.20251104.12

    ACS Style

    Lu, D. A Comparative Study on the Translation Quality of Chinese Diplomatic Discourse by NMT and LLMs Based on Multidimensional Quality Metrics. Int. J. Appl. Linguist. Transl. 2025, 11(4), 107-115. doi: 10.11648/j.ijalt.20251104.12

    AMA Style

    Lu D. A Comparative Study on the Translation Quality of Chinese Diplomatic Discourse by NMT and LLMs Based on Multidimensional Quality Metrics. Int J Appl Linguist Transl. 2025;11(4):107-115. doi: 10.11648/j.ijalt.20251104.12

  • @article{10.11648/j.ijalt.20251104.12,
      author = {Dong Lu},
      title = {A Comparative Study on the Translation Quality of Chinese Diplomatic Discourse by NMT and LLMs Based on Multidimensional Quality Metrics},
      journal = {International Journal of Applied Linguistics and Translation},
      volume = {11},
      number = {4},
      pages = {107-115},
      doi = {10.11648/j.ijalt.20251104.12},
      url = {https://doi.org/10.11648/j.ijalt.20251104.12},
      eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ijalt.20251104.12},
     year = {2025}
    }

  • TY  - JOUR
    T1  - A Comparative Study on the Translation Quality of Chinese Diplomatic Discourse by NMT and LLMs Based on Multidimensional Quality Metrics
    AU  - Dong Lu
    Y1  - 2025/10/27
    PY  - 2025
    N1  - https://doi.org/10.11648/j.ijalt.20251104.12
    DO  - 10.11648/j.ijalt.20251104.12
    T2  - International Journal of Applied Linguistics and Translation
    JF  - International Journal of Applied Linguistics and Translation
    JO  - International Journal of Applied Linguistics and Translation
    SP  - 107
    EP  - 115
    PB  - Science Publishing Group
    SN  - 2472-1271
    UR  - https://doi.org/10.11648/j.ijalt.20251104.12
    VL  - 11
    IS  - 4
    ER  - 
