Assessing DeepSeek R1 and ChatGPT 4.5 in Arabic-English literary translation: performance, challenges, and implications

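Artificial Intelligence (AI) tools such as DeepSeek R1 and ChatGPT 4.5 have emerged as promising aids in Arabic-English literary translation. This study aims to compare the translation performance of these two systems using a mixed-methods approach. Quantitative analysis was conducted through five established evaluation metrics, BLEU, COMET, METEOR, BERTScore, and sacreBLEU, to assess accuracy, fluency, and semantic coherence. Complementing these measures, a qualitative evaluation was carried out with 80 undergraduate students from Cadi Ayyad University, who critically assessed anonymized AI-generated translations against their versions and published human translations using a structured rubric. Results indicate that DeepSeek R1 achieved consistently higher automated metric scores across literary genres (novels, plays, poems). However, qualitative analysis highlighted persistent challenges in pragmatic coherence, cohesion, and emotional depth, especially in poetry and dramatic texts. While DeepSeek R1 demonstrated potential in lexical accuracy and fluency, significant human intervention remains essential for achieving high-quality literary translations. Future research should integrate larger-scale human evaluations to comprehensively assess the capabilities and limitations of AI translation tools in diverse literary contexts.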


Bibliographic Details
Main Author:	Rachid Ed-Dali
Format:	Article
Language:	English
Published:	Taylor & Francis Group 2025-12-01
Series:	Cogent Arts & Humanities
Subjects:	AI-assisted literary translation; Arabic–English translation; translation quality evaluation; pragmatic coherence; ChatGPT4.5; DeepSeek R1
Online Access:	https://www.tandfonline.com/doi/10.1080/23311983.2025.2531183

_version_	1850101625522225152
author	Rachid Ed-Dali
collection	DOAJ
description	Artificial Intelligence (AI) tools such as DeepSeek R1 and ChatGPT 4.5 have emerged as promising aids in Arabic-English literary translation. This study aims to compare the translation performance of these two systems using a mixed-methods approach. Quantitative analysis was conducted through five established evaluation metrics, BLEU, COMET, METEOR, BERTScore, and sacreBLEU, to assess accuracy, fluency, and semantic coherence. Complementing these measures, a qualitative evaluation was carried out with 80 undergraduate students from Cadi Ayyad University, who critically assessed anonymized AI-generated translations against their versions and published human translations using a structured rubric. Results indicate that DeepSeek R1 achieved consistently higher automated metric scores across literary genres (novels, plays, poems). However, qualitative analysis highlighted persistent challenges in pragmatic coherence, cohesion, and emotional depth, especially in poetry and dramatic texts. While DeepSeek R1 demonstrated potential in lexical accuracy and fluency, significant human intervention remains essential for achieving high-quality literary translations. Future research should integrate larger-scale human evaluations to comprehensively assess the capabilities and limitations of AI translation tools in diverse literary contexts.
format	Article
id	doaj-art-e6a5fe2e6069449ea708b1aecc4d521b
institution	DOAJ
issn	2331-1983
language	English
publishDate	2025-12-01
publisher	Taylor & Francis Group
record_format	Article
series	Cogent Arts & Humanities
spelling	doaj-art-e6a5fe2e6069449ea708b1aecc4d521b | 2025-08-20T02:39:58Z | eng | Taylor & Francis Group | Cogent Arts & Humanities | 2331-1983 | 2025-12-01 | 12 | 1 | 10.1080/23311983.2025.2531183 | Assessing DeepSeek R1 and ChatGPT 4.5 in Arabic-English literary translation: performance, challenges, and implications | Rachid Ed-Dali | Department of English Studies, Faculty of Letters and Human Sciences, Cadi Ayyad University, Marrakech, Morocco | Artificial Intelligence (AI) tools such as DeepSeek R1 and ChatGPT 4.5 have emerged as promising aids in Arabic-English literary translation. This study aims to compare the translation performance of these two systems using a mixed-methods approach. Quantitative analysis was conducted through five established evaluation metrics, BLEU, COMET, METEOR, BERTScore, and sacreBLEU, to assess accuracy, fluency, and semantic coherence. Complementing these measures, a qualitative evaluation was carried out with 80 undergraduate students from Cadi Ayyad University, who critically assessed anonymized AI-generated translations against their versions and published human translations using a structured rubric. Results indicate that DeepSeek R1 achieved consistently higher automated metric scores across literary genres (novels, plays, poems). However, qualitative analysis highlighted persistent challenges in pragmatic coherence, cohesion, and emotional depth, especially in poetry and dramatic texts. While DeepSeek R1 demonstrated potential in lexical accuracy and fluency, significant human intervention remains essential for achieving high-quality literary translations. Future research should integrate larger-scale human evaluations to comprehensively assess the capabilities and limitations of AI translation tools in diverse literary contexts. | https://www.tandfonline.com/doi/10.1080/23311983.2025.2531183 | AI-assisted literary translation | Arabic–English translation | translation quality evaluation | pragmatic coherence | ChatGPT4.5 | DeepSeek R1
title	Assessing DeepSeek R1 and ChatGPT 4.5 in Arabic-English literary translation: performance, challenges, and implications
topic	AI-assisted literary translation; Arabic–English translation; translation quality evaluation; pragmatic coherence; ChatGPT4.5; DeepSeek R1
url	https://www.tandfonline.com/doi/10.1080/23311983.2025.2531183


Similar Items