Assessing DeepSeek R1 and ChatGPT 4.5 in Arabic-English literary translation: performance, challenges, and implications

Bibliographic Details
Main Author: Rachid Ed-Dali
Format: Article
Language: English
Published: Taylor & Francis Group 2025-12-01
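Series: Cogent Arts & Humanities
Subjects: AI-assisted literary translation; Arabic–English translation; translation quality evaluation; pragmatic coherence; ChatGPT4.5; DeepSeek R1
Online Access: https://www.tandfonline.com/doi/10.1080/23311983.2025.2531183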
author Rachid Ed-Dali
collection DOAJ
description Artificial Intelligence (AI) tools such as DeepSeek R1 and ChatGPT 4.5 have emerged as promising aids in Arabic-English literary translation. This study compares the translation performance of these two systems using a mixed-methods approach. Quantitative analysis used five established evaluation metrics (BLEU, COMET, METEOR, BERTScore, and sacreBLEU) to assess accuracy, fluency, and semantic coherence. Complementing these measures, a qualitative evaluation was carried out with 80 undergraduate students from Cadi Ayyad University, who critically assessed anonymized AI-generated translations against their own versions and published human translations using a structured rubric. Results indicate that DeepSeek R1 achieved consistently higher automated metric scores across literary genres (novels, plays, poems). However, qualitative analysis highlighted persistent challenges in pragmatic coherence, cohesion, and emotional depth, especially in poetry and dramatic texts. While DeepSeek R1 demonstrated potential in lexical accuracy and fluency, significant human intervention remains essential for achieving high-quality literary translations. Future research should integrate larger-scale human evaluations to comprehensively assess the capabilities and limitations of AI translation tools in diverse literary contexts.
format Article
id doaj-art-e6a5fe2e6069449ea708b1aecc4d521b
institution DOAJ
issn 2331-1983
language English
publishDate 2025-12-01
publisher Taylor & Francis Group
record_format Article
series Cogent Arts & Humanities
spelling doaj-art-e6a5fe2e6069449ea708b1aecc4d521b
2025-08-20T02:39:58Z
eng
Taylor & Francis Group
Cogent Arts & Humanities
2331-1983
2025-12-01
12
1
10.1080/23311983.2025.2531183
Assessing DeepSeek R1 and ChatGPT 4.5 in Arabic-English literary translation: performance, challenges, and implications
Rachid Ed-Dali
Department of English Studies, Faculty of Letters and Human Sciences, Cadi Ayyad University, Marrakech, Morocco
https://www.tandfonline.com/doi/10.1080/23311983.2025.2531183
AI-assisted literary translation
Arabic–English translation
translation quality evaluation
pragmatic coherence
ChatGPT4.5
DeepSeek R1
title Assessing DeepSeek R1 and ChatGPT 4.5 in Arabic-English literary translation: performance, challenges, and implications
topic AI-assisted literary translation
Arabic–English translation
translation quality evaluation
pragmatic coherence
ChatGPT4.5
DeepSeek R1
url https://www.tandfonline.com/doi/10.1080/23311983.2025.2531183