Assessing DeepSeek R1 and ChatGPT 4.5 in Arabic-English literary translation: performance, challenges, and implications
Artificial Intelligence (AI) tools such as DeepSeek R1 and ChatGPT 4.5 have emerged as promising aids in Arabic-English literary translation. This study aims to compare the translation performance of these two systems using a mixed-methods approach. Quantitative analysis was conducted through five e...
| Main Author: | Rachid Ed-Dali |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Taylor & Francis Group, 2025-12-01 |
| Series: | Cogent Arts & Humanities |
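| Subjects: | AI-assisted literary translation; Arabic–English translation; translation quality evaluation; pragmatic coherence; ChatGPT4.5; DeepSeek R1 |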
| Online Access: | https://www.tandfonline.com/doi/10.1080/23311983.2025.2531183 |
| _version_ | 1850101625522225152 |
|---|---|
| author | Rachid Ed-Dali |
| collection | DOAJ |
| description | Artificial Intelligence (AI) tools such as DeepSeek R1 and ChatGPT 4.5 have emerged as promising aids in Arabic-English literary translation. This study compares the translation performance of these two systems using a mixed-methods approach. Quantitative analysis was conducted through five established evaluation metrics (BLEU, COMET, METEOR, BERTScore, and sacreBLEU) to assess accuracy, fluency, and semantic coherence. Complementing these measures, a qualitative evaluation was carried out with 80 undergraduate students from Cadi Ayyad University, who critically assessed anonymized AI-generated translations against their own versions and published human translations using a structured rubric. Results indicate that DeepSeek R1 achieved consistently higher automated metric scores across literary genres (novels, plays, poems). However, qualitative analysis highlighted persistent challenges in pragmatic coherence, cohesion, and emotional depth, especially in poetry and dramatic texts. While DeepSeek R1 demonstrated potential in lexical accuracy and fluency, significant human intervention remains essential for achieving high-quality literary translations. Future research should integrate larger-scale human evaluations to comprehensively assess the capabilities and limitations of AI translation tools in diverse literary contexts. |
| format | Article |
| id | doaj-art-e6a5fe2e6069449ea708b1aecc4d521b |
| institution | DOAJ |
| issn | 2331-1983 |
| language | English |
| publishDate | 2025-12-01 |
| publisher | Taylor & Francis Group |
| record_format | Article |
| series | Cogent Arts & Humanities |
| spelling | doaj-art-e6a5fe2e6069449ea708b1aecc4d521b; 2025-08-20T02:39:58Z; eng; Taylor & Francis Group; Cogent Arts & Humanities; 2331-1983; 2025-12-01; 12; 1; 10.1080/23311983.2025.2531183; Assessing DeepSeek R1 and ChatGPT 4.5 in Arabic-English literary translation: performance, challenges, and implications; Rachid Ed-Dali (Department of English Studies, Faculty of Letters and Human Sciences, Cadi Ayyad University, Marrakech, Morocco); https://www.tandfonline.com/doi/10.1080/23311983.2025.2531183; AI-assisted literary translation; Arabic–English translation; translation quality evaluation; pragmatic coherence; ChatGPT4.5; DeepSeek R1 |
| title | Assessing DeepSeek R1 and ChatGPT 4.5 in Arabic-English literary translation: performance, challenges, and implications |
| topic | AI-assisted literary translation; Arabic–English translation; translation quality evaluation; pragmatic coherence; ChatGPT4.5; DeepSeek R1 |
| url | https://www.tandfonline.com/doi/10.1080/23311983.2025.2531183 |