Comparing Large Language Models and Human Programmers for Generating Programming Code

Abstract: The performance of seven large language models (LLMs) in generating programming code using various prompt strategies, programming languages, and task difficulties is systematically evaluated. GPT‐4 substantially outperforms other LLMs, including Gemini Ultra and Claude 2. The coding perform...

Bibliographic Details
Main Authors: Wenpin Hou, Zhicheng Ji
Format: Article
Language: English
Published: Wiley 2025-02-01
Series: Advanced Science
Online Access: https://doi.org/10.1002/advs.202412279