Large language models underperform in European general surgery board examinations: a comparative study with experts and surgical residents