MedBench: A Comprehensive, Standardized, and Reliable Benchmarking System for Evaluating Chinese Medical Large Language Models

Ensuring the general efficacy and benefit for human beings from medical Large Language Models (LLM) before real-world deployment is crucial. However, a widely accepted and accessible evaluation process for medical LLM, especially in the Chinese context, remains to be established. In this work, we in...

Full description

Saved in:
Bibliographic Details
Main Authors: Mianxin Liu, Weiguo Hu, Jinru Ding, Jie Xu, Xiaoyang Li, Lifeng Zhu, Zhian Bai, Xiaoming Shi, Benyou Wang, Haitao Song, Pengfei Liu, Xiaofan Zhang, Shanshan Wang, Kang Li, Haofen Wang, Tong Ruan, Xuanjing Huang, Xin Sun, Shaoting Zhang
Format: Article
Language:English
Published: Tsinghua University Press 2024-12-01
Series:Big Data Mining and Analytics
Subjects:
Online Access:https://www.sciopen.com/article/10.26599/BDMA.2024.9020044
Tags: Add Tag
No Tags, Be the first to tag this record!