MedBench: A Comprehensive, Standardized, and Reliable Benchmarking System for Evaluating Chinese Medical Large Language Models

Ensuring the general efficacy and benefit for human beings from medical Large Language Models (LLMs) before real-world deployment is crucial. However, a widely accepted and accessible evaluation process for medical LLMs, especially in the Chinese context, remains to be established. In this work, we in...

Bibliographic Details
Main Authors:	Mianxin Liu, Weiguo Hu, Jinru Ding, Jie Xu, Xiaoyang Li, Lifeng Zhu, Zhian Bai, Xiaoming Shi, Benyou Wang, Haitao Song, Pengfei Liu, Xiaofan Zhang, Shanshan Wang, Kang Li, Haofen Wang, Tong Ruan, Xuanjing Huang, Xin Sun, Shaoting Zhang
Format:	Article
Language:	English
Published:	Tsinghua University Press 2024-12-01
Series:	Big Data Mining and Analytics
Subjects:	medical large language model (MLLM); benchmark; platform; open-source
Online Access:	https://www.sciopen.com/article/10.26599/BDMA.2024.9020044

Similar Items