MedBench: A Comprehensive, Standardized, and Reliable Benchmarking System for Evaluating Chinese Medical Large Language Models
Ensuring the general efficacy and benefit for human beings from medical Large Language Models (LLM) before real-world deployment is crucial. However, a widely accepted and accessible evaluation process for medical LLM, especially in the Chinese context, remains to be established. In this work, we in...
Saved in:
| Main Authors: | , , , , , , , , , , , , , , , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: |
Tsinghua University Press
2024-12-01
|
| Series: | Big Data Mining and Analytics |
| Subjects: | |
| Online Access: | https://www.sciopen.com/article/10.26599/BDMA.2024.9020044 |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Be the first to leave a comment!