Multi-Examiner: A Knowledge Graph-Driven System for Generating Comprehensive IT Questions with Higher-Order Thinking

Bibliographic Details
Main Authors: Yonggu Wang, Zeyu Yu, Zihan Wang, Zengyi Yu, Jue Wang
Format: Article
Language: English
Published: MDPI AG 2025-05-01
Series: Applied Sciences
Subjects: question generation; multi-agent systems; knowledge graphs; large language models; information technology education; Bloom’s taxonomy
Online Access: https://www.mdpi.com/2076-3417/15/10/5719
_version_ 1849711071576719360
author Yonggu Wang
Zeyu Yu
Zihan Wang
Zengyi Yu
Jue Wang
author_facet Yonggu Wang
Zeyu Yu
Zihan Wang
Zengyi Yu
Jue Wang
author_sort Yonggu Wang
collection DOAJ
description Question generation systems (QGSs) for information technology (IT) education, designed to create, evaluate, and improve multiple-choice questions (MCQs) using knowledge graphs (KGs) and large language models (LLMs), face three major needs: generating contextually relevant and accurate distractors, increasing the diversity of generated questions, and balancing the higher-order thinking demands of questions to match various learning levels. To address these needs, we proposed a multi-agent system named Multi-Examiner, which integrates KGs, domain-specific search tools, and local knowledge bases categorized according to Bloom’s taxonomy to enhance the contextual relevance, diversity, and higher-order thinking of automatically generated IT MCQs. Our methodology employed a mixed-methods approach combining system development with experimental evaluation. We first constructed a specialized architecture combining knowledge graphs with LLMs, then conducted a comparative study generating questions across six knowledge points from the K-12 Computer Science Standard. We designed a multidimensional evaluation rubric to assess semantic coherence, answer correctness, question validity, distractor relevance, question diversity, and higher-order thinking, and conducted a statistical analysis of ratings provided by 30 high school IT teachers. Results showed statistically significant improvements (p < 0.01), with Multi-Examiner outperforming GPT-4 by an average of 0.87 points (on a 5-point scale) for evaluation-level questions and 1.12 points for creation-level questions. The results demonstrated that (i) overall, questions generated by the Multi-Examiner system outperformed those generated by GPT-4 across all dimensions and closely matched the quality of human-crafted questions in several dimensions; (ii) domain-specific search tools significantly enhanced the diversity of questions generated by Multi-Examiner; and (iii) GPT-4 generated better questions for knowledge points at the “remembering” and “understanding” levels, while Multi-Examiner significantly improved the higher-order thinking of questions at the “evaluating” and “creating” levels. This study contributes to the growing body of research on AI-supported educational assessment by demonstrating how specialized knowledge structures can enhance the automated generation of higher-order thinking questions beyond what general-purpose language models achieve.
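The abstract above describes grounding MCQ generation in a knowledge graph while targeting a specific Bloom's-taxonomy level. The paper's actual architecture is not reproduced in this record; the following Python snippet is only a minimal, hypothetical sketch of that general idea, in which the toy triples, the BLOOM_STEMS verb stems, and helper names such as build_prompt and neighbors are illustrative assumptions, not the authors' implementation.

# Hypothetical sketch (not the authors' system): combining a tiny knowledge
# graph and a Bloom's-taxonomy level into an MCQ-generation prompt for an LLM.
from dataclasses import dataclass

# Toy knowledge graph as (head, relation, tail) triples for one knowledge point.
KG_TRIPLES = [
    ("binary search", "is_a", "search algorithm"),
    ("linear search", "is_a", "search algorithm"),
    ("hash lookup", "is_a", "search algorithm"),
    ("binary search", "requires", "sorted input"),
]

# Verb stems loosely aligned with Bloom's taxonomy levels (illustrative only).
BLOOM_STEMS = {
    "remembering": "State the defining property of",
    "understanding": "Explain the main idea behind",
    "evaluating": "Evaluate the suitability of",
    "creating": "Design a new task that applies",
}

@dataclass
class MCQRequest:
    knowledge_point: str
    bloom_level: str

def neighbors(entity: str, relation: str) -> list[str]:
    """Return KG entities sharing the given relation with `entity`'s tail,
    a simple way to find plausible, contextually related distractors."""
    tails = {t for h, r, t in KG_TRIPLES if h == entity and r == relation}
    return [h for h, r, t in KG_TRIPLES if r == relation and t in tails and h != entity]

def build_prompt(req: MCQRequest) -> str:
    """Assemble an LLM prompt that grounds the stem in KG facts and pins the
    question to a target Bloom's level; a full system would also inject
    search-tool results and local-knowledge-base context here."""
    facts = [f"{h} {r} {t}" for h, r, t in KG_TRIPLES if h == req.knowledge_point]
    distractor_pool = neighbors(req.knowledge_point, "is_a")
    stem = BLOOM_STEMS[req.bloom_level]
    return (
        f"Knowledge point: {req.knowledge_point}\n"
        f"Known facts: {'; '.join(facts)}\n"
        f"Candidate distractors: {', '.join(distractor_pool)}\n"
        f"Task: {stem} {req.knowledge_point}. "
        f"Write one multiple-choice question with four options and indicate the correct answer."
    )

if __name__ == "__main__":
    print(build_prompt(MCQRequest("binary search", "evaluating")))

In this sketch, the distractor pool comes from KG neighbors sharing a relation with the target concept, which is one plausible way to keep distractors contextually relevant; the Bloom's-level stem steers the LLM toward lower- or higher-order question types.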
format Article
id doaj-art-9a2b24e2603944dc8a553428cc6a983e
institution DOAJ
issn 2076-3417
language English
publishDate 2025-05-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj-art-9a2b24e2603944dc8a553428cc6a983e 2025-08-20T03:14:42Z
eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2025-05-01; vol. 15; no. 10; article 5719; 10.3390/app15105719
Multi-Examiner: A Knowledge Graph-Driven System for Generating Comprehensive IT Questions with Higher-Order Thinking
Yonggu Wang, Zeyu Yu, Zihan Wang, Zengyi Yu: College of Education, Zhejiang University of Technology, Hangzhou 310023, China
Jue Wang: Faculty of Applied Science and Engineering, University of Toronto, 35 St. George Street, Toronto, ON M5S 1A4, Canada
https://www.mdpi.com/2076-3417/15/10/5719
question generation; multi-agent systems; knowledge graphs; large language models; information technology education; Bloom’s taxonomy
spellingShingle Yonggu Wang
Zeyu Yu
Zihan Wang
Zengyi Yu
Jue Wang
Multi-Examiner: A Knowledge Graph-Driven System for Generating Comprehensive IT Questions with Higher-Order Thinking
Applied Sciences
question generation
multi-agent systems
knowledge graphs
large language models
information technology education
Bloom’s taxonomy
title Multi-Examiner: A Knowledge Graph-Driven System for Generating Comprehensive IT Questions with Higher-Order Thinking
title_full Multi-Examiner: A Knowledge Graph-Driven System for Generating Comprehensive IT Questions with Higher-Order Thinking
title_fullStr Multi-Examiner: A Knowledge Graph-Driven System for Generating Comprehensive IT Questions with Higher-Order Thinking
title_full_unstemmed Multi-Examiner: A Knowledge Graph-Driven System for Generating Comprehensive IT Questions with Higher-Order Thinking
title_short Multi-Examiner: A Knowledge Graph-Driven System for Generating Comprehensive IT Questions with Higher-Order Thinking
title_sort multi examiner a knowledge graph driven system for generating comprehensive it questions with higher order thinking
topic question generation
multi-agent systems
knowledge graphs
large language models
information technology education
Bloom’s taxonomy
url https://www.mdpi.com/2076-3417/15/10/5719
work_keys_str_mv AT yongguwang multiexamineraknowledgegraphdrivensystemforgeneratingcomprehensiveitquestionswithhigherorderthinking
AT zeyuyu multiexamineraknowledgegraphdrivensystemforgeneratingcomprehensiveitquestionswithhigherorderthinking
AT zihanwang multiexamineraknowledgegraphdrivensystemforgeneratingcomprehensiveitquestionswithhigherorderthinking
AT zengyiyu multiexamineraknowledgegraphdrivensystemforgeneratingcomprehensiveitquestionswithhigherorderthinking
AT juewang multiexamineraknowledgegraphdrivensystemforgeneratingcomprehensiveitquestionswithhigherorderthinking