LLMs in Education: Evaluation GPT and BERT Models in Student Comment Classification

The incorporation of artificial intelligence in educational contexts has significantly transformed the support provided to students facing learning difficulties, facilitating both the management of their educational process and their emotions. Additionally, online comments play a vital role in under...

Bibliographic Details
Main Authors: Anabel Pilicita, Enrique Barra
Format: Article
Language: English
Published: MDPI AG 2025-05-01
Series: Multimodal Technologies and Interaction
Subjects: LLMs, NLP, transformers, education, BERT, GPT
Online Access: https://www.mdpi.com/2414-4088/9/5/44
_version_ 1849326854221070336
author Anabel Pilicita
Enrique Barra
collection DOAJ
description The incorporation of artificial intelligence in educational contexts has significantly transformed the support provided to students facing learning difficulties, facilitating both the management of their educational process and their emotions. Additionally, online comments play a vital role in understanding student feelings. Analyzing comments on social media platforms can help identify students in vulnerable situations so that timely interventions can be implemented. However, manually analyzing student-generated content on social media platforms is challenging due to the large amount of data and the frequency with which it is posted. In this sense, the recent revolution in artificial intelligence, marked by the implementation of powerful large language models (LLMs), may contribute to the classification of student comments. This study compared the effectiveness of a supervised learning approach using five different LLMs: bert-base-uncased, roberta-base, gpt-4o-mini-2024-07-18, gpt-3.5-turbo-0125, and gpt-neo-125m. The evaluation was carried out after fine-tuning them specifically to classify student comments on social media platforms with anxiety/depression or neutral labels. The results obtained were as follows: gpt-4o-mini-2024-07-18 and gpt-3.5-turbo-0125 obtained 98.93%, roberta-base 98.14%, bert-base-uncased 97.13%, and gpt-neo-125m 96.43%. Therefore, when comparing the effectiveness of these models, it was determined that all LLMs performed well in this classification task.
format Article
id doaj-art-9fae3031873745c0b504bb9cdcaf3a11
institution Kabale University
issn 2414-4088
language English
publishDate 2025-05-01
publisher MDPI AG
record_format Article
series Multimodal Technologies and Interaction
spelling doaj-art-9fae3031873745c0b504bb9cdcaf3a11
2025-08-20T03:48:02Z
eng
MDPI AG
Multimodal Technologies and Interaction
2414-4088
2025-05-01
vol. 9, no. 5, art. 44
10.3390/mti9050044
LLMs in Education: Evaluation GPT and BERT Models in Student Comment Classification
Anabel Pilicita, Departamento de Ingeniería de Sistemas Telemáticos, Escuela Técnica Superior de Ingenieros de Telecomunicación, Universidad Politécnica de Madrid, 28040 Madrid, Spain
Enrique Barra, Departamento de Ingeniería de Sistemas Telemáticos, Escuela Técnica Superior de Ingenieros de Telecomunicación, Universidad Politécnica de Madrid, 28040 Madrid, Spain
title LLMs in Education: Evaluation GPT and BERT Models in Student Comment Classification
topic LLMs
NLP
transformers
education
BERT
GPT
url https://www.mdpi.com/2414-4088/9/5/44