Analysis and Prediction of CET4 Scores Based on Data Mining Algorithm

This paper introduces the concepts and algorithms of data mining, with a focus on linear regression. Using multiple linear regression, it analyzes the factors that affect CET-4 scores. Ideas based on data mining, collecting history data and appropriate to transform, using statistica...

Bibliographic Details
Main Author: Hongyan Wang
Format: Article
Language: English
Published: Wiley 2021-01-01
Series: Complexity
Online Access: http://dx.doi.org/10.1155/2021/5577868
author Hongyan Wang
collection DOAJ
description This paper introduces the concepts and algorithms of data mining, with a focus on linear regression. Using multiple linear regression, it analyzes the factors that affect CET-4 scores: historical data are collected and suitably transformed, and statistical analysis is applied to relate CET-4 results to their influencing factors. The linear regression model was found to fit the data relatively well. The paper then refines the approach and establishes a partition-weighted K-nearest neighbor algorithm. The weighted K-nearest neighbor algorithm and the partition algorithm are applied to classification prediction of CET-4 scores; statistical methods are used to study and screen the factors that affect the scores, and the pass/fail predictions are checked by comparative verification. Both the input features and the neighbors are weighted. Although the partition algorithm does not significantly improve the classification results, its classification is more stable than that of the standard K-nearest neighbor algorithm, classification time is greatly reduced, and classification efficiency increases by 119%. To identify students at risk of not graduating earlier, the paper also proposes a K-nearest neighbor classification model for timely early warning. Taking test scores, make-up exams, and retaken courses as input features, the model can effectively predict which students will not graduate normally.
format Article
id doaj-art-2dfc8c2972304d46b1c0200d873b3119
institution DOAJ
issn 1076-2787
1099-0526
language English
publishDate 2021-01-01
publisher Wiley
record_format Article
series Complexity
spelling doaj-art-2dfc8c2972304d46b1c0200d873b3119; 2025-08-20T03:20:26Z; eng; Wiley; Complexity; 1076-2787; 1099-0526; 2021-01-01; 2021; 10.1155/2021/5577868; 5577868; Analysis and Prediction of CET4 Scores Based on Data Mining Algorithm; Hongyan Wang (School of Foreign Languages, Xi’an University of Finance and Economics, Xi’an 710000, China); http://dx.doi.org/10.1155/2021/5577868
title Analysis and Prediction of CET4 Scores Based on Data Mining Algorithm
url http://dx.doi.org/10.1155/2021/5577868