A Framework to Predict the Quality of a Video for Popularity on Social Media

Bibliographic Details
Main Authors: Abqa Javed, Nimra Abid, Muhammad Shoaib, Muhammad Farrukh Shahzad, Fahad Sabah, Raheem Sarwar
Format: Article
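Language: English
Published: Wiley 2025-06-01
Series: Engineering Reports
Subjects: Google Apps Script; Natural Language Processing; opinion mining; social media analysis; video classification; YouTube data APIs
Online Access: https://doi.org/10.1002/eng2.70250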
collection DOAJ
description ABSTRACT YouTube has become a dominant force in digital media, yet current video popularity analytics remain limited in capturing the emotional and cultural dimensions of viewer engagement, particularly in underrepresented regions like Pakistan. While existing research focuses predominantly on Western markets and quantitative metrics (views, likes, comments), these approaches overlook sentiment‐driven interactions critical to understanding regional audience behavior. This study bridges this gap by introducing a sentiment‐aware framework for YouTube video classification in Pakistan, combining traditional popularity metrics with advanced sentiment analysis of user comments. We curated the PAK VIDEOS (2021–2023) dataset using YouTube Data APIs, comprising metadata and user comments from Pakistan's top trending videos. Leveraging Natural Language Processing (NLP) techniques, we extracted sentiment scores from comments to classify videos into four categories: non‐popular, overwhelmingly positive, overwhelmingly negative, and neutral. This hybrid approach enabled a nuanced evaluation of content reception beyond quantitative metrics. Four machine learning models—random forest, stochastic gradient descent classifier (SGDC), gradient boosting, and XGBoost—were evaluated for classification. XGBoost achieved superior performance (84.3% accuracy), outperforming baseline models by up to 20%. Our framework demonstrates that integrating sentiment analysis significantly enhances popularity prediction, particularly in culturally distinct contexts.
id doaj-art-438ede7ad18245c6a5ec7ebb9ee2be5e
institution Kabale University
issn 2577-8196
spelling doaj-art-438ede7ad18245c6a5ec7ebb9ee2be5e 2025-08-20T03:32:45Z eng Wiley Engineering Reports 2577-8196 2025-06-01 7 6 n/a n/a 10.1002/eng2.70250
Abqa Javed (0): Department of Computer Science, University of Engineering and Technology, Lahore, Pakistan
Nimra Abid (1): Department of Computer Science, University of Engineering and Technology, Lahore, Pakistan
Muhammad Shoaib (2): Department of Computer Science, University of Engineering and Technology, Lahore, Pakistan
Muhammad Farrukh Shahzad (3): College of Economics & Management, Beijing University of Technology, Beijing, China
Fahad Sabah (4): College of Computer Science, Beijing University of Technology, Beijing, China
Raheem Sarwar (5): OTEHM, Faculty of Business and Law, Manchester Metropolitan University, Manchester, UK
topic Google Apps Script
Natural Language Processing
opinion mining
social media analysis
video classification
YouTube data APIs