Linguistic Analysis of Hindi-English Mixed Tweets for Depression Detection

According to recent studies, young adults in India have faced mental health issues due to university closures and loss of income, including low self-esteem and distress, and have reported symptoms of anxiety and/or depressive disorder (43%). This makes a solution urgent. A new classifier propo...

Bibliographic Details
Main Authors: Carmel Mary Belinda M J, Ravikumar S, Muhammad Arif, Dhilip Kumar V, Antony Kumar K, Arulkumaran G
Format: Article
Language: English
Published: Wiley 2022-01-01
Series: Journal of Mathematics
Online Access: http://dx.doi.org/10.1155/2022/3225920
author Carmel Mary Belinda M J
Ravikumar S
Muhammad Arif
Dhilip Kumar V
Antony Kumar K
Arulkumaran G
collection DOAJ
description According to recent studies, young adults in India have faced mental health issues due to university closures and loss of income, including low self-esteem and distress, and have reported symptoms of anxiety and/or depressive disorder (43%). This makes a solution urgent. A new classifier is proposed to identify individuals who may be experiencing depression, based on their tweets on the social media platform Twitter. The proposed model is based on linguistic analysis and text classification, with probabilities calculated using TF-IDF (term frequency-inverse document frequency). Indians tend to tweet predominantly in English, Hindi, or a mix of the two (colloquially known as Hinglish). In the proposed approach, data are collected from Twitter and screened by a classifier built with the multinomial Naive Bayes algorithm and grid search, the latter being used for hyperparameter optimization. Each tweet is classified as depressed or not depressed. The entire architecture works over both English and Hindi, which should help with implementation globally and across multiple platforms and help curb rising depression rates in a methodical and automated manner. The proposed pipeline combines these techniques to improve results, attaining 96.15% accuracy and an F1 score of 0.914.
format Article
id doaj-art-9daa4241fdd2497fabda9e2f5c55b302
institution Kabale University
issn 2314-4785
language English
publishDate 2022-01-01
publisher Wiley
record_format Article
series Journal of Mathematics
spelling doaj-art-9daa4241fdd2497fabda9e2f5c55b302 2025-02-03T06:05:51Z eng Wiley Journal of Mathematics 2314-4785 2022-01-01 2022 10.1155/2022/3225920 Linguistic Analysis of Hindi-English Mixed Tweets for Depression Detection
Carmel Mary Belinda M J (Department of Computer Science & Engineering)
Ravikumar S (Department of Computer Science & Engineering)
Muhammad Arif (Department of Computer Science and Information Technology)
Dhilip Kumar V (Department of Computer Science & Engineering)
Antony Kumar K (Department of Computer Science & Engineering)
Arulkumaran G (Department of Electrical and Computer Engineering)
http://dx.doi.org/10.1155/2022/3225920
title Linguistic Analysis of Hindi-English Mixed Tweets for Depression Detection
url http://dx.doi.org/10.1155/2022/3225920
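
The pipeline named in the description (TF-IDF features, a multinomial Naive Bayes classifier, and grid search over a hyperparameter) can be illustrated with a minimal sketch. This is not the authors' implementation: the example tweets, the labels, the smoothing grid `ALPHAS`, the leave-one-out scoring, and every function name below are assumptions made for illustration, and the code uses only the Python standard library rather than whatever tooling the article used.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical toy data (invented for this sketch, not from the article).
TEXTS = [
    "feeling hopeless and alone, nothing matters",
    "bahut udaas hoon, kuch accha nahi lagta",
    "I can't sleep, so tired of everything",
    "akela feel kar raha hoon, hopeless",
    "great match today, maza aa gaya",
    "excited for the trip with friends",
    "aaj ka din bahut accha tha, happy",
    "loved the movie, full paisa vasool",
]
LABELS = ["depressed"] * 4 + ["not depressed"] * 4
ALPHAS = [0.1, 0.5, 1.0, 2.0]  # assumed grid for the smoothing parameter


def tokenize(text):
    # Lowercase, then keep runs of Latin or Devanagari characters as tokens.
    return re.findall(r"[a-z\u0900-\u097f]+", text.lower())


def fit_idf(docs):
    # Smoothed inverse document frequency: ln((1 + n) / (1 + df)) + 1.
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def tfidf(doc, idf):
    # Term frequency times IDF; tokens unseen during fitting are dropped.
    tf = Counter(t for t in doc if t in idf)
    return {t: c * idf[t] for t, c in tf.items()}


def train_nb(vectors, labels, alpha):
    # Multinomial Naive Bayes over TF-IDF weights with additive smoothing.
    classes = sorted(set(labels))
    vocab = {t for v in vectors for t in v}
    prior = {c: math.log(labels.count(c) / len(labels)) for c in classes}
    weight = {c: defaultdict(float) for c in classes}
    for v, y in zip(vectors, labels):
        for t, w in v.items():
            weight[y][t] += w
    loglik = {}
    for c in classes:
        total = sum(weight[c].values()) + alpha * len(vocab)
        loglik[c] = {t: math.log((weight[c][t] + alpha) / total) for t in vocab}
    return prior, loglik


def predict(model, vector):
    # Pick the class with the highest log prior plus weighted log likelihood.
    prior, loglik = model
    scores = {
        c: prior[c] + sum(w * loglik[c][t] for t, w in vector.items() if t in loglik[c])
        for c in prior
    }
    return max(scores, key=scores.get)


def loo_accuracy(texts, labels, alpha):
    # Leave-one-out cross-validation accuracy for one value of alpha.
    docs = [tokenize(t) for t in texts]
    hits = 0
    for i in range(len(docs)):
        train_docs = docs[:i] + docs[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        idf = fit_idf(train_docs)
        model = train_nb([tfidf(d, idf) for d in train_docs], train_y, alpha)
        hits += predict(model, tfidf(docs[i], idf)) == labels[i]
    return hits / len(docs)


def grid_search(texts, labels, alphas):
    # Exhaustive search: keep the alpha with the best cross-validated accuracy.
    return max(alphas, key=lambda a: loo_accuracy(texts, labels, a))


best_alpha = grid_search(TEXTS, LABELS, ALPHAS)
DOCS = [tokenize(t) for t in TEXTS]
IDF = fit_idf(DOCS)
MODEL = train_nb([tfidf(d, IDF) for d in DOCS], LABELS, best_alpha)


def classify(text):
    # Label a new tweet with the model refit on all toy data.
    return predict(MODEL, tfidf(tokenize(text), IDF))


print(best_alpha, classify("feeling hopeless aur akela"))
```

With only eight toy tweets the cross-validated scores are not meaningful; the point is the order of operations: fit the IDF weights on training data only, train the classifier for each candidate hyperparameter, and keep the best-scoring value before refitting on all data.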