Linguistic Analysis of Hindi-English Mixed Tweets for Depression Detection
According to recent studies, young adults in India faced mental health issues such as low self-esteem and distress, attributed to university closures and loss of income, with 43% reporting symptoms of anxiety and/or depressive disorder. This makes it high time to develop a solution. A new classifier propo...
Main Authors: | Carmel Mary Belinda M J, Ravikumar S, Muhammad Arif, Dhilip Kumar V, Antony Kumar K, Arulkumaran G |
---|---|
Format: | Article |
Language: | English |
Published: | Wiley 2022-01-01 |
Series: | Journal of Mathematics |
Online Access: | http://dx.doi.org/10.1155/2022/3225920 |
_version_ | 1832550767362310144 |
---|---|
author | Carmel Mary Belinda M J; Ravikumar S; Muhammad Arif; Dhilip Kumar V; Antony Kumar K; Arulkumaran G |
author_sort | Carmel Mary Belinda M J |
collection | DOAJ |
description | According to recent studies, young adults in India faced mental health issues such as low self-esteem and distress, attributed to university closures and loss of income, with 43% reporting symptoms of anxiety and/or depressive disorder. This makes it high time to develop a solution. A new classifier is proposed to identify individuals who may have depression based on their tweets on the social media platform Twitter. The proposed model is based on linguistic analysis and text classification, with probabilities calculated using TF-IDF (term frequency-inverse document frequency) features. Indians tend to tweet predominantly in English, Hindi, or a mix of the two languages (colloquially known as Hinglish). In the proposed approach, tweets are collected from Twitter and screened by passing them through a classifier built using the multinomial Naive Bayes algorithm and grid search, the latter being used for hyperparameter optimization. Each tweet is classified as depressed or not depressed. The architecture works on both English and Hindi, which should aid deployment globally and across multiple platforms and help curb rising depression rates in a methodical and automated manner. The proposed model pipeline combines these techniques and attains 96.15% accuracy and an F1 score of 0.914. |
format | Article |
id | doaj-art-9daa4241fdd2497fabda9e2f5c55b302 |
institution | Kabale University |
issn | 2314-4785 |
language | English |
publishDate | 2022-01-01 |
publisher | Wiley |
record_format | Article |
series | Journal of Mathematics |
spelling | doaj-art-9daa4241fdd2497fabda9e2f5c55b302; 2025-02-03T06:05:51Z; eng; Wiley; Journal of Mathematics; 2314-4785; 2022-01-01; 2022; 10.1155/2022/3225920; Linguistic Analysis of Hindi-English Mixed Tweets for Depression Detection; Carmel Mary Belinda M J (0), Ravikumar S (1), Muhammad Arif (2), Dhilip Kumar V (3), Antony Kumar K (4), Arulkumaran G (5); Department of Computer Science & Engineering, Department of Computer Science & Engineering, Department of Computer Science and Information Technology, Department of Computer Science & Engineering, Department of Computer Science & Engineering, Department of Electrical and Computer Engineering; http://dx.doi.org/10.1155/2022/3225920 |
title | Linguistic Analysis of Hindi-English Mixed Tweets for Depression Detection |
url | http://dx.doi.org/10.1155/2022/3225920 |
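
The pipeline summarized in the description (TF-IDF features, a multinomial Naive Bayes classifier, and grid search for hyperparameter optimization) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' implementation: the whitespace tokenizer, the smoothed IDF formula, the choice of the smoothing parameter `alpha` as the tuned hyperparameter, the held-out validation split used in place of cross-validation, and the toy tweets in `TOY_TRAIN` and `TOY_VAL` are all assumptions made for illustration, since the record does not specify them.

```python
import math
from collections import Counter, defaultdict

# Invented toy examples (not from the paper's dataset), mixing English and Hinglish.
TOY_TRAIN = [
    ("i feel so hopeless and alone", "depressed"),
    ("bahut udaas hoon aaj kuch accha nahi lagta", "depressed"),
    ("zindagi se thak gaya hoon feeling empty", "depressed"),
    ("aaj ka din bahut accha tha feeling happy", "not depressed"),
    ("had a great time with friends", "not depressed"),
    ("khush hoon exam results are great", "not depressed"),
]
TOY_VAL = [
    ("so udaas and empty", "depressed"),
    ("great day with friends feeling happy", "not depressed"),
]

def tokenize(text):
    # Assumed preprocessing: lowercase and split on whitespace.
    return text.lower().split()

def fit_idf(docs):
    # Smoothed inverse document frequency: log((1 + N) / (1 + df)) + 1.
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(tokenize(doc)))
    return {tok: math.log((1 + n) / (1 + c)) + 1.0 for tok, c in df.items()}

def tfidf(doc, idf):
    # Term count times IDF; tokens unseen when fitting the IDF table are dropped.
    tf = Counter(tokenize(doc))
    return {tok: cnt * idf[tok] for tok, cnt in tf.items() if tok in idf}

def train_nb(docs, labels, idf, alpha):
    # Multinomial Naive Bayes over TF-IDF weights with additive smoothing alpha.
    weights = defaultdict(lambda: defaultdict(float))
    for doc, label in zip(docs, labels):
        for tok, w in tfidf(doc, idf).items():
            weights[label][tok] += w
    model = {}
    for label, n_docs in Counter(labels).items():
        total = sum(weights[label].values()) + alpha * len(idf)
        log_lik = {tok: math.log((weights[label][tok] + alpha) / total) for tok in idf}
        model[label] = (math.log(n_docs / len(docs)), log_lik)
    return model

def predict(model, doc, idf):
    # Pick the class with the highest log prior plus weighted log likelihood.
    feats = tfidf(doc, idf)

    def score(label):
        prior, log_lik = model[label]
        return prior + sum(w * log_lik[tok] for tok, w in feats.items())

    return max(model, key=score)

def f1_score(y_true, y_pred, positive):
    # F1 = 2*TP / (2*TP + FP + FN) for the chosen positive class.
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

def grid_search(train, val, alphas):
    # Try each alpha, keep the one with the best validation accuracy
    # (the first candidate wins ties).
    texts, labels = zip(*train)
    idf = fit_idf(texts)
    best = None
    for alpha in alphas:
        model = train_nb(texts, labels, idf, alpha)
        acc = sum(predict(model, t, idf) == y for t, y in val) / len(val)
        if best is None or acc > best[3]:
            best = (alpha, model, idf, acc)
    return best

if __name__ == "__main__":
    alpha, model, idf, acc = grid_search(TOY_TRAIN, TOY_VAL, [0.1, 0.5, 1.0])
    print("selected alpha:", alpha, "validation accuracy:", acc)
    print(predict(model, "feeling hopeless and udaas", idf))
```

On a real corpus, a library implementation with k-fold cross-validated grid search would normally replace the single held-out split used here; the sketch only shows how the three components named in the abstract fit together.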