Leveraging autoencoder models and data augmentation to uncover transcriptomic diversity of gingival keratinocytes in single cell analysis

Abstract Periodontitis, a chronic inflammatory condition of the periodontium, is associated with over 60 systemic diseases. Despite advancements, precision medicine approaches have had limited success, emphasizing the need for deeper insights into cellular subpopulations and structural immunity, par...

Bibliographic Details
Main Authors: Pradeep Kumar Yadalam, Prabhu Manickam Natarajan, Carlos M. Ardila
Format: Article
Language: English
Published: Nature Portfolio 2025-07-01
Series: Scientific Reports
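Subjects: Single-cell RNA sequencing; Autoencoder models; Transcriptomics; Periodontal disease; Keratinocyte diversity; Deep learning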
Online Access: https://doi.org/10.1038/s41598-025-08027-w
author Pradeep Kumar Yadalam
Prabhu Manickam Natarajan
Carlos M. Ardila
collection DOAJ
description Abstract Periodontitis, a chronic inflammatory condition of the periodontium, is associated with over 60 systemic diseases. Despite advancements, precision medicine approaches have had limited success, emphasizing the need for deeper insights into cellular subpopulations and structural immunity, particularly gingival keratinocytes. This study employs autoencoder models and data augmentation techniques to explore the transcriptomic diversity of gingival keratinocytes at the single-cell level. Single-cell RNA sequencing data from GSE266897 were processed using the Scanpy library, with quality control implemented to filter cells based on predefined metrics. Clustering was performed using principal component analysis (PCA) and k-nearest neighbor (KNN) algorithms. Marker gene identification and differential expression analysis were used to characterize cell clusters. Visualization techniques, including UMAP, heatmaps, dot plots, and violin plots, provided insights into gene expression patterns. The autoencoder architecture featured an encoder reducing input size to 256 units with ReLU activation, a bottleneck layer, and a decoder restoring data dimensions. The basic Autoencoder (AE) demonstrated superior performance, achieving the lowest loss (0.725), the highest accuracy (0.695), and minimal false positives. The Test-Time Augmentation AE also performed robustly, achieving an F1 score of 0.642 and an AUC-ROC of 0.800. The Basic AE effectively modeled RNA-seq data complexity compared to Variational and Denoising Autoencoders. This study highlights advanced computational techniques to investigate gingival keratinocytes’ transcriptomic diversity, revealing distinct subpopulations and differential gene expression profiles. These findings underscore the active role of keratinocytes in periodontal health and inflammatory responses, contributing to precision medicine approaches in periodontology.
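The architecture named in the description (an encoder reducing the input to 256 units with ReLU activation, a bottleneck layer, and a decoder restoring the data dimensions) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The bottleneck width of 32, the mirrored 256-unit decoder layer, the linear output layer, the mean-squared-error loss, the random untrained weights, and the toy Poisson count matrix are all assumptions made for illustration; the record does not state any of them. The sketch only shows how the layer shapes fit together in a single forward pass.

```python
import numpy as np


def init_autoencoder(n_genes, hidden=256, bottleneck=32, seed=0):
    """Random (untrained) weights for n_genes -> hidden -> bottleneck -> hidden -> n_genes.

    hidden=256 follows the record's description; bottleneck=32 is an assumed value.
    """
    rng = np.random.default_rng(seed)
    sizes = [n_genes, hidden, bottleneck, hidden, n_genes]
    return [
        (rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out)),
         np.zeros(fan_out))
        for fan_in, fan_out in zip(sizes[:-1], sizes[1:])
    ]


def relu(x):
    return np.maximum(x, 0.0)


def forward(params, x):
    """Return (bottleneck code, reconstruction) for a batch of cells."""
    (w1, b1), (w2, b2), (w3, b3), (w4, b4) = params
    h = relu(x @ w1 + b1)          # encoder: n_genes -> 256 units, ReLU
    z = relu(h @ w2 + b2)          # bottleneck layer (assumed width)
    h_dec = relu(z @ w3 + b3)      # decoder hidden layer (assumed mirror of encoder)
    x_hat = h_dec @ w4 + b4        # linear output restores the n_genes dimension
    return z, x_hat


def mse(x, x_hat):
    """Mean squared reconstruction error (an assumed loss, not taken from the record)."""
    return float(np.mean((x - x_hat) ** 2))


# Toy stand-in for a log-normalized cells x genes expression matrix.
data_rng = np.random.default_rng(1)
cells = np.log1p(data_rng.poisson(1.0, size=(8, 2000)).astype(float))

params = init_autoencoder(n_genes=cells.shape[1])
code, recon = forward(params, cells)
print(code.shape, recon.shape, mse(cells, recon))
```

In practice the weights would be fitted by minimizing the reconstruction loss over the filtered cells; the training procedure and hyperparameters are not given in this record, so they are left out here.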
format Article
id doaj-art-e51807e98d9842c3b1d514b9df84c27d
institution Kabale University
issn 2045-2322
language English
publishDate 2025-07-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj-art-e51807e98d9842c3b1d514b9df84c27d
2025-08-20T03:45:20Z
eng
Nature Portfolio
Scientific Reports
2045-2322
2025-07-01
15
1
1
19
10.1038/s41598-025-08027-w
Leveraging autoencoder models and data augmentation to uncover transcriptomic diversity of gingival keratinocytes in single cell analysis
Pradeep Kumar Yadalam (0)
Prabhu Manickam Natarajan (1)
Carlos M. Ardila (2)
Department of Periodontics, Saveetha Institute of Medical and Technical Sciences, Saveetha Dental College and Hospital, Saveetha University
Department of Clinical Sciences, Center of Medical and Bio-allied Health Sciences and Research, College of Dentistry, Ajman University
Basic Sciences Department, Faculty of Dentistry, Biomedical Stomatology Research Group, Universidad de Antioquia U de A
Abstract Periodontitis, a chronic inflammatory condition of the periodontium, is associated with over 60 systemic diseases. Despite advancements, precision medicine approaches have had limited success, emphasizing the need for deeper insights into cellular subpopulations and structural immunity, particularly gingival keratinocytes. This study employs autoencoder models and data augmentation techniques to explore the transcriptomic diversity of gingival keratinocytes at the single-cell level. Single-cell RNA sequencing data from GSE266897 were processed using the Scanpy library, with quality control implemented to filter cells based on predefined metrics. Clustering was performed using principal component analysis (PCA) and k-nearest neighbor (KNN) algorithms. Marker gene identification and differential expression analysis were used to characterize cell clusters. Visualization techniques, including UMAP, heatmaps, dot plots, and violin plots, provided insights into gene expression patterns. The autoencoder architecture featured an encoder reducing input size to 256 units with ReLU activation, a bottleneck layer, and a decoder restoring data dimensions. The basic Autoencoder (AE) demonstrated superior performance, achieving the lowest loss (0.725), the highest accuracy (0.695), and minimal false positives. The Test-Time Augmentation AE also performed robustly, achieving an F1 score of 0.642 and an AUC-ROC of 0.800. The Basic AE effectively modeled RNA-seq data complexity compared to Variational and Denoising Autoencoders. This study highlights advanced computational techniques to investigate gingival keratinocytes’ transcriptomic diversity, revealing distinct subpopulations and differential gene expression profiles. These findings underscore the active role of keratinocytes in periodontal health and inflammatory responses, contributing to precision medicine approaches in periodontology.
https://doi.org/10.1038/s41598-025-08027-w
Single-cell RNA sequencing
Autoencoder models
Transcriptomics
Periodontal disease
Keratinocyte diversity
Deep learning
title Leveraging autoencoder models and data augmentation to uncover transcriptomic diversity of gingival keratinocytes in single cell analysis
topic Single-cell RNA sequencing
Autoencoder models
Transcriptomics
Periodontal disease
Keratinocyte diversity
Deep learning
url https://doi.org/10.1038/s41598-025-08027-w