Electronic Health Record classification and analysis using NLP Techniques

This paper presents an automated system for the classification and analysis of Electronic Health Records (EHRs) using Natural Language Processing (NLP) techniques. The proposed solution integrates text extraction from PDFs and NLP methods to identify and classify EHR content effectively.

Bibliographic Details
Main Authors: Himavamshi K., Tejaswini D., Sethi Gaurav, Anusuya Devi V.S, Pavani P., Hariharan Shanmugasundaram
Format: Article
Language: English
Published: EDP Sciences 2025-01-01
Series: E3S Web of Conferences
Subjects: ehr, nlp, pymupdf, stopword, healthcare
Online Access: https://www.e3s-conferences.org/articles/e3sconf/pdf/2025/19/e3sconf_icsget2025_03016.pdf
author Himavamshi K.
Tejaswini D.
Sethi Gaurav
Anusuya Devi V.S
Pavani P.
Hariharan Shanmugasundaram
collection DOAJ
description This paper presents an automated system for the classification and analysis of Electronic Health Records (EHRs) using Natural Language Processing (NLP) techniques. The proposed solution integrates text extraction from PDFs and NLP methods to identify and classify EHR content effectively. By leveraging Python libraries such as PyMuPDF for text extraction and applying NLP preprocessing techniques, the system can handle both structured and unstructured data, providing enhanced accuracy in EHR identification. The approach is validated using a set of EHR and non-EHR documents, achieving promising results in classification accuracy.
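The description above names PyMuPDF for text extraction, NLP preprocessing (the keywords mention stopwords), and a split between EHR and non-EHR documents, but the record does not say which classifier the authors used. The following is only a minimal sketch of such a pipeline under stated assumptions: the stopword list, the indicator terms in `EHR_TERMS`, the keyword-ratio score, and the `threshold` value are all hypothetical and are not taken from the paper.

```python
import re

# Hypothetical, deliberately tiny stopword list; the record does not say
# which list the authors used.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "is",
             "for", "on", "with", "was", "at", "by"}

# Hypothetical indicator terms for EHR-like text (an assumption, not from the paper).
EHR_TERMS = {"patient", "diagnosis", "medication", "allergies",
             "prescription", "physician", "symptoms", "dosage"}


def extract_text(pdf_path):
    """Return the concatenated text of every page in a PDF using PyMuPDF."""
    import fitz  # PyMuPDF; imported lazily so the other functions run without it

    with fitz.open(pdf_path) as doc:
        return "\n".join(page.get_text() for page in doc)


def preprocess(text):
    """Lowercase, keep alphabetic tokens only, and drop stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def ehr_score(tokens):
    """Fraction of tokens that are (hypothetical) EHR indicator terms."""
    if not tokens:
        return 0.0
    return sum(t in EHR_TERMS for t in tokens) / len(tokens)


def classify(text, threshold=0.1):
    """Label text as 'EHR' or 'non-EHR' using a simple keyword-ratio rule."""
    return "EHR" if ehr_score(preprocess(text)) >= threshold else "non-EHR"
```

A document on disk could then be labeled with `classify(extract_text("record.pdf"))`, where `record.pdf` is a placeholder path. A keyword ratio is used here only because it is easy to inspect; a trained text classifier could be substituted at the `classify` step without changing the extraction or preprocessing functions.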
format Article
id doaj-art-4c73bb8e722c47adb509997168fd4bca
institution DOAJ
issn 2267-1242
language English
publishDate 2025-01-01
publisher EDP Sciences
record_format Article
series E3S Web of Conferences
spelling doaj-art-4c73bb8e722c47adb509997168fd4bca
2025-08-20T03:04:30Z
eng
EDP Sciences
E3S Web of Conferences
2267-1242
2025-01-01
619
03016
10.1051/e3sconf/202561903016
e3sconf_icsget2025_03016
Electronic Health Record classification and analysis using NLP Techniques
Himavamshi K. (0), Department of Artificial Intelligence and Data Science, Vardhaman College of Engineering
Tejaswini D. (1), Department of Artificial Intelligence and Data Science, Vardhaman College of Engineering
Sethi Gaurav (2), Department of Artificial Intelligence and Data Science, Lovely Professional University
Anusuya Devi V.S (3), Department of Applied Sciences, New Horizon College of Engineering
Pavani P. (4), Department of Artificial Intelligence and Data Science, Vardhaman College of Engineering
Hariharan Shanmugasundaram (5), Department of Artificial Intelligence and Data Science, Vardhaman College of Engineering
https://www.e3s-conferences.org/articles/e3sconf/pdf/2025/19/e3sconf_icsget2025_03016.pdf
ehr
nlp
pymupdf
stopword
healthcare
title Electronic Health Record classification and analysis using NLP Techniques
topic ehr
nlp
pymupdf
stopword
healthcare
url https://www.e3s-conferences.org/articles/e3sconf/pdf/2025/19/e3sconf_icsget2025_03016.pdf