Knowledge-based Word Tokenization System for Urdu

Word tokenization, a foundational step in natural language processing (NLP), is critical for tasks like part-of-speech tagging, named entity recognition, and parsing, as well as various independent NLP applications. In our tech-driven era, the exponential growth of textual data on the World Wide Web...

Bibliographic Details
Main Authors:	Asif Khan, Khairullah Khan, Wahab Khan, Sadiq Nawaz Khan, Rafiul Haq
Format:	Article
Language:	English
Published:	MMU Press 2024-06-01
Series:	Journal of Informatics and Web Engineering
Subjects:	natural language processing (NLP); Urdu language processing (ULP); forward maximum matching (FMM); reverse maximum matching (RMM); part-of-speech tagging (POS)
Online Access:	https://journals.mmupress.com/index.php/jiwe/article/view/902

Similar Items

Paraphrase detection for Urdu language text using fine-tune BiLSTM framework
by: Muhammad Ali Aslam, et al.
Published: (2025-05-01)

A Computational Approach to Understanding Agglutinative Structures in Urdu
by: Muhammad Shoaib Tahir, et al.
Published: (2024-09-01)

Automatic grammatical tagger for a Spanish–Mixtec parallel corpus
by: Hermilo Santiago-Benito, et al.
Published: (2025-02-01)

UrduSER: A comprehensive dataset for speech emotion recognition in Urdu language
by: Muhammad Zaheer Akhtar, et al.
Published: (2025-06-01)

Selected Literary and Resourceful Websites of Urdu: A Survey
by: Muhammad Mohsin Khan, et al.
Published: (2021-12-01)

UAlpha40: A comprehensive dataset of Urdu alphabet for Pakistan sign language
by: Sameena Javaid, et al.
Published: (2025-04-01)

Deep Learning Based Cross Domain Sentiment Classification for Urdu Language
by: Amna Altaf, et al.
Published: (2022-01-01)

Consonantal Variation of Hindi-Urdu Loanwords in Standard English: A Phonological Analysis
by: Bairam Khan, et al.
Published: (2024-11-01)

Travelogues of Turkiye in Urdu: Analysis and Index
by: Arzu CIFTSUREN
Published: (2024-07-01)

Accuracy improvement in financial sanction screening: is natural language processing the solution?
by: Seihee Kim, et al.
Published: (2024-11-01)

An Intelligent Optimization System Using Neural Networks and Soft Computing for the FMM Etching Process
by: Wen-Chin Chen, et al.
Published: (2025-06-01)

An Important Milestone of Modern Urdu Naat: Abdul Aziz Khalid
by: Asif Ali Chatha
Published: (2020-06-01)

A Comprehensive Survey on Urdu Hate Speech Detection: Methods, Evaluation, and Challenges
by: Ijaz Hussain, et al.
Published: (2025-01-01)

Urdu Toxic Comment Classification With PURUTT Corpus Development
by: Hafiz Hassaan Saeed, et al.
Published: (2025-01-01)

UEF-HOCUrdu: Unified Embeddings Ensemble Framework for Hate and Offensive Text Classification in Urdu
by: Kifayat Ullah, et al.
Published: (2025-01-01)

Contextual video analytics and recommendations through natural language processing (NLP) and graph machine learning (GML)
by: Umm-e-Laila, et al.
Published: (2025-07-01)

UMEDNet: a multimodal approach for emotion detection in the Urdu language
by: Adil Majeed, et al.
Published: (2025-05-01)

Development and Evaluation of Learning Portfolio Query System Based on LangChain Framework
by: Nien-Lin Hsueh, et al.
Published: (2025-04-01)

Communication grouping method of diversified tasks based on maximum matching of weighted bipartite graph
by: WANG Shijie, YANG Ruopeng
Published: (2025-08-01)

State of Art for Semantic Analysis of Natural Language Processing
by: Dastan Hussen Maulud, et al.
Published: (2021-03-01)

Comparative Evaluation of Sequential Neural Network (GRU, LSTM, Transformer) Within Siamese Networks for Enhanced Job–Candidate Matching in Applied Recruitment Systems
by: Mateusz Łępicki, et al.
Published: (2025-05-01)

Empowering Individuals With Visual Impairment: A Digital Braille Solution for Learning the Urdu Language
by: Farzana Jabeen, et al.
Published: (2025-01-01)

A dataset of Roman Urdu text with spelling variations for sentence level sentiment analysis
by: Mudasar Ahmed Soomro, et al.
Published: (2024-12-01)

Creating non-fungible token (NFT)-backed emoji art from user conversations on blockchain
by: Maedeh Mosharraf, et al.
Published: (2025-03-01)

Prosodic Analysis: It's Symbolic Method, Recommendations and Principles for Urdu
by: Javed Iqbal Qazi
Published: (2022-06-01)

Conversational agents in language learning
by: Xiao Feiwen, et al.
Published: (2023-03-01)

Optimized Identification of Sentence-Level Multiclass Events on Urdu-Language-Text Using Machine Learning Techniques
by: Somia Ali, et al.
Published: (2025-01-01)

The Moral Philosophy of Four Elements of Modern Urdu Ghazal
by: Zartashia Sagheer, et al.
Published: (2020-12-01)

The Consciousness and Urdu Ghazal
by: Saleem Akhtar, et al.
Published: (2020-12-01)

ChatGPT Practices: Finance and Banking Domain
by: Md Hassan, et al.
Published: (2024-12-01)

BanglaHealth: A Bengali paraphrase dataset on health domain
by: Faisal Ibn Aziz, et al.
Published: (2025-08-01)

Developing a Multi-Layer Ontology Construction Framework for Arabic Language Processing: Focus on Figurative Language Potential
by: Zouheir Banou, et al.
Published: (2025-01-01)

Retrospective Comparative Analysis of Prostate Cancer In-Basket Messages: Responses From Closed-Domain Large Language Models Versus Clinical Teams
by: Yuexing Hao, MS, et al.
Published: (2025-03-01)

Developing a Hybrid Morphological Analyzer for Low-Resource Languages
by: Musica Supriya, et al.
Published: (2025-05-01)

Using Graph-Based Maximum Independent Sets with Large Language Models for Extractive Text Summarization
by: Cengiz Hark
Published: (2025-06-01)

Analysis and Modeling of Statistical Distribution Characteristics for Multi-Aspect SAR Images
by: Rui Zhu, et al.
Published: (2025-04-01)

Highly Parallel Regular Expression Matching Using a Real Processing-in-Memory System
by: Jeonghyeon Joo, et al.
Published: (2025-01-01)

On the goodness-of-fits of the generalized lambda distribution on high-frequency stock index returns
by: Peterson Owusu Junior, et al.
Published: (2022-12-01)

Handover Strategy for LEO Satellite Networks Using Bipartite Graph and Hysteresis Margin
by: Sahar Eydian, et al.
Published: (2025-01-01)

Arabic medical entity tagging using distant learning
by: Viviana Cotik, et al.
Published: (2017-04-01)