Data Collection and Preprocessing in Web Usage Mining: Implementation and Analysis

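Data collection and data preprocessing are crucial stages in web usage mining, mainly because of the unstructured, diverse, and noisy nature of log data. During data collection, log file datasets are loaded and merged. Effective and comprehensive data preprocessing plays a vital role in ensuring the efficiency and scalability of algorithms used in the pattern discovery phase of web usage mining. This work aims to address these phases by introducing two innovative approaches. The first approach focuses on determining the device used for accessing the web, distinguishing between computers and mobile devices. The second approach aims to determine user sessions and complete paths by utilizing the referrer URL. The entire preprocessing pipeline has been implemented using the C# programming language, and the source code is available on GitHub at the following link: https://github.com/Mohammed91/Web-Usage-Mining.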

Bibliographic Details
Main Authors: Mohammed Ali Mohammed, Rula A. Hamid, Reem Razzaq AbdulHussein
Format: Article
Language: Arabic
Published: University of Information Technology and Communications 2024-11-01
Series: Iraqi Journal for Computers and Informatics
Subjects: web usage mining; access log file; data collection; data preprocessing
Online Access: https://ijci.uoitc.edu.iq/index.php/ijci/article/view/486
author Mohammed Ali Mohammed
Rula A. Hamid
Reem Razzaq AbdulHussein
collection DOAJ
description Data collection and data preprocessing are crucial stages in web usage mining, mainly because of the unstructured, diverse, and noisy nature of log data. During data collection, log file datasets are loaded and merged. Effective and comprehensive data preprocessing plays a vital role in ensuring the efficiency and scalability of algorithms used in the pattern discovery phase of web usage mining. This work aims to address these phases by introducing two innovative approaches. The first approach focuses on determining the device used for accessing the web, distinguishing between computers and mobile devices. The second approach aims to determine user sessions and complete paths by utilizing the referrer URL. The entire preprocessing pipeline has been implemented using the C# programming language, and the source code is available on GitHub at the following link: https://github.com/Mohammed91/Web-Usage-Mining.
format Article
id doaj-art-7479ba29dac6481cab19ccb02a34b5e0
institution Kabale University
issn 2313-190X
2520-4912
language Arabic
publishDate 2024-11-01
publisher University of Information Technology and Communications
record_format Article
series Iraqi Journal for Computers and Informatics
spelling doaj-art-7479ba29dac6481cab19ccb02a34b5e0
2024-11-16T19:50:45Z
ara
University of Information Technology and Communications
Iraqi Journal for Computers and Informatics
2313-190X
2520-4912
2024-11-01
502547410.25195/ijci.v50i2.486449
Data Collection and Preprocessing in Web Usage Mining: Implementation and Analysis
Mohammed Ali Mohammed [0]
Rula A. Hamid [1]
Reem Razzaq AbdulHussein [2]
[0] University of Information Technology and Communications
[1] University of Information Technology and Communications
[2] University of Information Technology and Communications
Data collection and data preprocessing are crucial stages in web usage mining, mainly because of the unstructured, diverse, and noisy nature of log data. During data collection, log file datasets are loaded and merged. Effective and comprehensive data preprocessing plays a vital role in ensuring the efficiency and scalability of algorithms used in the pattern discovery phase of web usage mining. This work aims to address these phases by introducing two innovative approaches. The first approach focuses on determining the device used for accessing the web, distinguishing between computers and mobile devices. The second approach aims to determine user sessions and complete paths by utilizing the referrer URL. The entire preprocessing pipeline has been implemented using the C# programming language, and the source code is available on GitHub at the following link: https://github.com/Mohammed91/Web-Usage-Mining.
https://ijci.uoitc.edu.iq/index.php/ijci/article/view/486
web usage mining
access log file
data collection
data preprocessing
title Data Collection and Preprocessing in Web Usage Mining: Implementation and Analysis
topic web usage mining
access log file
data collection
data preprocessing
url https://ijci.uoitc.edu.iq/index.php/ijci/article/view/486