Data Collection and Preprocessing in Web Usage Mining: Implementation and Analysis
Data collection and data preprocessing are crucial stages in web usage mining, mainly because of the unstructured, diverse, and noisy nature of log data. During data collection, log file datasets are loaded and merged. Effective and comprehensive data preprocessing plays a vital role in ensuring the...
| Main Authors: | Mohammed Ali Mohammed, Rula A. Hamid, Reem Razzaq AbdulHussein |
|---|---|
| Format: | Article |
| Language: | Arabic |
| Published: | University of Information Technology and Communications, 2024-11-01 |
| Series: | Iraqi Journal for Computers and Informatics |
| Subjects: | web usage mining; access log file; data collection; data preprocessing |
| Online Access: | https://ijci.uoitc.edu.iq/index.php/ijci/article/view/486 |
| _version_ | 1846165912733876224 |
|---|---|
| author | Mohammed Ali Mohammed; Rula A. Hamid; Reem Razzaq AbdulHussein |
| author_facet | Mohammed Ali Mohammed; Rula A. Hamid; Reem Razzaq AbdulHussein |
| author_sort | Mohammed Ali Mohammed |
| collection | DOAJ |
| description | Data collection and data preprocessing are crucial stages in web usage mining, mainly because of the unstructured, diverse, and noisy nature of log data. During data collection, log file datasets are loaded and merged. Effective and comprehensive data preprocessing plays a vital role in ensuring the efficiency and scalability of algorithms used in the pattern discovery phase of web usage mining. This work aims to address these phases by introducing two innovative approaches. The first approach focuses on determining the device used for accessing the web, distinguishing between computers and mobile devices. The second approach aims to determine user sessions and complete paths by utilizing the referrer URL. The entire preprocessing pipeline has been implemented using the C# programming language, and the source code is available on GitHub at the following link: https://github.com/Mohammed91/Web-Usage-Mining. |
| format | Article |
| id | doaj-art-7479ba29dac6481cab19ccb02a34b5e0 |
| institution | Kabale University |
| issn | 2313-190X; 2520-4912 |
| language | Arabic |
| publishDate | 2024-11-01 |
| publisher | University of Information Technology and Communications |
| record_format | Article |
| series | Iraqi Journal for Computers and Informatics |
| spelling | doaj-art-7479ba29dac6481cab19ccb02a34b5e0; 2024-11-16T19:50:45Z; ara; University of Information Technology and Communications; Iraqi Journal for Computers and Informatics; 2313-190X; 2520-4912; 2024-11-01; 50; 2; 54; 74; 10.25195/ijci.v50i2.486; 449; Data Collection and Preprocessing in Web Usage Mining: Implementation and Analysis; Mohammed Ali Mohammed (University of Information Technology and Communications); Rula A. Hamid (University of Information Technology and Communications); Reem Razzaq AbdulHussein (University of Information Technology and Communications); https://ijci.uoitc.edu.iq/index.php/ijci/article/view/486; web usage mining; access log file; data collection; data preprocessing |
| spellingShingle | Mohammed Ali Mohammed; Rula A. Hamid; Reem Razzaq AbdulHussein; Data Collection and Preprocessing in Web Usage Mining: Implementation and Analysis; Iraqi Journal for Computers and Informatics; web usage mining; access log file; data collection; data preprocessing |
| title | Data Collection and Preprocessing in Web Usage Mining: Implementation and Analysis |
| title_full | Data Collection and Preprocessing in Web Usage Mining: Implementation and Analysis |
| title_fullStr | Data Collection and Preprocessing in Web Usage Mining: Implementation and Analysis |
| title_full_unstemmed | Data Collection and Preprocessing in Web Usage Mining: Implementation and Analysis |
| title_short | Data Collection and Preprocessing in Web Usage Mining: Implementation and Analysis |
| title_sort | data collection and preprocessing in web usage mining implementation and analysis |
| topic | web usage mining; access log file; data collection; data preprocessing |
| url | https://ijci.uoitc.edu.iq/index.php/ijci/article/view/486 |
| work_keys_str_mv | AT mohammedalimohammed datacollectionandpreprocessinginwebusageminingimplementationandanalysis AT rulaahamid datacollectionandpreprocessinginwebusageminingimplementationandanalysis AT reemrazzaqabdulhussein datacollectionandpreprocessinginwebusageminingimplementationandanalysis |
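
The description in this record names two preprocessing steps: separating computers from mobile devices, and building user sessions from the referrer URL. The sketch below illustrates one plausible reading of both steps in Python. It is not the authors' C# implementation, which is in the GitHub repository linked above. The `Hit` record, the `MOBILE_MARKERS` list, the 30-minute timeout, and the rule that a hit joins a session only when its referrer is a page already in that session are all assumptions made for illustration. The sketch also assumes that `url` and `referrer` have already been normalized to the same form (for example, both as paths).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical marker list; real User-Agent strings vary widely.
MOBILE_MARKERS = ("mobile", "android", "iphone", "ipad", "windows phone")


def classify_device(user_agent: str) -> str:
    """Return "mobile" if the User-Agent contains a mobile marker, else "computer"."""
    ua = user_agent.lower()
    return "mobile" if any(marker in ua for marker in MOBILE_MARKERS) else "computer"


@dataclass
class Hit:
    """One cleaned log entry (hypothetical field names)."""
    ip: str
    time: datetime
    url: str
    referrer: str  # "-" when the log records no referrer
    user_agent: str


def build_sessions(hits, timeout=timedelta(minutes=30)):
    """Group hits into sessions per (ip, user agent) pair.

    A hit continues an existing session when it arrives within `timeout`
    of that session's last hit and its referrer is a URL already in the
    session. Otherwise it starts a new session.
    """
    sessions = []        # every session, in creation order
    by_user = {}         # (ip, user_agent) -> sessions belonging to that user
    for hit in sorted(hits, key=lambda h: h.time):
        key = (hit.ip, hit.user_agent)
        target = None
        # Prefer the most recently created session that matches.
        for session in reversed(by_user.get(key, [])):
            recent = hit.time - session[-1].time <= timeout
            linked = any(h.url == hit.referrer for h in session)
            if recent and linked:
                target = session
                break
        if target is None:
            target = []
            sessions.append(target)
            by_user.setdefault(key, []).append(target)
        target.append(hit)
    return sessions
```

Under these assumptions, a request with no referrer (`"-"`) always opens a new session, so two browsing trails from the same address and browser stay separate even when they interleave in time. The ordered `url` values of a session give the path followed within it.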