Design and Implementation of an Intelligent Web Service Agent Based on Seq2Seq and Website Crawler

This paper proposes using a web crawler to organize website content as a dialogue tree in some domains. We build an intelligent customer service agent based on this dialogue tree for general usage. The encoder-decoder architecture Seq2Seq is used to understand natural language and then modified as a...

Bibliographic Details
Main Authors: Mei-Hua Hsih, Jian-Xin Yang, Chen-Chiung Hsieh
Format: Article
Language: English
Published: MDPI AG 2024-12-01
Series: Information
Subjects:
Online Access: https://www.mdpi.com/2078-2489/15/12/818
collection DOAJ
description This paper proposes using a web crawler to organize a website's content into a dialogue tree for a given domain, and builds a general-purpose intelligent customer service agent on top of that tree. The encoder-decoder architecture Seq2Seq is used to understand natural language and is modified to use a bi-directional LSTM to improve accuracy in cases of polysemy. An attention mechanism is added to the decoder to mitigate the drop in accuracy as sentences grow longer. Four experiments were conducted. The first is an ablation experiment showing that Seq2Seq + bi-directional LSTM + attention outperforms LSTM, Seq2Seq, and Seq2Seq + attention in natural language processing: tested on an open-source Chinese corpus, the four models reached accuracies of 82.1%, 63.4%, 69.2%, and 76.1%, respectively. The second experiment asks questions from the target domain. Five thousand records from the Taiwan Water Supply Company were used as training data, and 1000 water-related questions that did not appear in the training data were used for testing. RasaNLU and the proposed model reached accuracies of 86.4% and 87.1%, respectively. The third experiment asks questions from non-target domains and compares the answers from RasaNLU with those of the proposed neural network model. Five thousand questions were extracted as training data from chat databases drawn from eight public sources, including Weibo, Tieba, Douban, and other well-known social networking sites in mainland China, as well as PTT in Taiwan. A further 1000 questions from the same corpus, distinct from the training data, were extracted for testing. The proposed model reached an accuracy of 83.2%, far better than RasaNLU, which indicates that it is more accurate in the general domain. The last experiment compares the proposed model with voice assistants such as Xiao Ai, Google Assistant, Siri, and Samsung Bixby. Although the proposed model cannot answer vague questions accurately, it is more accurate in the application domains it was trained on.
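The description does not say how the crawler turns pages into a dialogue tree. The sketch below is one plausible reading, not the authors' method: it assumes the crawl result is already available as an in-memory link graph and walks it breadth-first, so each page becomes a node and its outgoing links become the follow-up options an agent could offer. The names `DialogueNode`, `build_dialogue_tree`, `options`, and the `SITE` map are all hypothetical and introduced only for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class DialogueNode:
    """One crawled page, treated as one turn in the dialogue (hypothetical type)."""
    url: str
    title: str
    children: list["DialogueNode"] = field(default_factory=list)


def build_dialogue_tree(site: dict[str, tuple[str, list[str]]], root_url: str) -> DialogueNode:
    """Breadth-first walk over a link graph, keeping the first path found to each page.

    `site` maps url -> (page title, outgoing links). It stands in for the
    output of a real crawler and is an assumed structure, not one from the paper.
    """
    root = DialogueNode(root_url, site[root_url][0])
    seen = {root_url}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for link in site[node.url][1]:
            if link in seen or link not in site:
                continue  # skip pages already placed in the tree and off-site links
            seen.add(link)
            child = DialogueNode(link, site[link][0])
            node.children.append(child)
            queue.append(child)
    return root


def options(node: DialogueNode) -> list[str]:
    """Titles the agent could offer as follow-up choices at this node."""
    return [child.title for child in node.children]


# Made-up site map, for illustration only.
SITE = {
    "/": ("Home", ["/billing", "/service"]),
    "/billing": ("Billing", ["/billing/pay", "/"]),
    "/billing/pay": ("Pay a bill", []),
    "/service": ("Service requests", ["/billing/pay", "/external"]),
}

tree = build_dialogue_tree(SITE, "/")
```

Breadth-first order is chosen here so that a page reachable by several routes is attached under its shallowest parent, which keeps the resulting dialogue short; a real system might instead rank parents by relevance.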
id doaj-art-2e6e44f0dd0c415fbf6879405f9f0d84
institution DOAJ
issn 2078-2489
doi 10.3390/info15120818
volume 15
issue 12
article_number 818
affiliation Mei-Hua Hsih: Department of Product Design, School of Arts and Design, Sanming University, Sanming 365004, China
Jian-Xin Yang: QSAN Technology, Inc., Taipei City 114, Taiwan
Chen-Chiung Hsieh: Department of Computer Science and Engineering, Tatung University, Taipei City 104, Taiwan
topic natural language processing
intelligent customer service agent
website crawler
deep learning
LSTM
Seq2Seq