Web News Data Extraction Technology Based on Text Keywords

In order to shorten the time for users to query news on the Internet, this paper studies and designs a network news data extraction technology, which can obtain the main news information through the extraction of news text keywords. Firstly, the TF-IDF keyword extraction algorithm, TextRank keyword...

Full description

Saved in:
Bibliographic Details
Main Author: Kun Zhang
Format: Article
Language:English
Published: Wiley 2021-01-01
Series:Complexity
Online Access:http://dx.doi.org/10.1155/2021/5529447
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:In order to shorten the time for users to query news on the Internet, this paper studies and designs a network news data extraction technology, which can obtain the main news information through the extraction of news text keywords. Firstly, the TF-IDF keyword extraction algorithm, TextRank keyword extraction algorithm, and LDA keyword extraction algorithm are analyzed to understand the keyword extraction process, and the TF-IDF algorithm is optimized by Zipf’s law. By introducing the idea of model fusion, five schemes based on waterfall fusion and parallel combination fusion are designed, and the effects of the five schemes are verified by experiments. It is found that the designed extraction technology has a good effect on network news data extraction. News keyword extraction has a great application prospect, which can provide the basis for the research fields of news key phrases, news abstracts, and so on.
ISSN:1076-2787
1099-0526