Information extraction from massive Web pages based on node property and text content
To address the problem of extracting valuable information from massive Web pages in big data environments, a novel information extraction method based on node property and text content was proposed. Web pages were converted into a document object model (DOM) tree, and a pruning...
| Main Authors: | Hai-yan WANG, Pan CAO |
|---|---|
| Format: | Article |
| Language: | zho |
| Published: | Editorial Department of Journal on Communications, 2016-10-01 |
| Series: | Tongxin xuebao |
| Online Access: | http://www.joconline.com.cn/zh/article/doi/10.11959/j.issn.1000-436x.2016190/ |
Similar Items
- MapReduce based big data framework using associative Kruskal poly Kernel classifier for diabetic disease prediction
  by: R. Ramani, et al.
  Published: (2025-06-01)
- An efficient parallel DCNN algorithm in big data environment
  by: Yimin Mao, et al.
  Published: (2025-05-01)
- A Parallel ETL Tool Based on an Improved Chain-MapReduce Framework
  by: Bin Wu, et al.
  Published: (2013-12-01)
- MP-SPILDL: A Massively Parallel Inductive Logic Learner in Description Logic
  by: Eyad Algahtani
  Published: (2024-01-01)
- CC-MRSJ: Cache Conscious Star Join Algorithm on Hadoop Platform
  by: Guoliang Zhou, et al.
  Published: (2013-10-01)