
SVM-based Web Content Mining with Leaf Classification Unit from DOM-tree

Product Description

Abstract: To analyze a news article dataset, we first extract the important information, such as the title, date, and body paragraphs, and at the same time remove unnecessary information such as images, captions, footers, advertisements, navigation, and recommended news. The difficulty is that news article formats change over time and vary by news source, and even by section within a source. A model must therefore generalize to article formats it has not seen. Our experiments confirm that a machine learning based model predicts new data better than a rule-based model. We also suggest that noise inside the body can potentially be removed, because we define the classification unit as a leaf node itself. Typical machine learning based models, by contrast, cannot remove this noise: they treat the classification unit as an intermediate node made up of a set of leaf nodes, so they cannot classify an individual leaf node.
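
The idea of using a leaf node as the classification unit can be illustrated with a minimal sketch. It is not the authors' implementation: the class `LeafUnits`, the `Leaf` record of tag path plus text, and the use of the JDK's XML parser on well-formed XHTML are all assumptions made for illustration (real news pages would need a tolerant HTML parser). The sketch only collects the leaf units; a trained classifier such as an SVM would then label each one.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: collects DOM leaf nodes as classification units.
class LeafUnits {

    /** One classification unit: a text leaf and the tag path leading to it. */
    static final class Leaf {
        final String path;
        final String text;

        Leaf(String path, String text) {
            this.path = path;
            this.text = text;
        }
    }

    /** Parses well-formed XHTML (an assumption) and returns non-blank text leaves in document order. */
    static List<Leaf> extract(String xhtml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xhtml)));
            List<Leaf> leaves = new ArrayList<>();
            collect(doc.getDocumentElement(), "", leaves);
            return leaves;
        } catch (Exception e) {
            throw new IllegalArgumentException("Input is not well-formed XHTML", e);
        }
    }

    // Depth-first walk: text nodes are leaves, elements extend the tag path.
    private static void collect(Node node, String path, List<Leaf> out) {
        if (node.getNodeType() == Node.TEXT_NODE) {
            String text = node.getNodeValue().trim();
            if (!text.isEmpty()) {
                out.add(new Leaf(path, text));
            }
            return;
        }
        if (node.getNodeType() != Node.ELEMENT_NODE) {
            return; // skip comments, processing instructions, etc.
        }
        String childPath = path + "/" + node.getNodeName();
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            collect(children.item(i), childPath, out);
        }
    }

    public static void main(String[] args) {
        // Made-up page: an advertisement sits between two body paragraphs.
        String page = "<html><body>"
                + "<h1>Title</h1>"
                + "<div><p>First paragraph.</p><span>Advertisement</span><p>Second paragraph.</p></div>"
                + "</body></html>";
        for (Leaf leaf : extract(page)) {
            System.out.println(leaf.path + " -> " + leaf.text);
        }
    }
}
```

In the made-up page, the advertisement and the two paragraphs share the same parent `div`. A model that labels the `div` as a whole has to keep or drop all three together, whereas labeling each leaf separately allows the advertisement to be dropped while both paragraphs are kept.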

Additional Information

Domain	Data Mining
Programming Language	Java

Copyright myprojectbazaar 2020
