Product Description
SmartCrawler: A Two-Stage Crawler for Efficiently Harvesting Deep-Web Interfaces
Abstract: As the deep web grows at a very fast pace, there has been increased interest in techniques that help locate deep-web interfaces efficiently. However, due to the large volume of web resources and the dynamic nature of the deep web, achieving wide coverage and high efficiency is a challenging issue. We propose a two-stage framework, namely SmartCrawler, for efficiently harvesting deep-web interfaces. In the first stage, SmartCrawler performs site-based searching for center pages with the help of search engines, avoiding visiting a large number of pages. To achieve more accurate results for a focused crawl, SmartCrawler ranks websites to prioritize highly relevant ones for a given topic. In the second stage, SmartCrawler achieves fast in-site searching by excavating the most relevant links with adaptive link ranking. To eliminate bias toward visiting some highly relevant links in hidden web directories, we design a link tree data structure to achieve wider coverage for a website. Our experimental results on a set of representative domains show the agility and accuracy of our proposed crawler framework, which efficiently retrieves deep-web interfaces from large-scale sites and achieves higher harvest rates than other crawlers.
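The abstract does not describe how the link tree is built, so the following is only an illustrative sketch of the general idea: group a site's links by directory and pick from each directory in turn, so that one heavily linked directory does not crowd out the rest. The function names `build_link_tree` and `select_balanced`, the one-level grouping by first path segment, and the round-robin selection are all assumptions made for this example, not the method used by SmartCrawler.

```python
from collections import defaultdict, deque
from urllib.parse import urlparse


def build_link_tree(urls):
    """Group URLs by their first path segment (a one-level 'link tree').

    Hypothetical helper: a real link tree could be deeper than one level.
    """
    tree = defaultdict(list)
    for url in urls:
        segments = [s for s in urlparse(url).path.split("/") if s]
        # Links that sit directly under the site root are grouped under "".
        key = segments[0] if len(segments) > 1 else ""
        tree[key].append(url)
    return tree


def select_balanced(tree, limit):
    """Pick up to `limit` links, taking one link per directory in turn."""
    # Directories are visited in sorted order so the result is deterministic.
    queues = deque(deque(links) for _, links in sorted(tree.items()))
    picked = []
    while queues and len(picked) < limit:
        current = queues.popleft()
        picked.append(current.popleft())
        if current:
            # Directory still has links left: put it at the back of the line.
            queues.append(current)
    return picked


if __name__ == "__main__":
    links = [
        "http://example.com/books/a",
        "http://example.com/books/b",
        "http://example.com/books/c",
        "http://example.com/music/x",
        "http://example.com/about",
    ]
    for link in select_balanced(build_link_tree(links), 3):
        print(link)
```

With a budget of three links, this sketch returns one link from the root, one from `books`, and one from `music`, rather than spending the whole budget on the `books` directory. A relevance score per link, as the adaptive link ranking in the abstract suggests, could be used to order the links inside each directory before selection.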