
NRS Spider Overview

NRS includes a spider that follows the links found on web pages and retrieves the linked pages. The spider is controlled through collections, each of which defines a set of spidering parameters. A collection can be created for a directory category of listings, for a website, for an external plugin, or for user-contributed listings. A collection specifies settings such as:

  • follow robots.txt rules
  • stay on site: allow subdomains, only this domain, any domain
  • stay on path
  • follow cgi: specify the number of CGI (query string) variables allowed in a URL
  • max directory depth: specify maximum depth of directory allowed
  • max hops: specify maximum number of hops the spider can make from the start page
  • max pages per website
  • politeness: how long to wait before retrieving another page from the same website
  • history: how many versions of the page to keep. NRS lets you browse the page history and view the differences between the versions.
  • excludes: specify URL exclude patterns
  • includes: specify URL include patterns
  • refresh rate: specify how long to wait before revisiting the webpage
  • user agent: how the spider identifies itself to the websites it visits

The spider can be set to run on a schedule and to reindex automatically after each run.
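
To make the settings above more concrete, here is a minimal Python sketch of how a crawler might decide whether a discovered URL falls inside a collection. It is not NRS code: the `Collection` class, its field names, and the `allowed` function are hypothetical names chosen for illustration, and the way NRS actually measures depth or matches patterns may differ.

```python
import re
from dataclasses import dataclass, field
from urllib.parse import parse_qsl, urlsplit


@dataclass
class Collection:
    """Hypothetical container for a few of the settings listed above (not the NRS schema)."""
    start_url: str
    stay_on_site: str = "domain"   # "domain", "subdomains", or "any"
    stay_on_path: bool = False
    max_cgi_vars: int = 0          # query-string variables allowed in a URL
    max_depth: int = 5             # directory levels below the site root
    max_hops: int = 3              # links followed from the start page
    includes: list = field(default_factory=list)  # regex patterns
    excludes: list = field(default_factory=list)  # regex patterns


def allowed(c: Collection, url: str, hops: int) -> bool:
    """Return True if `url`, reached after `hops` links, is inside the collection."""
    start, target = urlsplit(c.start_url), urlsplit(url)
    if hops > c.max_hops:
        return False

    start_host = (start.hostname or "").lower()
    host = (target.hostname or "").lower()
    if c.stay_on_site == "domain" and host != start_host:
        return False
    if c.stay_on_site == "subdomains" and not (
        host == start_host or host.endswith("." + start_host)
    ):
        return False

    path = target.path or "/"
    if c.stay_on_path:
        # Directory of the start page, e.g. "/docs/" for "/docs/index.html".
        base = start.path.rsplit("/", 1)[0] + "/"
        if not path.startswith(base):
            return False

    if len(parse_qsl(target.query, keep_blank_values=True)) > c.max_cgi_vars:
        return False

    # "/a/b/page.html" has two directory levels, "/" has none.
    if max(0, path.count("/") - 1) > c.max_depth:
        return False

    if any(re.search(p, url) for p in c.excludes):
        return False
    if c.includes and not any(re.search(p, url) for p in c.includes):
        return False
    return True
```

A collection created with `stay_on_site="subdomains"` and `excludes=[r"\.zip$"]`, for example, would accept pages on `www.example.com` when the start page is on `example.com`, but reject any URL ending in `.zip`. Settings such as robots.txt handling, politeness delays, and refresh rates concern when and how pages are fetched rather than which URLs qualify, so they are left out of this sketch.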

The spider supports HTML, PDF, and image formats. Other formats are supported through an external converter script, which you can configure to produce an HTML version of the document for indexing and metadata purposes.
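
As an illustration of what such a converter might do, the sketch below turns a plain-text document into a small HTML page in Python. The interface NRS expects from a converter script is not described here, so the function name `text_to_html` and its arguments are assumptions made for this example only.

```python
import html


def text_to_html(text: str, title: str) -> str:
    """Wrap plain text in minimal HTML so an indexer can read a title and body.

    Hypothetical helper: the real NRS converter interface may differ.
    """
    # Treat blank lines as paragraph breaks and escape HTML special characters.
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    body = "\n".join(f"<p>{html.escape(p)}</p>" for p in paragraphs)
    return (
        "<html><head>"
        f"<title>{html.escape(title)}</title>"
        "</head><body>\n"
        f"{body}\n"
        "</body></html>"
    )
```

A script built around this function would read the source document from wherever the spider stores it and write the returned HTML to the location the spider reads from; both locations depend on how the converter is configured.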

Metasearch and RSS

The NRS spider is also used for the following template features of NRS:

  • metasearch: Retrieve search results from other search engines. The results are extracted from the returned page either automatically or with the help of a rule.
  • news: Retrieve RSS feeds which can be cached and displayed together on a page.
  • xml feeds: Aggregate paid search result feeds, or other sources of information. The XML can be transformed using XSLT to unify the schema of various XML sources.

The NRS templating engine lets you aggregate XML, HTML, and RSS feeds, giving you the tools you need to build a portal page tailored to a specific kind of researcher. The hours saved on research activities can provide a quick return on investment.
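
The feed aggregation described above can be sketched in Python using only the standard library. This is an illustration of the general idea, not the NRS templating engine: the function names `parse_items` and `aggregate` are hypothetical, and the code assumes RSS 2.0 feeds whose items carry an RFC 822 `pubDate` with an explicit time zone such as GMT.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime


def parse_items(rss_xml: str, source: str) -> list:
    """Extract title, link, and publication date from each <item> in an RSS 2.0 feed."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.iter("item"):
        pub_date = item.findtext("pubDate")
        if pub_date is None:
            continue  # skip items that cannot be ordered by date
        items.append({
            "source": source,
            "title": item.findtext("title", default="").strip(),
            "link": item.findtext("link", default="").strip(),
            "published": parsedate_to_datetime(pub_date),
        })
    return items


def aggregate(feeds: dict) -> list:
    """Merge items from several feeds (name -> XML text), newest first."""
    merged = []
    for source, xml_text in feeds.items():
        merged.extend(parse_items(xml_text, source))
    return sorted(merged, key=lambda i: i["published"], reverse=True)
```

The merged list can then be handed to whatever template renders the portal page. Fetching and caching the feeds is deliberately left out, since those steps depend on the spider's own politeness and refresh settings.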


 

Copyright © 2008 LoopIP LLC. All rights reserved | Terms | Privacy