
NRS Indexer Overview

NRS contains a full-text indexing engine. All words found on spidered pages are indexed and ranked. Ranking is determined by a proprietary algorithm based on the document title, the document URL, the document text, and the popularity of the document as measured by the links that point to it.

The indexer can reindex documents in the background and swap the new index for the old one on the fly, providing 24/7 operation. The indexer can also operate incrementally, adding only the most recently spidered documents to the index.
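The swap described above can be illustrated with a minimal Python sketch. It is not the NRS implementation: the names `SwappableIndex`, `build_index`, and `add_documents` are hypothetical, and the index is assumed to be a plain in-memory dictionary mapping each word to a list of document ids.

```python
import threading


def build_index(docs):
    """Build a word -> [doc_id, ...] index from a {doc_id: text} mapping."""
    index = {}
    for doc_id, text in docs.items():
        for word in set(text.lower().split()):
            index.setdefault(word, []).append(doc_id)
    return index


def add_documents(index, new_docs):
    """Incremental update: return a copy of `index` extended with `new_docs`."""
    merged = {word: list(ids) for word, ids in index.items()}
    for word, ids in build_index(new_docs).items():
        merged.setdefault(word, []).extend(ids)
    return merged


class SwappableIndex:
    """Holds the live index and lets a rebuilt one replace it in one step."""

    def __init__(self, index):
        self._index = index
        self._lock = threading.Lock()

    def search(self, word):
        # Take a reference to whichever index is live, then search it
        # outside the lock so a swap never blocks on a running query.
        with self._lock:
            index = self._index
        return index.get(word.lower(), [])

    def swap(self, new_index):
        # Searches already in progress keep using the old index object.
        with self._lock:
            self._index = new_index


# Hypothetical usage: rebuild in a background thread, then swap.
old = build_index({1: "red apple", 2: "green apple"})
live = SwappableIndex(old)
worker = threading.Thread(
    target=lambda: live.swap(add_documents(old, {3: "apple pie"}))
)
worker.start()
worker.join()
```

Because the new index is built as a separate object and only the reference is exchanged, searches never see a half-built index.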

The indexer creates two indexes for each word: one with all results for the word, and another with just the top 1000 results. When searching for a single term, the second index provides faster performance.
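As a rough illustration of the two-index idea, the Python sketch below keeps a full result list and a truncated top-N list for every word. The function names, the `(doc_id, score)` tuples, and the way multi-term queries are scored (summing per-term scores) are assumptions made for the example; the actual NRS ranking is proprietary.

```python
import heapq

TOP_N = 1000  # cutoff stated in the text above


def build_dual_index(postings, top_n=TOP_N):
    """postings maps word -> list of (doc_id, score) pairs.

    Returns (full, top): `full` keeps every hit per word, `top` keeps
    only the `top_n` highest-scoring hits, already sorted by score.
    """
    full = {}
    top = {}
    for word, hits in postings.items():
        full[word] = list(hits)
        top[word] = heapq.nlargest(top_n, hits, key=lambda hit: hit[1])
    return full, top


def search(full, top, terms):
    """Single-term queries read the short index; others use the full one."""
    if not terms:
        return []
    if len(terms) == 1:
        return top.get(terms[0], [])
    # Multi-term query: keep documents that contain every term and
    # (as an assumption for this sketch) add up their per-term scores.
    doc_scores = None
    for term in terms:
        scores = dict(full.get(term, []))
        if doc_scores is None:
            doc_scores = scores
        else:
            doc_scores = {
                doc: score + scores[doc]
                for doc, score in doc_scores.items()
                if doc in scores
            }
    return sorted(doc_scores.items(), key=lambda hit: hit[1], reverse=True)
```

The single-term path never touches the full list, which is why it can be faster: the short list is already ranked and bounded in size.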

Federated search queries multiple indexes spread across a cluster of servers and disks, with the same performance characteristics as a search against a single index. Performance can in fact be higher, because retrieval of the search hits is distributed across multiple disks. With solid-state storage such as Compact Flash cards and SSDs, performance on the order of 100 searches per second is possible.
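The general shape of a federated search can be sketched in Python as follows. This is only an illustration under simple assumptions: each shard is modelled as an in-memory dictionary rather than a remote server, and `search_shard` and `federated_search` are hypothetical names, not NRS functions.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def search_shard(shard, word, limit):
    """Return the `limit` best (doc_id, score) hits for `word` in one shard."""
    hits = shard.get(word, [])
    return heapq.nlargest(limit, hits, key=lambda hit: hit[1])


def federated_search(shards, word, limit=10):
    """Query every shard in parallel and merge the partial results."""
    with ThreadPoolExecutor(max_workers=max(1, len(shards))) as pool:
        partials = pool.map(
            lambda shard: search_shard(shard, word, limit), shards
        )
        all_hits = [hit for partial in partials for hit in partial]
    # Each shard returned at most `limit` hits, so the merge stays small
    # no matter how many pages the shards hold in total.
    return heapq.nlargest(limit, all_hits, key=lambda hit: hit[1])
```

Since every shard retrieves its own hits at the same time, the slowest shard rather than the sum of all shards determines the response time.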

Collections

Collections provide spidering and indexing instructions for a directory, a website, or user-contributed listings. You can specify the following options:

  • Index forms: Index any forms, such as search boxes, as part of the document
  • Index links: Index the links of the document to activate the "link:url" search feature
  • Index all metadata: Index the metadata tags and make them searchable with the "meta-name:term" feature
  • Skip 404: Don't index links that return "page not found" (404), even if directory listing metadata is available for them
  • Index images: Spider and index images. Create thumbnails. Skip images of certain sizes and characteristics
  • Rank boost: Boost the rank of a collection
  • Popularity: Apply document popularity algorithm to the collection
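To make the options above more concrete, here is a hedged Python sketch of how a few of them might influence which terms are indexed for a document. The `CollectionOptions` class, its field names and defaults, the document dictionary layout, and the `index_terms` function are all hypothetical; only the option names and the "link:url" and "meta-name:term" search forms come from the list above.

```python
from dataclasses import dataclass


@dataclass
class CollectionOptions:
    # One field per option in the list above; defaults are assumptions.
    index_forms: bool = False
    index_links: bool = False
    index_all_metadata: bool = False
    skip_404: bool = True
    index_images: bool = False
    rank_boost: float = 1.0
    popularity: bool = True


def index_terms(doc, options):
    """Return the list of terms to index for one spidered document."""
    if options.skip_404 and doc.get("status") == 404:
        return []
    terms = doc.get("text", "").lower().split()
    if options.index_links:
        # Enables searches of the form "link:url".
        terms += ["link:" + url for url in doc.get("links", [])]
    if options.index_all_metadata:
        # Enables searches of the form "meta-name:term".
        for name, value in doc.get("meta", {}).items():
            terms += [f"{name.lower()}:{word}" for word in value.lower().split()]
    return terms
```

For example, with `index_links=True` a page that links to `http://example.com/` would gain the term `link:http://example.com/` in addition to its ordinary words.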

Metadata

Metadata lets you specify rules to extract name-value pairs from documents, titles, URLs, wiki pages, directory listings, meta tags, and other fields. Metadata can be used to create parametric searches, shopping comparison search engines, job board searches, classified searches, local searches, and more. The rules can contain regular expressions, templatized searches, and other powerful content extraction methods.
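A regular-expression rule of the kind described above might look like the following Python sketch. The rule format (name, source field, pattern), the example patterns, and the `extract_metadata` function are assumptions made for illustration and do not reflect the actual NRS rule syntax.

```python
import re

# Each hypothetical rule: (metadata name, document field, pattern with
# one capture group that yields the value).
RULES = [
    ("price", "text", re.compile(r"\$(\d+(?:\.\d{2})?)")),
    ("city", "title", re.compile(r"\bin ([A-Z][a-z]+)$")),
    ("category", "url", re.compile(r"/jobs/([a-z-]+)/")),
]


def extract_metadata(document, rules=RULES):
    """Apply every rule to its field and collect the name-value pairs."""
    pairs = {}
    for name, field, pattern in rules:
        match = pattern.search(document.get(field, ""))
        if match:
            pairs[name] = match.group(1)
    return pairs


# Hypothetical job listing used to exercise the rules.
listing = {
    "title": "Senior Editor in Boston",
    "url": "http://example.com/jobs/publishing/123",
    "text": "Salary from $55000 per year",
}
metadata = extract_metadata(listing)
```

The extracted pairs could then back a parametric search, for instance a job board query restricted to listings whose `city` value is Boston.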

   
Indexer Stats

  • 10 million pages per index
  • federated search against billions of pages
  • international character support
  • word stemming
  • document types: html, text, images, wiki, pdf
  • external document converter API
 

Copyright © 2008 LoopIP LLC. All rights reserved.