
NRS Technology Overview

Feature Set    Description
Full-text Search Engine
Indexes the web
  • spider and index up to 10 million pages per seat
  • federated search lets you search multiple seats to create multi-billion document indexes
  • supports html, pdf, and text document types
  • supports other document types using external conversion script
  • supports https and secure sites
  • supports image search: gif, jpg, png, bmp, tga, pcx
  • supports cookies, robot compliant
  • specify crawl instructions for directory categories
  • crawl websites and directory categories
  • crawl user submitted sites and links
  • create metadata rules or assign metadata to directory listings and categories
  • export new links from document data to a CSV directory import file
User Accounts
Submit url listings
  • submit url listings
  • specify url categories, premium keywords, title, description
  • free or paid accounts
  • toolbar with popup blocker
  • page and search alerts
  • bookmarking tool with search and tag clouds
  • custom spidering rules for url listings
  • spider and index urls immediately, by schedule, or after review
  • rich backend user administration
Directory Engine
  • imports over 500,000 categories and 5 million pages from ODP
  • add any number of metadata fields to categories and listings
  • complete search and browse functionality
  • users can purchase listings and keywords
  • customize content and presentation
  • use your own directory using ODBC import, file import, or RDF import
  • export directory to ODBC, or file
Metasearch Engine
Searches web and intranet search engines seamlessly
  • crawls the web to discover millions of search forms
  • aggregates specialty web and intranet database searches
  • supports metadata extraction
  • let users create their own metasearch sets
  • paid search result feed aggregation
Wiki Engine
Import from wikimedia or start your own
  • supports millions of pages
  • indexes pages after editing for immediate searchability
  • supports templates for complex rendering
  • control page editing access
  • incorporate Wikipedia extracts onto search result pages
Tag Engine
Bookmarks and tags
  • import user bookmarks
  • use NRS toolbar for bookmarking button
  • share or keep private
  • tag clouds and tag lists
Intelligent Agent
Monitors searches and pages
  • notifies you when new search results appear in saved searches
  • deep query functionality lets you specify complex expressions
  • create reports that aggregate alerts, and make the reports public so other users are also notified
Mail Engine
Organizes and shares your results
  • save search results
  • organize alerts in folders
  • e-mail content to your buddy lists
Plugin Engine
Use plugins for extra functionality
  • develop your own plugin
  • tie in to users and crawling
  • full-text search
HTML Administration
Gives complete customization and application building
  • purchase more documents, users, and support as you go
  • build complete research applications
  • administer users, templates, server settings
  • view web logs
  • XSL editor
  • manage server seats: crawling/indexing/databases
Web Services & Database SDK
Provides API functionality to develop apps
  • .NET, Java, C++, Perl database interfaces
  • integrate search and tracking into your app using XML services

NRS is further composed of the following architectural components:

Base Services    Description
WEBSERVER Because NRS incorporates its own webserver, it does not rely on an external webserver, which simplifies installation and integration. NRS serves pages as XML or HTML, so if you prefer to use your own webserver, NRS can act as its content provider in XML format. For security, NRS supports HTTPS mode and self-issued SSL certificates.
XSLT ENGINE All pages are generated using an XSLT engine that transforms XML into HTML using an XSL stylesheet. NRS lets you build your own XSL templates for total control over presentation. Through an administration interface, content can be aggregated as one XML stream. NRS uses the Open Source Sablotron XSLT engine, or, if detected, can also use Microsoft's MSXML engine on the Windows platform.
CRAWLER An industry-conforming crawler and caching engine provides excellent performance for metasearches, content aggregation, and background crawling to refresh search indexes. Supports http, https, and cookies.
INDEXER A scalable full-text indexer can index 3 million pages in 24 hours. NRS builds indexes that store a popularity ranking score and category information for every word occurrence.
WIKI A wiki rendering engine allows for the development of complex Wiki templates and full-text search.
MAIL NRS features SMTP, POP3, mail send, and HTML client modules for a comprehensive mail solution. Mailboxes let you organize agent notification e-mails into folders, and receive trade newsletters.
DATABASE Object-relational database technology provides a performance edge by optimizing all metadata relationships between objects. The application database engine is powered by Gigabase, a database that scales to terabytes.

Feature Overview

Feature Description
DIRECTORY    
  - Schedule Schedule directory updates at specific times
  - Import Import from ODP, ODBC, or from file
  - Export Export to ODBC or file
  - Fields Add extra fields to listings
  - Shadow Updates Import new directory in background and swap out with old one
CRAWLING    
  - Schedule Schedule crawler to run at specific times
  - Protocols Supports http, https, and cookies
  - Documents Supports html, text, pdf, images, and other filetypes through an external converter script
  - Languages Supports all languages
  - Stay on site Keep crawler on site (includes subdomains)
  - Stay on path Keep crawler on same URL path
  - Follow cgi Control access to dynamic URLs
  - Index forms Index search engine forms found on pages
  - Index links Allow "link:url" query which returns urls pointing to page
  - Robot check Optionally check robots.txt file
  - Depth Maximum directory depth to crawl
  - Distance Maximum hops crawler can make from start URL
  - Max pages Maximum pages to crawl for given site
  - Politeness How long to wait between page retrievals to same domain
  - History Number of historical versions to keep of a page
  - Excludes Prevent crawler access to directories, pages, or domains
  - Includes Allow crawler access to a directory, page, or domain
  - Refresh rate How many days between recrawls of a document
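
As an illustration of how several of the crawl settings above could combine into a single "may this URL be fetched?" decision, here is a minimal sketch. It is not NRS code: the `CrawlRules` class, the `allowed` function, and all field names are hypothetical, and treating "subdomains" as subdomains of the start URL's host is an assumption.

```python
from dataclasses import dataclass, field
from urllib.parse import urlsplit


@dataclass
class CrawlRules:
    """Hypothetical container for a few of the crawl settings listed above."""
    start_url: str
    stay_on_site: bool = True      # "Stay on site" (subdomains allowed)
    stay_on_path: bool = False     # "Stay on path"
    max_depth: int = 5             # "Depth": maximum directory depth
    max_distance: int = 3          # "Distance": maximum hops from start URL
    excludes: list = field(default_factory=list)  # URL prefixes to block


def directory_depth(path: str) -> int:
    """Count directory levels in a URL path, e.g. /a/b/page.html has 2."""
    parts = [p for p in path.split("/") if p]
    # The last segment is a page unless the path ends with "/".
    if parts and not path.endswith("/"):
        parts = parts[:-1]
    return len(parts)


def allowed(url: str, hops: int, rules: CrawlRules) -> bool:
    """Return True if the URL passes every configured scope rule."""
    start = urlsplit(rules.start_url)
    target = urlsplit(url)
    if hops > rules.max_distance:
        return False
    if any(url.startswith(prefix) for prefix in rules.excludes):
        return False
    if rules.stay_on_site:
        host, base = target.hostname or "", start.hostname or ""
        if host != base and not host.endswith("." + base):
            return False
    if rules.stay_on_path:
        # Directory part of the start URL, e.g. /docs/ for /docs/index.html.
        base_dir = start.path.rsplit("/", 1)[0] + "/"
        if not target.path.startswith(base_dir):
            return False
    return directory_depth(target.path) <= rules.max_depth
```

A crawler loop would call `allowed(url, hops, rules)` for each discovered link before queueing it. Politeness delays, page limits, and refresh rates are scheduling concerns and are left out of this sketch.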
INDEXING    
  - Popularity Activate link analysis to determine URL popularity ranking
  - Incremental Incrementally index newly spidered pages
  - Metadata Create rules to extract metadata from full-text document, directory fields, metatags, and other sources. Use regexp or templatized expressions.
  - Rank boost Boost rank of a given collection
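
The page does not say which link analysis algorithm NRS uses for popularity ranking. As a general illustration of the idea, the sketch below uses the well-known PageRank iteration as a stand-in; the function name, the damping factor, and the iteration count are assumptions, not NRS settings.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Compute a popularity score per URL from a link graph.

    links maps each URL to the list of URLs it points to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages without out-links is spread evenly over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for source, targets in links.items():
            if targets:
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank
```

Pages that receive links from many well-linked pages end up with higher scores, which is the kind of signal a popularity ranking can feed into result ordering.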
SEARCHING    
  - Operators AND, OR, ANDNOT, -, +, ""
  - Duplicates Filter out duplicates
  - Collections Search across one or more collections
  - Metadata Filter search results by selecting metadata values, and sort results by metadata value
  - Remote Search a remote index on another server
  - Highlighting Customize highlighting
  - Cached View cached page with highlighting
  - Historical View differences between historical versions of the document
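
To show how the AND, OR, and ANDNOT operators relate to a full-text index, here is a minimal sketch over an in-memory inverted index. It is illustrative only: `build_index` and `search` are hypothetical names, queries are evaluated strictly left to right, and the `+`, `-`, and phrase (`""`) operators are not covered.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercase word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, query):
    """Evaluate 'term OP term OP term ...' strictly left to right."""
    tokens = query.split()
    result = set(index.get(tokens[0].lower(), set()))
    for op, term in zip(tokens[1::2], tokens[2::2]):
        postings = index.get(term.lower(), set())
        if op == "AND":
            result &= postings      # keep documents containing both
        elif op == "OR":
            result |= postings      # keep documents containing either
        elif op == "ANDNOT":
            result -= postings      # drop documents containing the term
        else:
            raise ValueError(f"unsupported operator: {op}")
    return result
```

Each operator maps onto a set operation on posting lists, which is why boolean queries stay cheap even on large indexes. The sketch expects a non-empty query.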
METASEARCH    
  - Discovery Enter url of search page, NRS will retrieve page and import search box definition
  - Fields Provide a name and description for each search
  - Streaming Control streaming to browser. Send search results to user browser as they come in or once they are all in.
  - Robot Follow robots.txt before using search resource
  - Autoparse Extract search results automatically without a parse rule
  - Parse rule Provide a parse rule to extract specific metadata fields in the search results
  - Parameter override Provide new parameters to the search engine.
  - Input fiddling Modify query before retrieving search results
  - Cache Control length of time to cache search results from a source
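
The Discovery step above (importing a search box definition from a page) can be pictured with a small sketch using Python's standard `html.parser`. This is not how NRS is implemented; `SearchFormParser` and `discover_forms` are hypothetical names, and fetching the page over HTTP is left out so the example stays self-contained.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class SearchFormParser(HTMLParser):
    """Collect the action, method, and field names of each form on a page."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.forms = []
        self._current = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {
                # Resolve relative actions against the page URL.
                "action": urljoin(self.page_url, attrs.get("action") or ""),
                "method": (attrs.get("method") or "get").lower(),
                "fields": {},
            }
        elif tag in ("input", "select", "textarea") and self._current is not None:
            name = attrs.get("name")
            if name:  # unnamed controls are never submitted
                self._current["fields"][name] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form" and self._current is not None:
            self.forms.append(self._current)
            self._current = None


def discover_forms(page_url, html):
    """Return one definition dict per form found in the HTML."""
    parser = SearchFormParser(page_url)
    parser.feed(html)
    parser.close()
    return parser.forms
```

The resulting definition (target URL, method, and default field values) is the information a metasearch source needs before a user query can be substituted into the query field and submitted.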
NEWS
  - Parse rule Provide parse rule to extract news titles, descriptions, and other fields.
  - Display options Specify when to render news: home page, query page, ...
LINKS
  - Related links Provide a list of related links to display on search/news page
XML
  - Aggregation Aggregate XML streams and transform each to standardize output
ODBC
  - SQL Query Incorporate SQL query results as XML into the template
INDEX
  - Search results Incorporate other search result xml streams from main index
SEARCHING
  - Highlighting Customize highlighting
HOSTING
  - UI access Manage collections and templates from HTML UI
  - Templates Create templates that provide functionality and HTML customization
  - Applications Organize templates into grouped categories and subcategories in a tabbed HTML interface
  - Host Run server under a hostname
  - Virtual host Provide subdomain redirection for multiple solution hosting off one server
SDK  
  - .NET Easy to use .NET client to database
  - Data access Access database using Perl/C++/C/Java or command-line tools
  - XML Get XML feeds.

See Also:

Search cluster architecture
Application architecture



 
Copyright © 2008 LoopIP LLC. All rights reserved | Terms | Privacy