Robot FAQ for Net Research Server
Why is Net Research Server grabbing webpages from my website?
Net Research Server (NRS) crawls pages across the Internet in order to build a full-text search engine.
I do not want my website to be crawled. What should I do?
You can stop the NRS robot by adding an entry to your robots.txt file. Add the following lines to the robots.txt file in the root directory of your web server:
User-Agent: NetResearchServer
Disallow: /
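To check the effect of these two lines before deploying them, the rule can be run through a standard robots.txt parser. The following is a minimal sketch using Python's urllib.robotparser; the helper name is_allowed, the example.com URL, and the second user agent "SomeOtherBot" are illustrative assumptions, and NRS's own parser may differ in details.

```python
from urllib.robotparser import RobotFileParser

# The exact entry from this FAQ.
ROBOTS_TXT = """\
User-Agent: NetResearchServer
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if user_agent may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# example.com and SomeOtherBot are placeholders, not part of NRS.
nrs_allowed = is_allowed(ROBOTS_TXT, "NetResearchServer", "http://example.com/index.html")
other_allowed = is_allowed(ROBOTS_TXT, "SomeOtherBot", "http://example.com/index.html")
print(nrs_allowed, other_allowed)
```

Under this parser the entry blocks only the NetResearchServer user agent; robots with other names are unaffected unless they have their own entry.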
NRS also honors instructions given in HTML pages through a meta tag whose name is "robots" and whose values include "nocrawl", "noindex", "nofollow", "noarchive", or "nosnippet". In addition, NRS does not follow any anchor link that carries the attribute rel="nofollow".
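As an illustration of how a crawler might read these page-level instructions, here is a minimal sketch built on Python's html.parser. It assumes the directive values are carried in the meta tag's content attribute as a comma-separated list, which is the common convention but is not stated in this FAQ. The class name RobotsDirectiveParser and the sample page are hypothetical, and the sketch only collects directives and links; it does not claim to reproduce how NRS acts on them.

```python
from html.parser import HTMLParser


class RobotsDirectiveParser(HTMLParser):
    """Collect robots meta directives and sort links by rel="nofollow"."""

    def __init__(self):
        super().__init__()
        self.directives = set()      # e.g. {"noindex", "noarchive"}
        self.followable_links = []   # hrefs without rel="nofollow"
        self.skipped_links = []      # hrefs marked rel="nofollow"

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            # Assumption: values are comma-separated in the content attribute.
            content = attr.get("content") or ""
            self.directives.update(
                value.strip().lower() for value in content.split(",") if value.strip()
            )
        elif tag == "a" and attr.get("href"):
            rel_tokens = (attr.get("rel") or "").lower().split()
            if "nofollow" in rel_tokens:
                self.skipped_links.append(attr["href"])
            else:
                self.followable_links.append(attr["href"])


# Hypothetical sample page.
SAMPLE_PAGE = """
<html><head><meta name="robots" content="noindex, noarchive"></head>
<body>
<a href="/a.html">A</a>
<a href="/b.html" rel="nofollow">B</a>
</body></html>
"""

page = RobotsDirectiveParser()
page.feed(SAMPLE_PAGE)
print(sorted(page.directives), page.followable_links, page.skipped_links)
```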
Why does NRS try to access some nonexistent URLs on my website?
NRS crawls the web by using the links found in the Open Directory Project (ODP). ODP does check that only valid pages are listed in its directory, but a few invalid links can occasionally slip by.
How frequently does NRS access web pages from a server?
NRS typically refreshes its index about once a month. When NRS visits a website, it retrieves pages no faster than one page every 30 seconds.
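The 30-second limit can be pictured as a per-host delay gate. The sketch below is one simple way to enforce such a limit, not a description of NRS internals; the class name PolitenessGate, its methods, and the idea of keying the delay by host name are assumptions made for illustration. Timestamps are passed in explicitly so the logic can be checked without waiting.

```python
MIN_DELAY_SECONDS = 30.0  # "no faster than one page every 30 seconds"


class PolitenessGate:
    """Track the last fetch time per host and report the remaining wait."""

    def __init__(self, min_delay: float = MIN_DELAY_SECONDS):
        self.min_delay = min_delay
        self._last_fetch = {}  # host -> timestamp of the most recent fetch

    def seconds_until_allowed(self, host: str, now: float) -> float:
        """Return how long the crawler must still wait before fetching from host."""
        last = self._last_fetch.get(host)
        if last is None:
            return 0.0  # never fetched from this host, no wait needed
        return max(0.0, last + self.min_delay - now)

    def record_fetch(self, host: str, now: float) -> None:
        """Remember that a page was fetched from host at time now."""
        self._last_fetch[host] = now


gate = PolitenessGate()
gate.record_fetch("example.com", now=100.0)
wait_after_10s = gate.seconds_until_allowed("example.com", now=110.0)
wait_after_30s = gate.seconds_until_allowed("example.com", now=130.0)
print(wait_after_10s, wait_after_30s)
```

In a real crawler the now argument would typically come from a monotonic clock such as time.monotonic(), and the caller would sleep for the returned number of seconds before issuing the next request to that host.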