
Tutorial 6

NRS Tutorial: Creating a parse rule for a search engine

This tutorial will lead you through creating a search template and populating it with search engines. The template will then let you metasearch all the engines.

Step 1: Create search template

In the admin interface, click the 'Templates' tab. On the add template form, select the 'search' type and enter a name, for example 'mysearch'. Click 'Add'.

Step 2: Edit search template

Click 'edit' next to the 'mysearch' template. We are now going to add a few search engines that are already in the system. Click 'search list' and select the checkboxes next to:

  • Google
  • Teoma
  • Yahoo!

Next, click 'Add' at the bottom of the page, and then click 'Back to Template', also at the bottom of the page.

You now have 3 searches on your template.

Step 3: Add a new search

We will now add a search that does not yet exist in the system. In the 'add new search' field, enter the URL: http://www.kartoo.com/en/kartoo.html

Click 'Save'. The URL now appears in orange, indicating that it has not been fully set up yet. NRS crawls the URL immediately to retrieve all form definitions found on the page. Click 'edit' next to the URL.
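To make the idea of "form definitions" concrete, here is a minimal sketch of how the forms on a page could be collected. It is not NRS's own crawler: the class FormCollector, the function find_forms and the sample page are all hypothetical, and only the Python standard library is used.

```python
from html.parser import HTMLParser


class FormCollector(HTMLParser):
    """Collects each <form> on a page with its action, method and field names."""

    def __init__(self):
        super().__init__()
        self.forms = []       # one dict per form found, in page order
        self._current = None  # the form currently being read, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {
                "action": attrs.get("action") or "",
                "method": (attrs.get("method") or "get").lower(),
                "inputs": [],
            }
            self.forms.append(self._current)
        elif tag in ("input", "select", "textarea") and self._current is not None:
            # Only named fields are submitted with the form.
            name = attrs.get("name")
            if name:
                self._current["inputs"].append(name)

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


def find_forms(html):
    """Return the list of form definitions found in an HTML string."""
    collector = FormCollector()
    collector.feed(html)
    collector.close()
    return collector.forms


# Hypothetical page with two forms; a real search page will differ.
SAMPLE_PAGE = """
<html><body>
<form action="/search" method="GET">
  <input type="text" name="q"><input type="submit" value="Go">
</form>
<form action="/subscribe" method="post"><input type="text" name="email"></form>
</body></html>
"""

forms = find_forms(SAMPLE_PAGE)
for number, form in enumerate(forms, start=1):
    print(number, form["method"], form["action"], form["inputs"])
```

A page with several forms yields several entries, which is why a choice between forms can be necessary when setting up a search.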

Step 4: Specifying search properties

  • For name enter: Kartoo Metasearch
  • For description enter: Innovative metasearch engine using Flash to graphically manage search results.
  • In the drop-down box labeled 'All Searches', pick the first item. The box can list many items, one for each form found on the URL, and you need to select the form you wish to use. To help you choose, click the 'view' link to find out more about the forms.

Click 'Save'.

NRS features algorithms that automatically extract search results from the search results page. Sometimes NRS cannot determine the search results, and a parse rule must be written. A parse rule is also needed if extra metadata is desired for each search result. For example, some search engines return metadata fields such as size, date, or category.

Step 5: Test the search

Click the 'test' link. A new window opens with your template and a list of search engines. Deselect all search engines except Kartoo. Enter a search term, for example java, and click 'search'. You now get a search result page: NRS successfully retrieved the search results automatically. On the form that says 'Top results from Kartoo Metasearch', click 'search'. This brings you to the original Kartoo search result page. We are now going to write a parse rule that also retrieves the list of search engines each search result came from.

Step 6: Specify a parse rule

Bring back the search property window and enter into the parse rule field:

<c label="." hide>
<title link>
<desc>
<domain link>
<sources prefix="(" suffix=")">

Parse rules operate by breaking down the search result page into a list of text lines and links.
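As an illustration of this breakdown, the sketch below flattens a small HTML fragment into an ordered list of text lines and links. It is only an approximation of the idea, not NRS's actual parser: the names ItemCollector and page_to_items, the tuple layout and the sample markup are all assumptions made for this example.

```python
from html.parser import HTMLParser


class ItemCollector(HTMLParser):
    """Flattens a page into ("text", text) and ("link", text, href) items."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._in_link = False
        self._href = ""
        self._link_text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_link = True
            self._href = dict(attrs).get("href") or ""
            self._link_text = []

    def handle_endtag(self, tag):
        if tag == "a" and self._in_link:
            text = " ".join("".join(self._link_text).split())
            self.items.append(("link", text, self._href))
            self._in_link = False

    def handle_data(self, data):
        if self._in_link:
            self._link_text.append(data)
        else:
            text = " ".join(data.split())  # collapse whitespace
            if text:                       # skip whitespace-only runs
                self.items.append(("text", text))


def page_to_items(html):
    """Return the page content as an ordered list of text and link items."""
    collector = ItemCollector()
    collector.feed(html)
    collector.close()
    return collector.items


# Hypothetical markup for a single search result.
SAMPLE_RESULT = """
<div>1.</div>
<a href="http://example.com/java">Java Platform</a>
<p>Overview of the Java platform.</p>
<a href="http://example.com">example.com</a>
<span>(EngineA, EngineB)</span>
"""

items = page_to_items(SAMPLE_RESULT)
for item in items:
    print(item)
```

Once the markup is reduced to a flat sequence like this, a rule only has to describe the order and kind of the items that make up one search result.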

The first line looks for the search result count, which is identified by the "." character. The hide attribute tells NRS not to display this item.

The second line says the next search result item is the title and is a link. "title" is a reserved word indicating to NRS that it is the title.

The third line says the next search result item is the description and is text.

The fourth line says the next search result item is the domain and is a link.

The fifth line says the next search result item is a list of sources and can be identified with a prefix of "(" and a suffix of ")".
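The five lines can be read as a sequence of fields that must match consecutive items on the page. The sketch below models that reading in Python. It is a simplified interpretation, not NRS's matching algorithm: the dictionary form of the rule, the functions match_field and apply_rule, and the sample items are hypothetical. It also assumes that a label matches when the text contains it, and that a prefix and suffix must enclose the text and are stripped from the value. Link targets are omitted for brevity.

```python
# The five rule lines written as data (an assumed representation).
RULE = [
    {"name": "c", "kind": "text", "label": ".", "hide": True},
    {"name": "title", "kind": "link"},
    {"name": "desc", "kind": "text"},
    {"name": "domain", "kind": "link"},
    {"name": "sources", "kind": "text", "prefix": "(", "suffix": ")"},
]

# Hypothetical page content, already broken into (kind, text) items.
ITEMS = [
    ("text", "Results for java"),
    ("text", "1."),
    ("link", "Java Platform"),
    ("text", "Overview of the Java platform."),
    ("link", "example.com"),
    ("text", "(EngineA, EngineB)"),
    ("text", "2."),
    ("link", "Java Tutorials"),
    ("text", "Lessons for beginners."),
    ("link", "example.org"),
    ("text", "(EngineB)"),
    ("link", "Next page"),
]


def match_field(field, item):
    """Return the extracted value if the item fits the field, else None."""
    kind, text = item
    if kind != field["kind"]:
        return None
    if "label" in field and field["label"] not in text:
        return None
    prefix = field.get("prefix", "")
    suffix = field.get("suffix", "")
    if not (text.startswith(prefix) and text.endswith(suffix)):
        return None
    return text[len(prefix):len(text) - len(suffix)]


def apply_rule(rule, items):
    """Scan the items and collect one record per complete match of the rule."""
    results = []
    position = 0
    while position + len(rule) <= len(items):
        record = {}
        for offset, field in enumerate(rule):
            value = match_field(field, items[position + offset])
            if value is None:
                record = None
                break
            if not field.get("hide"):
                record[field["name"]] = value
        if record is None:
            position += 1           # no match here, try the next item
        else:
            results.append(record)
            position += len(rule)   # continue after the matched result
    return results


results = apply_rule(RULE, ITEMS)
for record in results:
    print(record)
```

In this model the hidden count field still has to match, so it anchors each result, but it is left out of the returned record.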

You can test the parse rule by clicking 'Save' and then the 'test' link again. Notice that the search results now also include the metadata items domain and sources.

The trick in writing parse rules is to help NRS uniquely identify a search result. In this case, requiring a text line that contains a '.' followed immediately by a link already narrows down the list of candidates considerably. Using the label, prefix and suffix attributes is important for preventing false positives.
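The effect of a label on false positives can be shown with a small comparison. This is a sketch under assumptions rather than NRS behavior: the two-field rules, the function matched_titles and the sample items are hypothetical, and a label is assumed to match when the text contains it.

```python
# Hypothetical page items: a sponsored link precedes the real results.
ITEMS = [
    ("text", "Sponsored"),
    ("link", "Buy Java books"),
    ("text", "1."),
    ("link", "Java Platform"),
    ("text", "2."),
    ("link", "Java Tutorials"),
]

# A loose rule: any text line followed by any link.
LOOSE_RULE = [{"kind": "text"}, {"kind": "link"}]

# A stricter rule: the text line must contain the "." label.
STRICT_RULE = [{"kind": "text", "label": "."}, {"kind": "link"}]


def item_matches(field, item):
    """True if the item has the right kind and contains the label, if any."""
    kind, text = item
    return kind == field["kind"] and field.get("label", "") in text


def matched_titles(rule, items):
    """Return the link text of every place where the two-field rule matches."""
    titles = []
    position = 0
    while position + len(rule) <= len(items):
        window = items[position:position + len(rule)]
        if all(item_matches(f, i) for f, i in zip(rule, window)):
            titles.append(window[1][1])  # the second field is the link
            position += len(rule)
        else:
            position += 1
    return titles


loose = matched_titles(LOOSE_RULE, ITEMS)
strict = matched_titles(STRICT_RULE, ITEMS)
print("loose: ", loose)
print("strict:", strict)
```

The loose rule also picks up the sponsored link, while the rule with the label skips it because 'Sponsored' contains no '.'.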

For more info on parse rules, consult the help system in NRS and have a look at the parse rules in the demo app.

Copyright © 2008 LoopIP LLC. All rights reserved | Terms | Privacy