Creating Indexes

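Web Search creates two types of indexes: crawled indexes, which are built from the content of Web sites, and file system indexes, which are built from folders on a file server.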

There are two forms you can use to create each type of index: the standard form and the advanced form.

For example, Define Crawled Index is the standard form for creating a crawled index, while Define Crawled Index (Advanced) offers more options, including options that override default virtual search server settings. Both forms are described in the following sections.


Searching across Multiple Indexes

Web Search can search across multiple indexes within a single virtual search server. However, searching a single index is generally faster than searching across multiple indexes.

HINT:  While you can search across multiple indexes within a single virtual search server, you cannot search across multiple virtual search servers.


Restricting Search Results to Specific Areas

You can restrict search results to specific areas of your file or Web server in the following ways:

HINT:  Using the last option requires that indexed documents contain summary fields such as META tags. This option works for almost any file format that contains document summary fields, including HTML, XML, PDF, Word*, and WordPerfect*.

For information about preventing Web Search from indexing specific content, see Excluding Documents from Being Indexed.


Defining a New Crawled Index


Using the Define Crawled Index Page

  1. From the Web Search Manager Global Settings page, click Manage in the row of the virtual search server that you want to work with.

  2. Under Define a New Index, click New Crawled Index > Define Index.

  3. In the Index Name field, enter a name for your index.

    HINT:  A name can be a word, phrase, or a numeric value. If the virtual search server you are working on contains, or will contain, a large number of indexes, you might want to utilize a numbering scheme to help you manage multiple indexes more effectively. But keep in mind that the name you enter here appears on the default search page. So you might want to choose a name that can be understood by users of your search services.

  4. Under Web Sites to Crawl, type the URL of the Web site that you want indexed.

    You can enter just the URL, such as www.mycompany.com, or you can also append a complete path, down to the file level, such as www.mycompany.com/path/index.html.

  5. If desired, add another URL.

  6. To add additional URLs, click Add More URLs.

  7. Click Apply Settings.
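
The two forms of entry described in Step 4 (a bare site address or a complete path down to the file level) can be told apart with a few lines of Python. This is only an illustrative sketch, not part of Web Search: the function name describe_crawl_entry and the rule that a final path segment containing a dot names a file are assumptions made for the example.

```python
from urllib.parse import urlsplit
import posixpath


def describe_crawl_entry(entry: str) -> dict:
    """Split a Web Sites to Crawl entry into host and path (illustrative only)."""
    # Entries such as www.mycompany.com carry no scheme; assume http:// so
    # that urlsplit treats the first component as the host.
    url = entry if "://" in entry else "http://" + entry
    parts = urlsplit(url)
    path = parts.path or "/"
    # Assumption: a final path segment containing a dot names a single file.
    last_segment = posixpath.basename(path)
    return {
        "host": parts.netloc,
        "path": path,
        "points_to_file": "." in last_segment,
    }
```

For the entries used in Step 4, describe_crawl_entry("www.mycompany.com") reports the path / and no file, while describe_crawl_entry("www.mycompany.com/path/index.html") reports the path /path/index.html and flags it as a file.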


Using the Define Crawled Index (Advanced) Page

The Define Crawled Index (Advanced) page offers some additional options beyond those available in the standard Define Crawled Index page. Changes made using this page will override default virtual search server settings.

  1. From the Web Search Manager Global Settings page, click Manage in the row of the virtual search server that you want to work with.

  2. Under Define a New Index, click New Crawled Index > Define Index.

  3. On the Define Crawled Index page, click Advanced Index Definition.

  4. In the Index Name field, enter a name for your new index.

    HINT:  A name can be a word, phrase, or a numeric value. If the virtual search server you are working on contains, or will contain, a large number of indexes, you might want to utilize a numbering scheme to help you manage multiple indexes more effectively. But keep in mind that the name you enter here appears on the default search page. So you might want to choose a name that can be understood by users of your search services.

  5. In the Index Description field, enter an optional description of the index to be created.

  6. Under Web Sites to Crawl, enter the URL of the Web site to be indexed.

    HINT:  If you enter a filename at the end of the URL, then just that file will be indexed.

  7. In the Subdirectories to Exclude text box, type the directories that you want Web Search not to index.

    For example, /marketing or /sales/doc.

  8. To direct Web Search to include or exclude specific file types, click Extensions to Include or Extensions to Exclude and then enter the extensions, separating each one with a single space, such as HTM PDF TXT.

  9. To add additional URLs, click Define More Web Sites.

  10. To delete a URL, click Remove Web Site.

  11. In the Additional URLs text box, enter any other URLs that you want indexed.

    For example, www.mycompany.com/marketing.

    This allows you to specify additional pockets of information found on other Web sites without including all of the content of those sites in your searches.

    HINT:  When Web Search encounters links found in the pages of Additional URLs that point to pages specified in Web Sites to Crawl, Web Search follows those links. All other links that go outside of Web Sites to Crawl are not followed.

  12. Under Additional Settings, in the Location of Index Files field, enter the absolute path where you want the index files stored.

    For example, volume:\searchroot\sites\mysites.

    By default, index files are stored at volume:\searchroot\sites\default\indexes\.

    HINT:  Changes made to Additional Settings override Default Settings.

  13. From the Encoding (If Not in META Tags) drop-down list, select the encoding to use when indexing files that do not contain an encoding specification.

  14. In the Maximum File Size to Index field, enter the maximum file size (in bytes) that Web Search should index.

    Files exceeding this size are not indexed and therefore are not included in search results.

  15. In the Maximum Time to Download a URL field, enter the maximum time (in seconds) that Web Search should spend downloading a URL before it automatically skips that URL.

  16. To direct Web Search to treat filenames and directory names as case sensitive, click Yes next to URLs that are Case Sensitive.

  17. To direct Web Search to crawl dynamic content (URLs containing the question mark [?]), click Yes next to Crawl Dynamic URLs.

    HINT:  For more information about indexing dynamic content, see About Indexing Dynamic Web Content.

Once you define an index, you must generate it to make it searchable. See Generating Indexes.
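
The filtering options in the advanced form (Subdirectories to Exclude, Extensions to Include or Exclude, Maximum File Size to Index, URLs that are Case Sensitive, and Crawl Dynamic URLs) can be pictured as a single yes-or-no decision for each URL. The following Python sketch illustrates that decision under stated assumptions. It is not the Web Search implementation: the CrawlRules class, the should_index function, and the exact matching rules (whole path segments for subdirectories, lowercased comparison when URLs are not case sensitive, no extension check for URLs without an extension) are hypothetical.

```python
from dataclasses import dataclass, field
from urllib.parse import urlsplit
import posixpath


@dataclass
class CrawlRules:
    """Hypothetical stand-in for a few fields of the advanced form."""
    excluded_subdirs: list = field(default_factory=list)  # e.g. ["/marketing", "/sales/doc"]
    extension_mode: str = "include"   # "include" or "exclude"
    extensions: str = ""              # space-separated, e.g. "HTM PDF TXT"
    max_file_size: int = 0            # bytes; 0 means "no limit" in this sketch
    crawl_dynamic_urls: bool = False
    case_sensitive: bool = False


def should_index(url: str, size_bytes: int, rules: CrawlRules) -> bool:
    """Return True if this sketch's rules would allow the URL to be indexed."""
    # Crawl Dynamic URLs: a URL containing "?" counts as dynamic content.
    if "?" in url and not rules.crawl_dynamic_urls:
        return False

    path = urlsplit(url).path or "/"
    # Assumption: when URLs are not case sensitive, compare paths lowercased.
    fold = (lambda s: s) if rules.case_sensitive else str.lower

    # Subdirectories to Exclude: match whole path segments only.
    for subdir in rules.excluded_subdirs:
        prefix = fold(subdir.rstrip("/"))
        candidate = fold(path)
        if candidate == prefix or candidate.startswith(prefix + "/"):
            return False

    # Extensions to Include / Exclude: space-separated, compared without case.
    listed = rules.extensions.lower().split()
    ext = posixpath.splitext(path)[1].lstrip(".").lower()
    if listed and ext:
        if rules.extension_mode == "include" and ext not in listed:
            return False
        if rules.extension_mode == "exclude" and ext in listed:
            return False

    # Maximum File Size to Index: larger files are skipped.
    if rules.max_file_size and size_bytes > rules.max_file_size:
        return False
    return True
```

With excluded_subdirs set to ["/marketing", "/sales/doc"] and extensions set to "HTM PDF TXT" in include mode, a page such as /index.htm passes, while /marketing/plan.pdf and /logo.gif are rejected.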


Defining a New File System Index


Using the Define File System Index Page

  1. From the Web Search Manager Global Settings page, click Manage in the row of the virtual search server that you want to work with.

  2. Under Define a New Index, click New File System Index > Define Index.

  3. In the Index Name field, enter a name for your index.

    HINT:  A name can be a word, phrase, or a numeric value. If the virtual search server you are working on contains, or will contain, a large number of indexes, you might want to utilize a numbering scheme to help you manage multiple indexes more effectively. But keep in mind that the name you enter here appears on the default search page. So you might want to choose a name that can be understood by users of your search services.

  4. In the Server Path to be Indexed field, enter the absolute path to the folder containing the information that you want indexed.

    For example, SYS:\SALES\REPORTS.

  5. In the Corresponding URL Prefix field, enter the URL that should be used by the search results page to access the individual files.

    For example, /SALES.

    HINT:  For information about defining a URL prefix in the NetWare Enterprise Web Server, see Setting Additional Document Directories.

  6. To add additional paths, click Add More Paths.

  7. Click Apply Settings.

Once you define an index, you must generate it to make it searchable. See Generating Indexes.
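
The relationship between the Server Path to be Indexed and the Corresponding URL Prefix can be illustrated with a short Python sketch: the part of a file's path below the indexed folder is appended to the prefix to form the link shown in search results. This is an assumption about how the two fields combine, written for illustration only. The function name file_to_url, the case-insensitive path comparison, and the sample file name are hypothetical.

```python
from urllib.parse import quote


def file_to_url(file_path: str, server_path: str, url_prefix: str) -> str:
    """Map a file under the indexed folder to a result URL (illustrative only)."""
    # Server paths use backslashes; normalize to forward slashes for the URL.
    norm_file = file_path.replace("\\", "/")
    norm_root = server_path.replace("\\", "/").rstrip("/")
    # Assumption: volume and directory names compare without regard to case.
    if not (norm_file.lower() + "/").startswith(norm_root.lower() + "/"):
        raise ValueError("file is not under the indexed server path")
    relative = norm_file[len(norm_root):].lstrip("/")
    # quote() percent-encodes characters such as spaces and leaves "/" intact.
    return url_prefix.rstrip("/") + "/" + quote(relative)
```

Using the values from Steps 4 and 5, a hypothetical file SYS:\SALES\REPORTS\2001\q1.html with the prefix /SALES would be linked as /SALES/2001/q1.html.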


Using the Define File System Index (Advanced) Page

  1. From the Web Search Manager Global Settings page, click Manage in the row of the virtual search server that you want to work with.

  2. Under Define a New Index, click New File System Index > Define Index.

  3. On the Define File System Index page, click Advanced Index Definition.

  4. In the Index Name field, enter a name for your new index.

    HINT:  A name can be a word, phrase, or a numeric value. If the virtual search server you are working on contains, or will contain, a large number of indexes, you might want to utilize a numbering scheme to help you manage multiple indexes more effectively. But keep in mind that the name you enter here appears on the default search page. So you might want to choose a name that can be understood by users of your search services.

  5. In the Index Description field, enter an optional description of the index to be created.

  6. In the Location of Index Files field, enter the absolute path to where you want the index files stored.

    For example, SYS:\NSearch\sites\mysites.

    By default, index files are stored at volume:\searchroot\sites\site_name\indexes\.

  7. From the Encoding (If Not in META Tags) drop-down list, select the encoding to be used when indexing files that do not contain an encoding specification.

    For example, HTML files can specify their encoding with a Content-Type META tag.

  8. In the Maximum File Size to Index field, enter the maximum file size (in bytes) that Web Search should index.

    Files exceeding this size are not indexed and therefore are not included in search results.

  9. Under Path Information, in the Server Path field, type the absolute path to the folder containing the information that you want indexed. For example, SYS:\SALES\REPORTS.

  10. In the Corresponding URL Prefix field, enter the URL that should be used by the search results page to access the individual files.

    For example, /SALES.

    HINT:  For information about defining a URL prefix in the NetWare Enterprise Web Server, see Setting Additional Document Directories.

  11. To exclude specific subdirectories from being indexed, enter their relative paths in the Subdirectories to Exclude field.

  12. To direct Web Search to include or exclude specific file types, click Extensions to Include or Extensions to Exclude and then type the extensions, separating each one with a single space, such as HTM PDF TXT.

  13. To add additional paths, click Define More Paths.

  14. To delete a path, click Remove Path.

  15. Click Apply Settings.

Once you define an index, you must generate it to make it searchable. See Generating Indexes.
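
The Encoding (If Not in META Tags) setting acts as a fallback: a file that declares its own encoding keeps it, and only files without a declaration use the selected value. The Python sketch below illustrates this fallback for HTML files. It is a simplified assumption, not the Web Search implementation: the function name choose_encoding, the regular expression, and the decision to scan only the first 4096 bytes are hypothetical.

```python
import re

# Matches a charset declaration inside a META tag, for example:
# <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
_META_CHARSET = re.compile(
    rb"<meta[^>]+charset\s*=\s*[\"']?([A-Za-z0-9._:-]+)",
    re.IGNORECASE,
)


def choose_encoding(raw_html: bytes, default_encoding: str) -> str:
    """Return the encoding declared in a META tag, else the default."""
    # Assumption: a declaration, if present, appears near the top of the file.
    match = _META_CHARSET.search(raw_html[:4096])
    if match:
        return match.group(1).decode("ascii")
    return default_encoding
```

A file containing the Content-Type META tag shown in the comment yields ISO-8859-1. A file with no such tag yields whatever default is passed in, which plays the role of the drop-down selection.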


Generating Indexes

After you define an index, you must generate it before it can be used for searching. During generation, Web Search Server examines the file server or Web server content, gathers keywords, titles, and descriptions, and adds them to the index.

  1. From the Web Search Manager Global Settings page, click Manage in the row of the virtual search server that you want to work with.

  2. Click Generate in the Action column of the index that you want to work with.

    The Active Jobs screen indicates the status of the current indexing jobs. When no indexing job is running, the status page reads "No indexing jobs are currently running or defined."

  3. To cancel a current indexing job, click Cancel in the Status column.

You can direct Web Search to automatically update your indexes on specific dates and at specific times by scheduling events. For more information, see Automating Index and Server Maintenance.


