Content Search Guide
CHAPTER 7
This chapter explains how to configure your exteNd Director Dynamic Reasoning Engine (DRE) to perform specialized search tasks.
The following topics are covered:
This section describes how to enable number searches using the DRE configuration file.
To enable searching for numbers:
In a text editor, open the DRE configuration file DirectorDRE.cfg, located at autonomy\engine in your exteNd Director installation directory.
Set the parameter INDEXNUMBERS=1.
NOTE: If this parameter does not appear, add it to the file.
Reset the DRE, as described in Resetting the DRE.
Reindex the data, as described in Programming practices.
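The edit in the first two steps above can also be scripted. The following is a minimal Python sketch, not part of exteNd Director. The helper name set_param, the [Server] section name used in the sample, and the case-insensitive match on parameter names are all assumptions made for illustration; check your own DirectorDRE.cfg for the section where INDEXNUMBERS belongs.

```python
def set_param(cfg_text, key, value, section=None):
    """Return cfg_text with key=value set, adding the line if the key is absent.

    Hypothetical helper: if the key is missing it is inserted directly after
    the named section header, or appended at the end when no section is given
    (or the section is not found).
    """
    lines = cfg_text.splitlines()
    target = key.lower()

    # Replace the first existing "key=..." line, if there is one.
    for i, line in enumerate(lines):
        name, sep, _ = line.partition("=")
        if sep and name.strip().lower() == target:
            lines[i] = f"{key}={value}"
            return "\n".join(lines) + "\n"

    # Key not present: decide where to add it.
    insert_at = len(lines)
    if section is not None:
        header = f"[{section}]".lower()
        for i, line in enumerate(lines):
            if line.strip().lower() == header:
                insert_at = i + 1
                break
    lines.insert(insert_at, f"{key}={value}")
    return "\n".join(lines) + "\n"


# Sample text only; "[Server]" and "Port" are placeholders, not real DRE settings.
sample = "[Server]\nPort=1\n"
updated = set_param(sample, "INDEXNUMBERS", "1", section="Server")
print(updated)
```

The script only changes the file content. You still need to reset the DRE and reindex the data afterward, as the remaining steps describe.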
This section describes how to configure the DRE for searching in other languages, including those that use a multibyte character set (MBCS). By default, the DRE is configured to process English-language data.
Configure your search environment to import multibyte character set (MBCS) and other binary formats, as described in Importing MBCS and other binary formats below.
Set language-specific configuration parameters, as described in Modifying language-specific configuration parameters.
(Optional) Copy sentence-breaking files into the directory where the DRE resides, as described in Providing sentence-breaking files (optional).
Reset the DRE, as described in Resetting the DRE.
Reindex the data, as described in Forcing indexing.
This section describes how to configure your search environment to import multibyte character set (MBCS) and other binary formats into the DRE for indexing.
You enable MBCS support by configuring Autonomy Omnislave, a plug-in module that converts data from binary file formats so it can be indexed in the DRE.
The Omnislave configuration file is called omnislave.cfg and resides in autonomy\OmniSlaves in your exteNd Director installation directory. The omnislave.cfg file contains two types of sections:
In the following example, formats are defined for Word, RTF, and Microsoft PowerPoint files:
[Configuration]
OmniConvertExtns0=*.doc
OmniConvertLibraryCsvs0=wpconvdll.dll,wordconv.dll,rtfconv.dll
OmniConvertConfigSectionCsvs0=WordPerfect,MSWord,Rtf
OmniConvertExtns1=*.rtf
OmniConvertLibraryCsvs1=rtfconv.dll
OmniConvertConfigSectionCsvs1=Rtf
OmniConvertExtns2=*.ppt
OmniConvertLibraryCsvs2=pptconv.dll
OmniConvertConfigSectionCsvs2=Ppt
Logging=0
LogAppend=TRUE
LogMaxKBytes=500

[MSWord]
OutputCharSet=ASCII

[Rtf]

[Ppt]
OutputCharSet=ASCII
StopList=pptconv.dat
To enable support for MBCS and other binary formats:
Create a [<file_format>] section for each of the file formats you want Omnislave to convert for indexing.
In each [<file_format>] section, add a parameter OutputCharSet and set it to the character set to which you want to convert the file format.
Choose one of these character set constants:
For example, if you want to search a Word document in traditional Chinese, add the following lines under the appropriate [CONFIGURATION] section in the Omnislave configuration file:
[MSWord]
OutputCharSet=CHINESETRADITIONAL
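The same change can be sketched with Python's standard configparser module. This is an illustration under stated assumptions, not a supported tool: the function name set_output_charset is hypothetical, the sample text is a shortened copy of the example above, and it is an assumption that Omnislave accepts the file as configparser rewrites it (comments are discarded and blank lines are normalized, so keep a backup of the original).

```python
import configparser
import io

# Shortened copy of the omnislave.cfg example shown earlier in this section.
SAMPLE = """\
[Configuration]
OmniConvertExtns0=*.doc
OmniConvertLibraryCsvs0=wpconvdll.dll,wordconv.dll,rtfconv.dll
OmniConvertConfigSectionCsvs0=WordPerfect,MSWord,Rtf

[MSWord]
OutputCharSet=ASCII
"""


def set_output_charset(cfg_text, file_format, charset):
    """Set OutputCharSet in the [file_format] section, creating it if needed."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # preserve the case of parameter names
    parser.read_string(cfg_text)
    if not parser.has_section(file_format):
        parser.add_section(file_format)
    parser.set(file_format, "OutputCharSet", charset)

    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)  # write key=value, no spaces
    return out.getvalue()


updated = set_output_charset(SAMPLE, "MSWord", "CHINESETRADITIONAL")
print(updated)
```

Setting optionxform to str matters here because configparser lowercases parameter names by default, which would rewrite OutputCharSet as outputcharset.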
You modify language-specific search parameters in the DRE configuration file DirectorDRE.cfg, located at autonomy\engine\ in your exteNd Director installation directory.
To modify language-specific parameters in the DRE:
Set the CharConv parameter to the language you want the DRE to use:
Set the TermSize parameter to specify the maximum number of characters for any term in the DRE:
| Language | Value |
|---|---|
| English and European languages | 10 (default) |
| German | 30 |
| Japanese | 30 |
| Korean | 40 |
(Optional) Set the StripLanguage parameter to select which language to use when stripping terms to their stems (for example, stripping running to run):
NOTE: Use the advanced settings for English (6) and German (9) when possible. Exception: if you set the StripLanguage to 0 or 1 for English or 3 for German when you indexed content into the DRE, you must use those same settings when you send queries to the DRE.
When you use languages that do not separate words with spaces, you must specify appropriate delimiters. exteNd Director provides language-specific sentence-breaking files on your product CD. You must copy these files into the directory where the DRE resides (autonomy\engine in your exteNd Director installation directory). The following sections describe the sentence-breaking files and associated DRE configuration settings required for languages that do not delimit words with spaces.
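On NT the sentence-breaking files ship as .zip archives, so copying them into the DRE directory involves unpacking an archive. The following Python sketch shows one way to do that with the standard zipfile module. The function name and the idea of extracting the archive contents directly into the DRE directory are assumptions; confirm the expected layout against your installation before relying on it.

```python
import zipfile
from pathlib import Path


def unpack_sentence_breaking_files(archive_path, engine_dir):
    """Extract a .zip of sentence-breaking files into the DRE directory.

    Hypothetical helper: returns the sorted list of member names so the
    caller can verify what was unpacked.
    """
    engine_dir = Path(engine_dir)
    engine_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(engine_dir)
        return sorted(archive.namelist())
```

A call would pass the archive path on the product CD and the autonomy\engine directory of your installation; both depend on your drive letters and install location, so they are not shown here. The UNIX archives are .tar.Z files, which this sketch does not handle.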
The required sentence-breaking files are:
| Platform | Sentence-breaking files (location on CD) |
|---|---|
| NT | Autonomy\MBCS\chinese_nt_1_0_3.zip |
| UNIX | Autonomy/MBCS/chinese_solaris_1_0_3.tar.Z |
The required language-specific configuration settings are:
| DRE configuration parameter | Value |
|---|---|
| CharConv | 5 |
| TermSize | 40 |
| StripLanguage | 2 |
The required sentence-breaking files are:
| Platform | Sentence-breaking files (location on CD) |
|---|---|
| NT | Autonomy\MBCS\chinese_nt_1_0_3.zip |
| UNIX | Autonomy/MBCS/chinese_solaris_1_0_3.tar.Z |
The required language-specific configuration settings are:
| DRE configuration parameter | Value |
|---|---|
| CharConv | 4 |
| TermSize | 40 |
| StripLanguage | 2 |
The required sentence-breaking files are:
| Platform | Sentence-breaking files (location on CD) |
|---|---|
| NT | Autonomy\MBCS\japanese_nt_2_0_5.zip |
| UNIX | Autonomy/MBCS/japanese_solaris_2_0_5.tar.Z |
The required language-specific configuration settings are:
| DRE configuration parameter | Value |
|---|---|
| CharConv | 1 |
| TermSize | 30 |
| StripLanguage | 2 |
The required sentence-breaking files are:
| Platform | Sentence-breaking files (location on CD) |
|---|---|
| NT | Autonomy\MBCS\korean_nt_1_0_1.zip |
| UNIX | Autonomy/MBCS/korean_solaris_1_0_1.tar.Z |
The required language-specific configuration settings are:
| DRE configuration parameter | Value |
|---|---|
| CharConv | 2 |
| TermSize | 40 |
| StripLanguage | 2 |
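The four sets of settings above can be collected into one lookup and rendered as lines for DirectorDRE.cfg. This Python sketch is illustrative only. The profile names are hypothetical labels: the Japanese and Korean labels are inferred from the sentence-breaking file names, and the two Chinese profiles are distinguished only by their CharConv value because this section does not name them. The values themselves are copied from the tables.

```python
# Values copied from the tables above; the keys are hypothetical labels.
LANGUAGE_SETTINGS = {
    "chinese_charconv_5": {"CharConv": 5, "TermSize": 40, "StripLanguage": 2},
    "chinese_charconv_4": {"CharConv": 4, "TermSize": 40, "StripLanguage": 2},
    "japanese": {"CharConv": 1, "TermSize": 30, "StripLanguage": 2},
    "korean": {"CharConv": 2, "TermSize": 40, "StripLanguage": 2},
}


def render_settings(profile):
    """Return the name=value lines for one language profile."""
    settings = LANGUAGE_SETTINGS[profile]
    return [f"{name}={value}" for name, value in settings.items()]


for line in render_settings("japanese"):
    print(line)
```

Keeping the settings in one table makes it harder to mix values from different languages, for example pairing a CharConv value from one profile with a TermSize from another.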
Copyright © 2004 Novell, Inc. All rights reserved. Copyright © 1997, 1998, 1999, 2000, 2001, 2002, 2003 SilverStream Software, LLC. All rights reserved.