Content Search Guide
CHAPTER 8
This chapter provides troubleshooting tips to help you implement conceptual searching successfully in your exteNd Director applications. You will learn how to diagnose and correct commonly encountered errors.
The following topics are covered:
This section diagnoses commonly encountered problems and suggests corrective actions. The following issues are covered:
Class not found exception for Autonomy JNI when accessing the Content Management (CM) subsystem
Search results become invalid after restarting the DRE service
java.lang.Exception for Autonomy JNI when publishing documents on UNIX
This section explains why you might encounter this exception and describes how to correct the problem.
The Autonomy Java Native Interface (JNI) usually throws this exception if the exteNd Director Dynamic Reasoning Engine (DRE) is not running.
This section explains why you might encounter this exception and describes how to correct the problem.
This exception is thrown if the Autonomy Java Native Interface (JNI) classes are not on the classpath of your application server. These classes are stored in autonomy\autonomyJNI.jar in the exteNd Director installation directory.
Add Autonomy JNI classes to the classpath of your application server, as described in Adding autonomyJNI.jar to your application server classpath.
This exception is thrown if the autonomyJNI.dll is not on your library path. This dynamic link library is located at autonomy\autonomyJNI.dll in the exteNd Director installation directory.
Add the directory containing autonomyJNI.dll to the Path environment variable of the machine where you installed exteNd Director, as described in Adding the Autonomy dynamic library to your environment.
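To confirm that the library directory is actually visible through the Path variable, a quick script can help. The following is a minimal sketch, not part of exteNd Director: the function name find_on_path is hypothetical, and it only checks the search path of the process that runs it, which may differ from the environment of your application server.

```python
import os
from pathlib import Path


def find_on_path(filename, path_value=None):
    """Return the search-path directories that contain `filename`.

    `path_value` defaults to the PATH variable of the current process
    (on Windows, os.environ treats "Path" and "PATH" as the same name).
    """
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    hits = []
    for entry in path_value.split(os.pathsep):
        # Skip empty entries produced by stray separators.
        if entry and (Path(entry) / filename).is_file():
            hits.append(entry)
    return hits
```

Calling find_on_path("autonomyJNI.dll") returns an empty list when no directory on the path contains the library, which matches the condition described in the diagnosis above.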
You may see the following error message when you redeploy the exteNd Director project:
java.lang.UnsatisfiedLinkError: Native Library autonomyJNI.dll already loaded in another classloader
This section explains why you might encounter this error and describes how to correct the problem.
The error occurs when autonomyJNI.jar and autonomyJNI.dll are not at the same revision level.
Make sure you have the correct revisions of these files. You can check revision numbers programmatically by calling the method getApiVersion() on com.sssw.search.api.EbiQueryEngineDelegate.
This section explains why you might encounter this behavior and describes how to correct and prevent the problem.
This problem occurs when you add new custom fields in the Content Management (CM) repository after creating documents that use the preexisting set of custom fields. Because of the way Autonomy handles custom fields, you must reinitialize the DRE to read in the new field set. Otherwise, search results are invalid.
Remove all documents from the DRE, as described in Removing content from the DRE.
Reconfigure the DRE by issuing a reset from the DRE Administration console, as described in Resetting the DRE.
Restart the DRE, as described in Diagnosis: DRE is not running.
Reindex your contents back into the DRE, as described in Forcing indexing.
CAUTION: You must perform these steps every time you add new custom fields after creating documents that use custom metadata. To avoid this problem, see the preventive action below.
To prevent this problem from occurring:
Add all custom fields before adding any documents in the CM repository.
You can check whether or not documents have been indexed by using any of the methods described in Examining DRE content.
This section explains why you might encounter this behavior and describes how to correct the problem.
The integration between the CM subsystem and the Search subsystem is disabled.
Make sure the exteNd Director DRE is running, as described in Diagnosis: DRE is not running.
Enable the option com.sssw.cm.search.enable.repository name.
For example, if you are working with the default CM repository (named Default), the property name looks like this:
com.sssw.cm.search.enable.Default
TIP: You set this option in the CM config.xml file, as described in Setting search options in an existing exteNd Director project. For more information about this option, see Enable link to the Search subsystem? and Defining options for a specific Content Management repository.
Redeploy your exteNd Director project for the new setting to take effect.
The values of search options do not correspond to exteNd Director DRE settings.
Check your exteNd Director DRE settings in the DRE Administration console, as described in Setting DRE search options.
Configure the following search options to match the DRE settings:
com.sssw.cm.search.host.repository name
com.sssw.cm.search.queryport.repository name
com.sssw.cm.search.indexport.repository name
com.sssw.cm.search.repository.repository name
TIP: You set these options in the CM config.xml file, as described in Setting search options in an existing exteNd Director project. If you are using the default CM repository, repository name = Default.
Redeploy your exteNd Director project for the new settings to take effect.
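When you compare these options against the DRE settings, it can help to generate the exact key names for your repository first. The sketch below simply appends a repository name to the four option prefixes listed above; the function and constant names are hypothetical, and the code does not read or write config.xml.

```python
# Option prefixes taken from the list above; the repository name is appended.
SEARCH_OPTION_PREFIXES = (
    "com.sssw.cm.search.host",
    "com.sssw.cm.search.queryport",
    "com.sssw.cm.search.indexport",
    "com.sssw.cm.search.repository",
)


def search_option_keys(repository_name="Default"):
    """Return the four DRE connection option keys for one CM repository."""
    return [f"{prefix}.{repository_name}" for prefix in SEARCH_OPTION_PREFIXES]
```

For the default repository, the first key produced is com.sssw.cm.search.host.Default.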
There are two modes for synchronizing changes in the CM repository with DRE indexing: immediate and batch.
Determine which synchronization mode is enabled by checking the value of the following search option:
com.sssw.cm.search.synch.mode.repository name
If the value is 1, synchronization occurs in batch mode and you should not expect to see your documents indexed immediately.
TIP: You view this option in the CM config.xml file, as described in Setting search options in an existing exteNd Director project. For more information about this option, see Synchronization mode.
The document type of the content you are trying to index is invalid or unsupported. The Search subsystem supports the following MIME types for indexing content:
Make sure the MIME type of your document is supported by the Search subsystem. You can check document MIME types in the CMS Administration Console by following these steps:
Select the document of interest in the CMS Administration Console.
In the property inspector, select the Versions tab.
The MIME type of the document is displayed, along with other properties.
For information on how to use the CMS Administration Console, see the chapter on the CMS Administration Console in the Content Management Guide.
You may not have published the documents you are searching for in the CM subsystem. Only published content can be imported and indexed in the DRE.
In the CMS Administration Console, determine whether the documents of interest have been published by following these steps:
In the Property Inspector, select the Versions tab.
If the document has been published, one of its version icons appears with a green border.
For more information about publishing documents, see the section on administering version control in the chapter describing the CMS Administration Console in the Content Management Guide.
If you are indexing binary documents, you must specify the correct path to Autonomy's OmniSlave binary document filtering technology. By default, the OmniSlave files are stored at:
exteNd Director installation directory\exteNd Director\autonomy\OmniSlaves
Make sure the path to OmniSlave files is specified correctly in the following search option:
com.sssw.cm.fetch.binary.filters.dir
TIP: You set this option in the CM config.xml file, as described in Setting search options in an existing exteNd Director project. For more information about this option, see Install directory for binary document text filters.
If you change the path, redeploy your exteNd Director project for the new setting to take effect.
This section explains why you might encounter this behavior and describes how to correct the problem.
Your search criteria may be too narrow or incorrectly specified, or your query terms may be misspelled.
Examine your query and take any of the following corrective steps as necessary:
Correct misspelled query terms or try a fuzzy query.
For more information, see Fuzzy queries.
You can check whether or not documents have been indexed by using any of the methods described in Examining exteNd Director DRE content.
See the troubleshooting tips in Documents do not appear to be indexed.
If you change parameters in the DRE configuration file without reinitializing the DRE and reindexing the data, the DRE produces no results or erroneous results.
NOTE: The DRE configuration file is located at autonomy\engine\DirectorDRE.cfg in the exteNd Director installation directory.
Reconfigure the DRE by issuing a reset from the DRE Administration console, as described in Resetting the DRE.
Restart the DRE, as described in Diagnosis: DRE is not running.
A common scenario is to issue a conceptual query when you really intend to run a keyword search. In a keyword search, the DRE finds documents that contain occurrences of the desired keyword. By contrast, the conceptual query is an intelligent search that matches concepts rather than literal text strings.
For more information, see How conceptual searching differs from keyword searching.
Make sure you are using the correct syntax for the type of query you want to run. For example, if you want to search for documents that contain the words silk and worm, use the query notation for keyword search:
silk:+worm:
Notice that this syntax is different from conceptual search notation:
silk+worm
For more information, see Overview of Autonomy-based conceptual searching and Querying Content and Metadata.
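The difference between the two notations shown above is easy to get wrong when queries are assembled in code. The following sketch builds both forms from a list of terms. It reproduces only the two-term examples given in this section; the helper names are hypothetical, and extending the pattern to more terms is an assumption rather than documented query syntax.

```python
def keyword_query(*terms):
    """Join terms in the keyword notation shown above, e.g. silk:+worm:"""
    return "+".join(f"{term}:" for term in terms)


def conceptual_query(*terms):
    """Join terms in the conceptual notation shown above, e.g. silk+worm"""
    return "+".join(terms)
```

With the terms silk and worm, keyword_query yields silk:+worm: and conceptual_query yields silk+worm, so a missing colon is enough to turn an intended keyword search into a conceptual one.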
You may not have enabled the option to copy the content of documents you are searching into the exteNd Director DRE. If you are issuing a keyword query, you must make sure the content of the target documents is stored in the DRE when they are indexed.
Enable the following search option:
com.sssw.cm.fetch.store.content.repository name
You set this option in the CM config.xml file, as described in Setting search options in an existing exteNd Director project. For more information about this option, see Copy document contents into the DRE?.
Redeploy your exteNd Director project for the new setting to take effect.
If the relevance cutoff threshold is too low, the DRE will drop some results that you actually want to see.
Bump up the threshold by calling the method setRelevanceCut() on com.sssw.search.api.EbiQuery.
This section explains why you might encounter this behavior and describes how to correct the problem.
You may not have enabled the option to copy the content of documents you are searching into the exteNd Director DRE. This option is disabled by default to avoid incurring the overhead of storing content in both the CM repository and the DRE.
Enable the following search option:
com.sssw.cm.fetch.store.content.repository name
You set this option in the CM config.xml file, as described in Setting search options in an existing exteNd Director project. For more information about this option, see Copy document contents into the DRE?.
Redeploy your exteNd Director project for the new setting to take effect.
This section explains why you might encounter this behavior and describes how to correct the problem.
The binary document text filter directory, which resides in the directory where you installed the DRE, contains executables required for importing data from the CM repository into the DRE for indexing. By default, publishing is an operation that triggers immediate synchronization, an event that involves importing updated content from the CM repository into the DRE. You must have read/write/execute permission for the binary document text filter directory so that the import process can proceed to completion.
Find your binary document text filter directory:
In exteNd Director, open config.xml for the CM subsystem in your exteNd Director project.
Locate the following key:
com.sssw.cm.fetch.binary.filters.dir
The value of this key is the path for the binary document text filter directory.
NOTE: You set the binary document text filter directory when you created your exteNd Director project, as described in the section on creating a new exteNd Director project in Developing exteNd Director Applications. In this section, look for information about setting parameters on the Filters tab of the Content Management Search Configuration panel.
Set read/write/execute permission on the binary document text filter directory.
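Before changing permissions, you may want to see which ones are missing. This is a minimal sketch using the standard os.access call; the function name is hypothetical, and the result reflects the account that runs the script, so it is only meaningful when run as the same user as the application server.

```python
import os


def missing_permissions(directory):
    """Return which of read, write, and execute the current user lacks."""
    checks = (("read", os.R_OK), ("write", os.W_OK), ("execute", os.X_OK))
    # os.access returns False for every mode when the path does not exist.
    return [name for name, flag in checks if not os.access(directory, flag)]
```

An empty list means all three permissions are present for the directory named in com.sssw.cm.fetch.binary.filters.dir.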
This section describes techniques you can use to determine whether search processes are running as expected and whether you have constructed your queries correctly. Some of these techniques require you to run the exteNd Director DRE Administration console, which is described in Administering the Dynamic Reasoning Engine. These topics are included:
This section describes how to monitor the indexing process by generating and examining exteNd Director and Autonomy logs.
The import log records the activity of the Autonomy importer at runtime. By default, this log resides in autonomy\OmniSlaves\import.log in the directory where you installed exteNd Director.
You configure the behavior of the import log in the file importslave.cfg, located in the same directory as import.log. You can specify the following options:
When enabled, this option writes content to the server console for debugging purposes as documents are imported and indexed.
To log information about indexing in exteNd Director:
Raise the logging levels of the CM subsystem and the Search subsystem to 5, preferably on a small prototype document set.
Level 5 logging records debugging messages and information about application progress on the server console as you interact with the CM repository and search its content.
Use the Director Administration Console (DAC) to adjust logging levels for EboSearchLog and EboCmLog, as described in the section on logs in the chapter about application configuration using the DAC in the Content Management Guide.
Enable the following search option:
com.sssw.cm.fetch.dump.imported.data
You set this option in the CM config.xml file, as described in Setting search options in an existing exteNd Director project. For more information about this option, see Debug during import?.
Redeploy your exteNd Director project for the new setting to take effect.
Monitor the index process on your server console as you run your search application.
For more information, see the chapter on logging information in Developing exteNd Director Applications.
You can view the log of activities performed by Autonomy by entering the following URL in your browser:
http://DRE host:DRE-query-port/qmethod=v
For example, if your host name is localhost and your DRE-query-port is 2000 (the default), the URL should look like this:
http://localhost:2000/qmethod=v
This section describes several ways to examine the content of the exteNd Director DRE.
To examine DRE contents through your browser:
Issue the following command from your browser:
http://DRE host:DRE-query-port/qmethod=g
For example, if your host name is localhost and your DRE-query-port is 2000 (the default), the URL should look like this:
http://localhost:2000/qmethod=g
This command lists the documents that have been indexed. You can identify any document of interest by looking up its Doc_id property value (the unique identifier of the document within the DRE). This value appears in the results generated by the qmethod=g command.
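If you check the DRE often, building these browser commands in a script avoids typing mistakes in the host and port. The sketch below only formats the URL pattern shown in this chapter; the function name is hypothetical, and the defaults (localhost and query port 2000) are the example values used above.

```python
def dre_command_url(qmethod, host="localhost", query_port=2000):
    """Format a DRE browser command such as qmethod=g (list) or qmethod=v (log)."""
    return f"http://{host}:{query_port}/qmethod={qmethod}"
```

For example, dre_command_url("g") produces the document listing URL and dre_command_url("v") produces the Autonomy log URL for a DRE on the local machine.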
To examine DRE contents by backing up the DRE:
There are situations in which you need to force the exteNd Director DRE to reindex data (for example, when you reconfigure the search environment, as described in Setting Search Options).
The following procedure shows how to configure the exteNd Director DRE to reindex all content as a batch process.
Open config.xml for the CM subsystem in your exteNd Director project.
Change the synchronization mode to batch by setting com.sssw.cm.search.synch.mode.repository name to 1.
TIP: By default, you use the CM repository named Default, so you would set the synchronization mode on com.sssw.cm.search.synch.mode.Default.
In the same location as config.xml, open the task list configuration file for your CM repository: repository name_tasklist.xml.
TIP: For the default CM repository, the configuration file is Default_tasklist.xml.
Look for the definition of the synch task, the task that synchronizes the CM subsystem with the Search service engine.
TIP: The task definition appears as either <periodic-synch> (default) or <scheduled-synch>.
Set the interval (for periodic-synch) or the schedule (for scheduled-synch) as desired.
Specify that all content should be reindexed, by adding the following element to the synch task definition:
<since-last>false</since-last>
NOTE: If you do not disable this property, the DRE indexes only the content that hasn't been processed in the previous run of the task (the default setting).
Here is a sample synchronization task definition that meets these requirements:
<periodic-synch>
  <task-name>Default Repository Synchronization</task-name>
  <description>The Default Repository Synchronization Task</description>
  <since-last>false</since-last>
  <enabled>true</enabled>
  <interval>
    <millis>86400000</millis>
    <exact>false</exact>
  </interval>
</periodic-synch>
TIP: After reindexing the content, it is recommended that you set the <since-last> property back to true to avoid reindexing all content again unnecessarily.
Start the synch task from the CMS Administration Console, as described in the section on administering automated tasks in the Content Management Guide.
For more information about tasks in the CM repository, see the chapter on managing tasks in the Content Management Guide.
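Because the <since-last> element must be switched to false for a full reindex and back to true afterward, the edit is a natural candidate for scripting. The following sketch toggles the element in a single synch task definition passed in as a string. It is an illustration only: the function name is hypothetical, and because the overall layout of the task list file is not shown in this chapter, the code deliberately works on one task element rather than on the whole file.

```python
import xml.etree.ElementTree as ET


def set_since_last(task_xml, value):
    """Return the synch task definition with <since-last> set to true or false.

    `task_xml` is the text of one <periodic-synch> or <scheduled-synch>
    element. The element is added if the definition does not contain it yet.
    """
    task = ET.fromstring(task_xml)
    node = task.find("since-last")
    if node is None:
        node = ET.SubElement(task, "since-last")
    node.text = "true" if value else "false"
    return ET.tostring(task, encoding="unicode")
```

Pass False before starting the reindexing run and True once it has finished. Note that the interval of 86400000 milliseconds in the sample definition corresponds to 24 hours.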
You may want to examine the list of terms indexed for a specific document to verify that the correct information is in the DRE. You can retrieve the 40 most important terms from a document using this command:
http://IPAddress:QueryPort/qmethod=t&querytext=docid
NOTE: The value docid is the Doc_id property for the document of interest. You can look up Doc_id values as described in Examining exteNd Director DRE content.
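The term-list command can be assembled the same way for any document. This sketch formats the URL pattern shown above for a given Doc_id; the function name is hypothetical, the defaults are the example host and query port used elsewhere in this chapter, and percent-encoding the identifier is a precaution added here rather than a documented DRE requirement.

```python
from urllib.parse import quote


def term_list_url(doc_id, host="localhost", query_port=2000):
    """Format the qmethod=t command that lists the top terms for one document."""
    # Percent-encode the identifier so unusual characters cannot break the URL.
    encoded_id = quote(str(doc_id), safe="")
    return f"http://{host}:{query_port}/qmethod=t&querytext={encoded_id}"
```

For a numeric Doc_id the encoding step leaves the value unchanged, so term_list_url(42) gives the same URL you would type by hand.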
You can use the exteNd Director DRE Administration console to test your queries in isolation to validate whether your queries return the expected results.
To test queries in the DRE Administration console:
See Testing queries.
For more information, see the following Autonomy documents shipped as PDF files with the exteNd Director help system:
Copyright © 2004 Novell, Inc. All rights reserved. Copyright © 1997, 1998, 1999, 2000, 2001, 2002, 2003 SilverStream Software, LLC. All rights reserved.