Content Search Guide
CHAPTER 1
This chapter provides an overview of the searching methods you can implement in exteNd Director applications.
You can implement two types of searching in exteNd Director applications: Autonomy-based conceptual searching and SQL-based searching of metadata.
Autonomy-based search technology gives you the ability to implement conceptual and keyword searching in your exteNd Director components. Traditional keyword searching returns all documents that contain occurrences of a search string. By contrast, conceptual searching matches concepts, often returning more relevant results.
NOTE: The information in this section is adapted from the Autonomy Technology White Paper from Autonomy, Inc.
The Autonomy Dynamic Reasoning Engine (DRE) uses sophisticated pattern-matching algorithms to analyze any type of unstructured information, including documents in text and binary formats. Using these algorithms, the DRE identifies the patterns that occur naturally in text, then looks for similar patterns in the data source and returns the most relevant results.
The DRE determines relevance through probabilistic analysis: it estimates which data is most important, then assigns weights to indexed terms based on that importance.
Recall that traditional keyword searching is the process of finding documents that contain text strings specified by a user. Keyword searches return all documents that contain one or more occurrences of the search string, regardless of the context in which it is used. Because context is ignored, the results frequently contain many irrelevant hits. To refine search results, users must often modify their queries by adding complex Boolean expressions. Keyword searching is also known as full-text searching.
By contrast, conceptual searching takes into account the context in which search terms appear, so it can match concepts rather than simply finding literal text strings. The result set contains content that is related by meaning and ranked by relevance to the search criteria. In this way, conceptual searching reduces the number of false hits and returns documents that contain the concept, whether or not they also contain the search string.
To further illustrate the difference between the two approaches, consider this example. A keyword search for the term The+effect+of+the+recession+on+consumer+spending would return only documents that contain occurrences of all of these terms, likely producing a number of irrelevant results. The identical conceptual search would return documents that match the concept underlying the search expression, even if the documents don't contain all the terms in the query.
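The keyword side of this comparison can be sketched in a few lines of Java. This is a minimal illustration, not exteNd Director or Autonomy code: the class KeywordSearchSketch, its methods, and the two sample documents are all assumptions made for the example. It uses naive substring matching to show that an all-terms keyword search misses a document that is related in meaning but worded differently.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Illustrative only: a naive keyword matcher, not part of exteNd Director or Autonomy.
class KeywordSearchSketch {

    // Returns true only if every query term occurs literally (as a substring,
    // ignoring case) somewhere in the document text.
    static boolean containsAllTerms(String document, String query) {
        String text = document.toLowerCase(Locale.ROOT);
        // The guide writes the query with '+' separators; spaces are accepted too.
        for (String term : query.toLowerCase(Locale.ROOT).split("[+\\s]+")) {
            if (!term.isEmpty() && !text.contains(term)) {
                return false;
            }
        }
        return true;
    }

    // Keeps the documents that pass the all-terms test, in their original order.
    static List<String> search(List<String> documents, String query) {
        return documents.stream()
                .filter(doc -> containsAllTerms(doc, query))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical sample documents: the second is about the same topic
        // but shares almost no wording with the query.
        List<String> docs = Arrays.asList(
                "A note on the effect of the recession on consumer spending habits.",
                "Households cut back purchases sharply during the economic downturn.");
        System.out.println(search(docs, "The+effect+of+the+recession+on+consumer+spending"));
    }
}
```

Only the first sample document is returned. A conceptual search would be expected to surface the second one as well, because it matches on meaning rather than on literal terms.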
exteNd Director comes with a data fetcher for the exteNd Director CM repository. This CM fetcher automatically propagates document content and metadata from the CM repository into the exteNd Director DRE, where it is indexed. The related processes of propagating and indexing data are often called fetching.
The exteNd Director CM subsystem communicates with the exteNd Director DRE through the Search subsystem. The CM API wraps the Search API, providing classes and methods for constructing and running queries on content and metadata that reside in the CM repository and have been indexed by the exteNd Director DRE.
For more information on using the CM API for implementing conceptual searches against the CM repository, see Implementing Conceptual Search, Fetching Content and Metadata, and Querying Content and Metadata.
The CM data fetcher that comes with exteNd Director allows you to use Autonomy technology exclusively with data from the exteNd Director CM repository. This fetcher automatically imports document content and metadata from the exteNd Director CM repository into the DRE for indexing, allowing you to subsequently conduct Autonomy-based searches over the indexed data.
To use Autonomy technology with exteNd Director to search other data sources, you must purchase additional data fetchers from Autonomy, Inc. For these licensed data sources, you use Search API classes directly to initiate the fetching process, and construct and run queries. Fetching occurs automatically only when you use the CM data fetcher.
The Search API provides wrapper classes around the Autonomy APIs to give you access to the following capabilities programmatically:
Fetch (import and index) content into the exteNd Director DRE
Perform conceptual searches using a variety of query types, including fuzzy, proximity, and thesaurus searches
Search both structured data (document metadata) and unstructured data (content) using a single query expression
Use Suggest More queries to find documents similar in meaning
Page through the results of Autonomy-based conceptual and keyword queries
Rank or limit the query results by relevance percentages or absolute weight
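The last two capabilities, ranking or limiting by relevance and paging through results, can be pictured with a small generic sketch. Nothing here is a Search API class: ResultPagingSketch, Hit, rankAndLimit, and page are hypothetical names, and the relevance percentages are made-up sample values. The sketch assumes a non-negative page index.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative only: Hit and these helpers are hypothetical, not Search API classes.
class ResultPagingSketch {

    static final class Hit {
        final String docId;
        final int relevancePercent; // 0-100, as a search engine might report it

        Hit(String docId, int relevancePercent) {
            this.docId = docId;
            this.relevancePercent = relevancePercent;
        }
    }

    // Drops hits below the threshold and sorts the rest by descending relevance.
    static List<Hit> rankAndLimit(List<Hit> hits, int minRelevancePercent) {
        List<Hit> kept = new ArrayList<>();
        for (Hit hit : hits) {
            if (hit.relevancePercent >= minRelevancePercent) {
                kept.add(hit);
            }
        }
        kept.sort(Comparator.comparingInt((Hit h) -> h.relevancePercent).reversed());
        return kept;
    }

    // Returns one page of results; pageIndex is zero-based and must not be negative.
    static List<Hit> page(List<Hit> ranked, int pageIndex, int pageSize) {
        int from = Math.min(pageIndex * pageSize, ranked.size());
        int to = Math.min(from + pageSize, ranked.size());
        return new ArrayList<>(ranked.subList(from, to));
    }

    public static void main(String[] args) {
        List<Hit> hits = new ArrayList<>();
        hits.add(new Hit("doc-a", 42));
        hits.add(new Hit("doc-b", 91));
        hits.add(new Hit("doc-c", 67));
        hits.add(new Hit("doc-d", 15));

        // Keep hits at 40% relevance or better, then show the first page of two.
        List<Hit> ranked = rankAndLimit(hits, 40);
        for (Hit hit : page(ranked, 0, 2)) {
            System.out.println(hit.docId + " " + hit.relevancePercent + "%");
        }
    }
}
```

Filtering before sorting keeps the threshold independent of page size, so every page is drawn from the same ranked list.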
For more information about how to access and implement these capabilities, see Implementing Conceptual Search, Fetching Content and Metadata, and Querying Content and Metadata.
The exteNd Director CM subsystem provides a built-in capability for SQL-based searching of metadata in the CM repository. You execute SQL search queries on document metadata only.
To search document content—or both content and metadata—use Autonomy-based searching, as described in Overview of Autonomy-based conceptual searching.
SQL-based searching allows you to search metadata stored in relational databases. You might opt for this search method in exteNd Director to:
Take advantage of the rich set of operators SQL provides, including IN and BETWEEN
Search metadata that is available only through SQL-based document queries—and not via Autonomy-based search—such as category memberships or information about document links
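As a reminder of what the IN and BETWEEN operators look like in a query, the following sketch builds parameterized clause text in plain Java. It is an illustration only: SqlOperatorSketch is a hypothetical helper, and the column names author and publish_date are assumptions, not the CM repository schema.

```java
import java.util.Collections;

// Illustrative only: column names are hypothetical, not the CM repository schema.
class SqlOperatorSketch {

    // Builds "column IN (?, ?, ...)" with one placeholder per value.
    static String inClause(String column, int valueCount) {
        if (valueCount < 1) {
            throw new IllegalArgumentException("IN needs at least one value");
        }
        return column + " IN (" + String.join(", ", Collections.nCopies(valueCount, "?")) + ")";
    }

    // Builds "column BETWEEN ? AND ?", an inclusive range test.
    static String betweenClause(String column) {
        return column + " BETWEEN ? AND ?";
    }

    public static void main(String[] args) {
        // Match any of three authors, published within a date range.
        String where = inClause("author", 3) + " AND " + betweenClause("publish_date");
        System.out.println(where);
    }
}
```

IN tests membership in a set of values and BETWEEN tests an inclusive range, which is why both suit metadata such as authors and dates.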
You can use SQL queries to search for the following metadata properties in the CM repository:
For these properties, the CM API provides classes and methods for constructing and running SQL query expressions that search for values, ranges of values, words, phrases, or other patterns, as appropriate.
The CM API provides methods on the com.sssw.cm.api.EbiDocQuery object for defining SQL clauses that you use to construct search queries. In exteNd Director, you construct SQL-based queries by defining SELECT, WHERE, and ORDER BY clauses.
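The idea of assembling a query from separate SELECT, WHERE, and ORDER BY clauses can be sketched with a small stand-in class. This is not the CM API: SketchDocQuery, its method names, and the documents table are hypothetical and do not mirror the methods of com.sssw.cm.api.EbiDocQuery, whose actual signatures are documented in the API reference.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: SketchDocQuery is hypothetical and does not mirror the
// methods of com.sssw.cm.api.EbiDocQuery.
class SketchDocQuery {
    private final List<String> selected = new ArrayList<>();
    private final List<String> conditions = new ArrayList<>();
    private final List<String> ordering = new ArrayList<>();

    // Each method records one clause element and returns this for chaining.
    SketchDocQuery select(String column) { selected.add(column); return this; }
    SketchDocQuery where(String condition) { conditions.add(condition); return this; }
    SketchDocQuery orderBy(String column) { ordering.add(column); return this; }

    // Assembles the recorded clauses into a single SQL string.
    String toSql(String table) {
        StringBuilder sql = new StringBuilder("SELECT ");
        sql.append(selected.isEmpty() ? "*" : String.join(", ", selected));
        sql.append(" FROM ").append(table);
        if (!conditions.isEmpty()) {
            sql.append(" WHERE ").append(String.join(" AND ", conditions));
        }
        if (!ordering.isEmpty()) {
            sql.append(" ORDER BY ").append(String.join(", ", ordering));
        }
        return sql.toString();
    }

    public static void main(String[] args) {
        String sql = new SketchDocQuery()
                .select("title")
                .select("author")
                .where("author = ?")
                .orderBy("title")
                .toSql("documents"); // hypothetical table name
        System.out.println(sql);
    }
}
```

Keeping each clause in its own list lets the caller add criteria in any order and still produce the clauses in the order SQL requires.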
The com.sssw.cm.api.EbiDocQuery interface defines WHERE methods for setting search criteria. In addition, com.sssw.cm.api.EbiDocQuery extends the com.sssw.cm.api.EbiDocMetaDataQuery interface which defines SELECT and ORDER BY methods:
For more information, see Implementing SQL-Based Searching.
Copyright © 2004 Novell, Inc. All rights reserved. Copyright © 1997, 1998, 1999, 2000, 2001, 2002, 2003 SilverStream Software, LLC. All rights reserved.