Full-Text Search Features
Basic Search Types
 |
Phrase searching finds phrases like: due process of law |
 |
Boolean operators and, or, not can be used with nested
words and phrases:
e.g. due process of law and not (equal protection or civil rights) |
 |
Proximity searching finds a word or phrase within "n"
words of another word or phrase:
e.g. apple pie w/25 peach pie |
 |
Directed proximity searching finds a word or phrase within "n" words
before another word or phrase:
e.g. apple pie pre/38 peach cobbler |
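The two proximity operators above can be illustrated with a minimal sketch. This is not dtSearch's implementation: the function names, the whitespace tokenisation, and the way the gap between phrases is counted are all assumptions made for illustration.

```python
def phrase_starts(tokens, phrase):
    """Return every index in tokens where the phrase (a list of words) begins."""
    n = len(phrase)
    return [i for i in range(len(tokens) - n + 1) if tokens[i:i + n] == phrase]


def near(text, first, second, n, ordered=False):
    """Return True if `first` occurs within n words of `second`.

    With ordered=True, `first` must come before `second` (like pre/n);
    otherwise either order is accepted (like w/n). The gap is counted
    from the last word of the earlier phrase to the first word of the
    later one, which is an assumption of this sketch.
    """
    tokens = text.lower().split()
    a, b = first.lower().split(), second.lower().split()
    for i in phrase_starts(tokens, a):
        for j in phrase_starts(tokens, b):
            if j > i:
                gap = j - (i + len(a) - 1)
            elif ordered:
                continue  # second phrase does not follow the first
            else:
                gap = i - (j + len(b) - 1)
            if 0 < gap <= n:
                return True
    return False


text = "she baked an apple pie and then a peach pie for dessert"
print(near(text, "apple pie", "peach pie", 25))                # unordered, like w/25
print(near(text, "peach pie", "apple pie", 25, ordered=True))  # ordered, like pre/25
```

The unordered call matches regardless of which phrase comes first, while the ordered call only matches when the first phrase precedes the second.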
 |
Phonic
searching finds words that sound alike, like Smythe
in a search for Smith. |
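As a rough illustration of sound-alike matching, the sketch below uses the classic Soundex code. This page does not say which phonic algorithm dtSearch uses, so Soundex is only a stand-in, and the function name is hypothetical.

```python
# Soundex digit for each consonant; vowels, h, w and y have no digit.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Return the four-character Soundex code of a word."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not break a run of equal codes
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code is kept
    return (result + "000")[:4]


# Words with the same code are treated as sounding alike.
print(soundex("Smith") == soundex("Smythe"))
```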
 |
Stemming finds variations on word endings, like applies,
applied, applying in a search for apply. |
 |
Numeric range searching finds any number that falls between two
given numbers, such as between 6 and 36. |
 |
Macro capabilities make it easy to include frequently used
items in a search request. |
 |
Wildcard support in any position allows:
? to hold a single character place;
* to hold multiple character places;
= to indicate a numeric character;
e.g. apple* and not apple? sauce
e.g. inv30=====, 5522 ==== ==== ==== |
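One way to picture the three wildcard characters is as a translation into a regular expression. The sketch below is an illustration only, not how dtSearch evaluates wildcards; the function names are hypothetical, and it assumes that * may also stand for zero characters.

```python
import re


def wildcard_to_regex(pattern):
    """Translate the ?, * and = wildcards into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")        # exactly one character
        elif ch == "*":
            parts.append(".*")       # any number of characters (assumed to include none)
        elif ch == "=":
            parts.append(r"\d")      # exactly one numeric character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)


def matches(pattern, word):
    """Return True if the whole word matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(word) is not None


print(matches("apple*", "applesauce"))
print(matches("apple?", "applesauce"))
print(matches("inv30=====", "inv3012345"))
```

Here apple* accepts any word starting with apple, apple? accepts only six-letter words starting with apple, and inv30===== requires exactly five digits after inv30.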
 |
See Fields for fielded search options. |
Fuzzy Searches
|
 |
Fuzzy
searches use a proprietary algorithm to find search terms
even if they are misspelled. Search fuzziness adjusts from
0 to 10 so that you can fine-tune your search to the level
of OCR or typographical errors in your files. |
 |
A
search for alphabet with a fuzziness of 1 would find
alphaqet; with a fuzziness of 3, it would find both
alphaqet and alpkaqet. |
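The dtSearch fuzzy algorithm is proprietary, so the sketch below substitutes a different, well-known technique: Levenshtein edit distance. It only shows why alphaqet is a closer match to alphabet than alpkaqet is. The max_edits parameter is hypothetical and does not correspond to the 0 to 10 fuzziness scale.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum single-character edits turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def fuzzy_matches(term, vocabulary, max_edits):
    """Return the words in vocabulary within max_edits edits of term."""
    return [w for w in vocabulary if edit_distance(term, w) <= max_edits]


words = ["alphabet", "alphaqet", "alpkaqet", "alphorn"]
print(fuzzy_matches("alphabet", words, 1))
print(fuzzy_matches("alphabet", words, 2))
```

Raising the tolerance admits words with more OCR or typing errors, at the cost of more false matches.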
 |
Fuzziness is not "hardwired" into the index, so you can vary
fuzziness at the time of each search. |
Synonym / Thesaurus Searches (Concept searching)
 |
WordNet thesaurus searching lets you search for fast and also find
similar words such as quick and speedy. |
 |
Variable
levels of automatic synonym expansion |
 |
A user-defined thesaurus allows you to build up a specialised thesaurus
for specific terms; e.g. trade names v generic chemical names, business
jargon v formal terminology, etc. |
Natural Language
 |
dtSearch
uses a vector space model to compare a search request to documents
with matching search terms |
 |
Natural Language allows for unstructured search requests such as:
Find me Sam's Memo on the 1999 takeover of MegaHuge Corporation |
 |
dtSearch
then does intelligent relevancy ranking using automatic term
weighting based on the frequency and density of hits in your
files. In the above example, if 1999 appeared in 3,000
files, and Sam appeared in only two files, then Sam
would get a much higher relevancy rating. |
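The effect described above resembles inverse document frequency, where rarer terms receive larger weights. The sketch below uses the common logarithmic form. dtSearch's actual weighting formula is not given here, and the total of 10,000 files is an assumed figure chosen only to make the example concrete.

```python
import math


def inverse_document_frequency(total_files, files_with_term):
    """Weight a term by how rare it is: fewer matching files, higher weight."""
    return math.log(total_files / files_with_term)


TOTAL_FILES = 10_000  # hypothetical collection size, not stated in the text

weight_1999 = inverse_document_frequency(TOTAL_FILES, 3000)  # common term
weight_sam = inverse_document_frequency(TOTAL_FILES, 2)      # rare term

print(f"1999: {weight_1999:.2f}")
print(f"Sam:  {weight_sam:.2f}")
```

Under these assumptions Sam receives a far larger weight than 1999, so files mentioning Sam rise to the top of the results.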
 |
A
positional scoring option works with dtSearch's natural language
relevancy ranking to rank documents more highly when hits
are near the top of a file, or otherwise clustered in a file. |
 |
In
this way, a natural language search takes you directly from
a "plain English" search request to the most relevant
documents. |
Unicode Support
 |
Unicode
support allows for indexing and searching of non-English text,
including every character set supported by the Unicode standard. |
 |
In
addition to full Unicode support, dtSearch offers extensive
additional alphabet customisation options. |
Combining Search Types
 |
Nearly
all search types are combinable* |
 |
You
can make your search request as complex as you want, up to
32,000 characters. |
| |
* Natural language searching (All words) is essentially an unstructured
search request, and is not combinable with structured search
requests, such as those using Boolean (and, or, not) or proximity
(w/n) operators. |
Variable Term Weighting
 |
Variable term weighting works in connection with Boolean searching
to provide extra positive or negative emphasis on words. |
 |
Variable
term weighting also works in connection with natural language
searching to provide extra emphasis on certain words beyond
the standard relevancy ranking. |
 |
Positive
term weighting can place extra emphasis on one or more words
in a search: soup:8 or recipe:3 |
 |
Negative
term weighting can assign negative emphasis to one or more
words in a search: red or green or yellow:-7 |
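To show how such weights might influence ranking, here is a minimal sketch that parses a weighted "or" request and scores text by weighted hit counts. The scoring rule, the default weight of 1 for unweighted terms, and the function names are assumptions for illustration; dtSearch's own scoring is not described here.

```python
def parse_weighted_or(request):
    """Split 'a:8 or b:3 or c' into a {term: weight} mapping."""
    weights = {}
    for part in request.lower().split(" or "):
        term, _, weight = part.strip().partition(":")
        weights[term] = int(weight) if weight else 1  # assumed default weight
    return weights


def score(text, weights):
    """Sum each term's weight once per occurrence in the text."""
    words = text.lower().split()
    return sum(weight * words.count(term) for term, weight in weights.items())


weights = parse_weighted_or("red or green or yellow:-7")
print(weights)
print(score("green and red apples", weights))
print(score("red and yellow tulips", weights))
```

With the negative weight on yellow, a file mentioning yellow scores lower than one that mentions only red and green.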
Other
 |
dtSearch
Desktop and Network provide for sorting search results (as
well as re-sorting with a click following a search) by name,
date, fields, number of hits or "relevancy". |
 |
dtSearch
Desktop and Network also include a wide range of tools for
easy search formation, including a scrolling word list and
a Browse word feature. See dtSearch
Desktop for more information. dtSearch Network supports
options packages for conveniently sharing certain search capabilities,
such as macros, user thesaurus synonym rings, alphabet customisation,
and file segmentation and rules. See dtSearch
Network for more information. |
 |
Filtering
options let you limit the files to retrieve by name, date
or size, in both indexed and unindexed searches. |
Capacity
 |
dtSearch
can create as many indexes as you need |
 |
Each index can hold over 1 terabyte of data (up to 2048 million
documents per index). |
 |
dtSearch
can search as many indexes as you like with a single search
request. |
 |
Unindexed
search capacity is unlimited. |
Speed
 |
Indexed search time is typically less than a second, even across
multiple gigabytes of text. |
 |
Indexed searching is optimised for multiple concurrent searches on
a network or Web site. |