Notes on Thesaurus Construction

Working from "general to specific" (i.e. from a broader term to narrower terms) is usually a good way to create and name Synonym groups in the dtSearch User Thesaurus. Use the general term as the Synonym Group Name. The plural form is preferable, since people who are searching tend to think in terms of "where will I find information on clocks".

The relationship between the broader term and the narrower terms should be true independent of context. For example, you could list mice under a group name of rodents, but listing them under pests would not be correct, because some mice (pet mice, laboratory mice) are not considered pests.

Example:

Group Name      Entries in the thesaurus
jackets         jackets
                anoraks
                blazers
                boleros
                dinner jacket
                donkey jacket
                flying jacket
                harrington jacket
                kagouls
                sports jacket
                tweed jacket

Although User Thesaurus Plus will de-duplicate terms, when preparing a long list it is better to enter the terms in alphabetical order, which reduces the likelihood of repeating items. It is important that the general term is included in the synonym list; User Thesaurus Plus will automatically copy the Group Name you enter as the first item on the synonym list.
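If you prepare a long list outside the tool, the tidy-up described above can be sketched in a few lines of Python. This is only an illustration of the idea: build_synonym_group is a hypothetical helper, not part of dtSearch or User Thesaurus Plus, and it assumes nothing about how either product stores its groups.

```python
def build_synonym_group(group_name, terms):
    """Return the entries for one synonym group.

    The group name (the general term) comes first, followed by the
    remaining terms, de-duplicated case-insensitively and sorted
    alphabetically. Hypothetical helper for illustration only.
    """
    name = group_name.strip()
    seen = {name.lower()}          # the general term is always included once
    rest = []
    for term in terms:
        cleaned = term.strip()
        key = cleaned.lower()
        if key and key not in seen:
            seen.add(key)
            rest.append(cleaned)
    return [name] + sorted(rest, key=str.lower)


entries = build_synonym_group(
    "jackets",
    ["blazers", "anoraks", "Blazers", "jackets", "kagouls", "boleros"],
)
print(entries)
# ['jackets', 'anoraks', 'blazers', 'boleros', 'kagouls']
```

The repeated "Blazers" and the repeated group name are dropped, and the general term stays at the head of the list.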

 

 

Because the thesaurus and stemming both conflate search terms to enhance recall, a side effect is that precision falls, so you may need to take extra steps to maintain precision. For example, if your index contains documents on a wide range of topics, you may find that your search results include many non-relevant documents. For this reason it is preferable, where possible, to offer your users the option of searching in particular 'zones': the index for each zone could contain just the documents from a particular department, or even just certain types of document from a department (e.g. research or project reports, marketing collateral).
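The trade-off between recall and precision, and the effect of zones, can be seen on a toy example. The Python below is a rough sketch under stated assumptions: the five documents, the zone labels and the relevance judgements are invented for illustration, matching is by exact word only, and a simple zone filter stands in for keeping a separate index per zone. It does not use or imitate the dtSearch API.

```python
def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved, recall = hits / relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical corpus: document id -> (zone, text)
DOCS = {
    1: ("clothing", "tweed jackets for autumn"),
    2: ("clothing", "anoraks and kagouls for hiking"),
    3: ("clothing", "navy blazers on sale"),
    4: ("sport", "blazers win the basketball final"),
    5: ("catering", "baked potato with cheese"),
}
RELEVANT = {1, 2, 3}  # assumed: the user wants documents about clothing


def search(terms, zone=None):
    """Return ids of documents containing any term, optionally in one zone."""
    found = set()
    for doc_id, (doc_zone, text) in DOCS.items():
        if zone is not None and doc_zone != zone:
            continue
        if set(text.split()) & set(terms):
            found.add(doc_id)
    return found


plain = search({"jackets"})
expanded = search({"jackets", "anoraks", "blazers", "kagouls"})
zoned = search({"jackets", "anoraks", "blazers", "kagouls"}, zone="clothing")

for label, result in [("plain", plain), ("expanded", expanded), ("zoned", zoned)]:
    print(label, sorted(result), precision_recall(result, RELEVANT))
```

In this toy corpus, expanding the query with synonyms raises recall from one third to 1.0 but lowers precision from 1.0 to 0.75, because the sports document also mentions blazers. Restricting the expanded search to the clothing zone restores precision to 1.0 without losing recall.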

 

When you are searching you can also narrow the domain of your search by using a proximity search, for example:

 

 jackets not w/5 potato


will find documents containing the word jackets (or any of the synonyms you have set up in the user thesaurus), but not where the word potato occurs within 5 words of it. Be careful not to simply use - jackets not potato - because this would miss any document about jackets that also happened to contain the word potato anywhere else in the same document.
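The difference between the two requests can be illustrated with a small Python model. This is a sketch only, not how dtSearch evaluates a request: not_within and plain_not are hypothetical functions, the WANTED set stands in for whatever the thesaurus and stemming would expand jackets to, and the model assumes that a document matches the proximity form when at least one wanted word has no unwanted word within the given distance.

```python
import re


def tokens(text):
    """Lower-case the text and split it into words."""
    return re.findall(r"[a-z]+", text.lower())


def not_within(text, wanted, unwanted, distance):
    """True if some wanted word has no unwanted word within `distance` words."""
    words = tokens(text)
    unwanted_positions = [i for i, w in enumerate(words) if w in unwanted]
    return any(
        w in wanted and all(abs(i - j) > distance for j in unwanted_positions)
        for i, w in enumerate(words)
    )


def plain_not(text, wanted, unwanted):
    """True if the document has a wanted word and no unwanted word at all."""
    words = set(tokens(text))
    return bool(words & wanted) and not (words & unwanted)


WANTED = {"jackets", "jacket", "blazers"}   # assumed expansion of "jackets"
UNWANTED = {"potato"}

doc_a = "Baked potato jackets filled with cheese"
doc_b = ("The staff canteen now serves baked potato at lunch. In other news, "
         "the uniform policy has changed and all front of house staff must "
         "wear blazers.")

print(not_within(doc_a, WANTED, UNWANTED, 5), plain_not(doc_a, WANTED, UNWANTED))
# False False
print(not_within(doc_b, WANTED, UNWANTED, 5), plain_not(doc_b, WANTED, UNWANTED))
# True False
```

Both forms reject the first document, where potato sits next to jackets. Only the proximity form keeps the second document, where potato appears in an unrelated sentence well away from blazers.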