dtSearch UK


Welcome to

Language Extension Packs

 

 


For use with dtSearch version 6.5 or later.

dtSearch Engine/Web is supplied with stemming rules and a noise-word file for English (US). Stemming is the only search expansion option that is on by default in the dtSearch end-user products, because it is almost always useful and adds little to the time required to make a search. Unlike some other search engines, dtSearch applies stemming at search time, so there is no need to build indexes specifically for stemming and no need to build a separate index for each language in use.
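To picture what stemming at search time means, here is a minimal sketch in Python. It is not dtSearch code: the suffix list, the function names and the sample documents are all invented for illustration. The index stores words exactly as they appear, and the stemming rules are applied only when a query is run, so the same index serves searches with stemming on or off.

```python
# Toy suffix rules, invented for illustration (not dtSearch's rule files).
ENGLISH_SUFFIXES = ["ers", "ing", "ed", "er", "s"]

def stem(word, suffixes):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in suffixes:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def build_index(documents):
    """Map each word, unstemmed, to the ids of the documents containing it."""
    index = {}
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(doc_id)
    return index

def search(index, term, suffixes=None):
    """Return matching document ids; stem at search time if rules are given."""
    term = term.lower()
    if suffixes is None:
        return set(index.get(term, set()))
    target = stem(term, suffixes)
    hits = set()
    for word, doc_ids in index.items():
        if stem(word, suffixes) == target:
            hits |= doc_ids
    return hits

documents = {
    1: "the printers are offline",
    2: "printing resumed today",
    3: "we printed the report",
    4: "a large footprint",
}
index = build_index(documents)                      # built once, no stemming applied
exact = search(index, "print")                      # stemming off
stemmed = search(index, "print", ENGLISH_SUFFIXES)  # stemming on, same index
```

Because the rules are only an argument to the search, a different rule set could be passed in without touching the index.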

The problem

With the stemming option selected, dtSearch will find plurals and many other variations; for example, a search on print will automatically find printers, printing, and printed.
However, if you are searching documents written in other languages, the English stemming rules will miss many word variations that do not occur in English (e.g. verb and noun changes with gender), and may match unrelated words in error.
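The effect of using the wrong rules can be illustrated with a small sketch. Both suffix lists below are hypothetical and far simpler than real stemming rules; they are assumptions made for this example only. The French adjective petit changes with gender and number, and the toy English rules only recognise the plural ending.

```python
# Both suffix lists are invented for illustration; real rule files are far richer.
ENGLISH_SUFFIXES = ["ing", "ed", "s"]
FRENCH_SUFFIXES = ["es", "e", "s"]

def stem(word, suffixes):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in suffixes:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def matches(query, words, suffixes):
    """Return the words whose stem equals the stem of the query."""
    target = stem(query, suffixes)
    return [w for w in words if stem(w, suffixes) == target]

french_words = ["petit", "petite", "petits", "petites"]
with_english = matches("petit", french_words, ENGLISH_SUFFIXES)
with_french = matches("petit", french_words, FRENCH_SUFFIXES)

print(with_english)  # ['petit', 'petits']
print(with_french)   # ['petit', 'petite', 'petits', 'petites']
```

With the toy English rules the feminine forms are missed; with the toy French rules all four forms are found.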

Furthermore, the English noise word list, which is designed to keep the index small by excluding unwanted English words, is not suitable for other languages; your indexes may contain many words that are not useful in searches and that add to the size of your indexes.
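A minimal sketch of the noise-word problem follows. The two word lists and the helper function are invented for this example and are not the contents of any dtSearch file. The idea is simply that noise words are dropped before a word reaches the index, so a list written for English removes nothing from French text.

```python
# Tiny noise-word lists, invented for illustration only.
ENGLISH_NOISE = {"the", "a", "of", "and", "in", "is", "on"}
FRENCH_NOISE = {"le", "la", "les", "de", "du", "et", "est", "sur", "un", "une"}

def indexed_words(text, noise_words):
    """Return the distinct words that would be indexed after noise-word removal."""
    return sorted({w for w in text.lower().split() if w not in noise_words})

french_text = "le chat est sur la table et le chien est sous la table"

with_english_list = indexed_words(french_text, ENGLISH_NOISE)
with_french_list = indexed_words(french_text, FRENCH_NOISE)

print(len(with_english_list))  # 9
print(with_french_list)        # ['chat', 'chien', 'sous', 'table']
```

In this example the English list leaves all nine distinct words in the index, while the French list reduces them to four.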

The solution
Use language-specific files in place of the default US English files. These are supplied as Language Extension Packs, which contain files for many languages (see the list below). All files are in Unicode format.

Language Extension Packs

Order Code LEP400 LEP402 LEP403
Western European
Danish  
Dutch  
English  
Finnish  
French*  
German*  
Italian  
Norwegian  
Portuguese  
Spanish  
Swedish  
Eastern European
Belarusian  
Bosnian  
Bulgarian  
Croatian  
Czech  
Estonian  
Greek  
Hungarian  
Latvian  
Lithuanian  
Polish  
Romanian  
Russian  
Serbian  
Slovak  
Slovenian  
Turkish  
Ukrainian  

* LEP400 and LEP402 also include unique bilingual French/English and German/English stemming and noise word files, which enable search expansion on indexes and documents containing a mix of French/German and English text.

License: Licensed for use on a single server or workstation for use with dtSearch Engine or dtSearch Web, OR up to 5 workstations for use with dtSearch Desktop or Network. Please ask for other licensing options.

Language Packs include:

  • Stemming rule files and noise word files for each supported language.
  • Test files to check the operation of stemming in all the supplied languages.
  • Stemming Language Selector application, which changes the stemming rules from the Windows Start menu*.
  • Multilingual installer (English, French, Spanish, German, Dutch).
  • One year of online technical support and updates.

*User must have administrator permissions.
Requirements:
  • dtSearch 6.5 or later (the license covers use with dtSearch Engine or Web on a single server, or dtSearch Desktop/Network for up to 5 users; other licensing is available).
  • Windows NT4, 2000, or XP.
  • Supplied on CD-ROM.

Evaluation
A 30-day evaluation version is available; it allows English and any one other language to be tried in comparison tests. Please complete the Enquiry Form.
Please enquire for languages not listed.

 
Copyright ©1995 -
ElectronArt Design Ltd