Diacritic-sensitive search

Diacritic-sensitive searching is turned off by default in SharePoint. This means that diacritical marks are ignored at crawl time and at search time, so a search for cafe will find documents containing either cafe or café.
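The effect of ignoring diacritics can be illustrated with a minimal Python sketch. This is not SharePoint's implementation: the function names fold_diacritics and matches are hypothetical, and the folding approach (Unicode NFD decomposition followed by dropping combining marks) is an assumption chosen purely for illustration.

```python
import unicodedata


def fold_diacritics(text: str) -> str:
    """Return text with combining diacritical marks removed (illustrative only)."""
    # NFD splits a precomposed character such as "é" into "e" + a combining accent.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop every combining mark, keeping only the base characters.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def matches(query: str, document: str, diacritic_sensitive: bool = False) -> bool:
    """Whole-word match of query against a whitespace-separated document."""
    if not diacritic_sensitive:
        # Fold both sides, mirroring the idea of ignoring marks
        # at crawl time (document) and at search time (query).
        query = fold_diacritics(query)
        document = fold_diacritics(document)
    return query in document.split()
```

With the default setting in this sketch, matches("cafe", "un café noir") is true, whereas passing diacritic_sensitive=True makes the same call false.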


For English documents, diacritic-sensitive searching is not necessary, and even for languages such as French and Spanish it is generally better not to make searching diacritic-sensitive. In French, for example, it is common to omit diacritics in upper-case text, so anyone searching for café would miss any reference to CAFE. The same happens in other languages, where diacritical marks may be missing for various reasons, particularly if the text being searched was written in an email or a web form by a user without easy access to a full keyboard.
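The upper-case problem described above can be seen in a small Python sketch. It assumes a case-insensitive comparison and uses hypothetical helper names (strip_marks, find_terms); it only illustrates the trade-off and does not reflect how SharePoint matches terms internally.

```python
import unicodedata


def strip_marks(text: str) -> str:
    """Remove combining diacritical marks from text (illustrative only)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def find_terms(query, words, diacritic_sensitive):
    """Return the words that match query, ignoring case."""
    def key(s):
        # Case is always ignored; NFC keeps accented forms comparable.
        s = unicodedata.normalize("NFC", s.casefold())
        return s if diacritic_sensitive else strip_marks(s)

    wanted = key(query)
    return [w for w in words if key(w) == wanted]


words = ["café", "CAFE", "Cafe", "CAFÉ"]

# Diacritic-sensitive: the unaccented upper-case spelling CAFE is missed.
sensitive_hits = find_terms("café", words, diacritic_sensitive=True)

# Diacritic-insensitive: every spelling is found.
insensitive_hits = find_terms("café", words, diacritic_sensitive=False)

print(sensitive_hits)
print(insensitive_hits)
```

In this sketch the sensitive search returns only the accented spellings, while the insensitive search returns all four.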


To turn on diacritic-sensitive searching, run the STSADM command-line tool from the following path:


SharePoint 2007:

cd C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\12\BIN


SharePoint 2010:
cd C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\14\BIN


Stsadm -o osearchdiacriticsensitive -ssp *GUID* -setstatus TRUE



If you need to find the GUID, see the page "Search Application Name for an SSP".


Kenza allows you to mark thesaurus files as diacritic-sensitive in the New file and Merge dialogs.