Language Matters: Cyrillic search

Let's suppose you are interested in Ukrainian[1] politics and, for example, want to find out more about the visit of President Victor Yuschenko to Poland. You would go to your favourite Web search engine and enter the words:

Official visit Victor Yuschenko Poland

When this query was tried on Google[2] with preferences set to 'any language', it gave 5060 results, all in English. Ideally, of course, you would want a wider perspective than the Anglo-Saxon viewpoint alone. The same search query expressed in Ukrainian is:
Офіційний візит Віктора Ющенка Польщі

This query gave 471 results (8 pages of 30 results actually viewable).

Perhaps, if you were not able to get access to a Ukrainian keyboard, you might have tried a Russian keyboard and entered:

Офіційний візит Віктора Ющенка Польщi

Looks the same, doesn't it? But this query gave just 7 results, none of them the same as in the previous search! Why is that, you may wonder?

The explanation is that the webmasters of those sites probably also used a Russian keyboard. Unfortunately, a Russian keyboard does not have a key for the Ukrainian letter 'і', so the webmaster substituted an 'English i' instead. The two letters look identical on screen but are different characters: the Ukrainian 'і' has the Unicode encoding U+0456, whereas the English 'i' is encoded as U+0069. At the time of writing, Google and many other search engines do not take this into account, and so you or your organisation could miss vital information.

Consider this:

2. All the Internet search results were obtained on www.google.com on 12 May 2005; results at other times and on other web search engines may differ.

© Copyright 2005 dtSearch UK. This article may be copied in its entirety but must include this copyright notice.
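
To make the difference between the two letters concrete, here is a minimal Python sketch. It shows that the two visually identical spellings compare as different strings, and it illustrates one possible way a search tool might fold the 'English i' into the Ukrainian 'і' before matching. The names LOOKALIKES, fold_lookalikes and normalise_query are hypothetical, and the folding rule is an assumption made for illustration, not a description of how any particular search engine works.

```python
import unicodedata

# Hypothetical fold table (an assumption for illustration):
# Latin i/I mapped to the Ukrainian letters U+0456 / U+0406.
LOOKALIKES = {"\u0069": "\u0456", "\u0049": "\u0406"}


def is_cyrillic(ch):
    """True if the character's Unicode name marks it as Cyrillic."""
    return "CYRILLIC" in unicodedata.name(ch, "")


def fold_lookalikes(word):
    """Replace Latin i/I with Cyrillic і/І, but only inside words
    that already contain at least one Cyrillic letter."""
    if not any(is_cyrillic(ch) for ch in word):
        return word  # leave purely non-Cyrillic words untouched
    return "".join(LOOKALIKES.get(ch, ch) for ch in word)


def normalise_query(query):
    """Apply the fold to every whitespace-separated word of a query."""
    return " ".join(fold_lookalikes(w) for w in query.split())


ukrainian = "Польщ\u0456"  # ends in the Ukrainian letter, U+0456
mixed = "Польщ\u0069"      # ends in the 'English i', U+0069

print(ukrainian == mixed)                   # False: different code points
print(unicodedata.name(ukrainian[-1]))      # name of U+0456
print(unicodedata.name(mixed[-1]))          # name of U+0069
print(normalise_query(mixed) == ukrainian)  # True after folding
```

Restricting the fold to words that already contain Cyrillic letters is a deliberate choice in this sketch: it leaves an English query such as "Official visit" unchanged, while still matching the mixed-keyboard spelling against the correctly typed one.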