April 2014 marks the anniversary of the Offshore Leaks financial scandal that unmasked details of 130,000 offshore accounts. Although normal businesses may use the offshore legislation to ease formalities in international trade, some observers called it the biggest international tax fraud discovery to date.
The report originated from the non-profit International Consortium of Investigative Journalists (ICIJ), which collaborated with journalists around the world to produce investigative reports based on a cache of 2.5 million records. The total size of the files (about 260 GB) was more than 100 times larger than the data leaked in the WikiLeaks scandal of 2010.
Key tools used in the investigation were dtSearch and NUIX. One of the needs was to search for lists of names contained in the millions of emails and other files. Many people are not aware that dtSearch Desktop allows you to simply search using a list of names from a text file, rather than having to type in names one at a time or having to write a long Boolean search term like (thisName) OR (thatName) OR..., etc.
If you need to search for a list of names, simply enter them in a column in a Windows Notepad file or an Excel file. Then, from dtSearch Desktop, select "Search for List of Words" (Ctrl+Shift+W) from the Search menu, choose the "one word or phrase per line" option, and browse to your name list file. You can view the results directly or export them to an Excel file or plain text file for later investigation.
To make sure you don't miss names because of misspellings, or because the names may have been transcribed into Cyrillic or Arabic alphabets, you can select the User Thesaurus option and, with the User Thesaurus Plus add-on product, create a file* containing all alternative spellings, aliases, maiden names, diminutives, or nicknames, so that a search for any one of the names will find all variants. See "Searching for Names" under "Working with Thesaurus, Macros and Alphabet files" here: /support/WebHelp/UTP500/UTP500.htm
Another technique when searching for names is to use the w/2 proximity connector. To ensure that a search for Robert Smith will also find Smith, Robert or Robert Edward Smith, simply search using Robert w/2 Smith. You can add names like this to a list file and search it using the "One Boolean expression per line" option.
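If your names start out as a plain list, a short script can turn them into proximity expressions ready for the "One Boolean expression per line" option. The following is only a minimal sketch in Python: the function names, the choice of pairing the first and last word of each name, and the UTF-8 output encoding are assumptions made for illustration, not part of dtSearch.

```python
from pathlib import Path


def proximity_expression(full_name: str, distance: int = 2) -> str:
    """Build a 'First w/N Last' expression from a name (hypothetical helper).

    Assumption: only the first and last words of the name are paired,
    so middle names are dropped from the expression.
    """
    parts = full_name.replace(",", " ").split()
    if len(parts) < 2:
        # Single-word (or empty) entries are passed through unchanged.
        return " ".join(parts)
    return f"{parts[0]} w/{distance} {parts[-1]}"


def write_expression_list(names, path, distance: int = 2):
    """Write one expression per line to a text file and return the lines.

    Assumption: UTF-8 is an acceptable encoding for the list file.
    """
    lines = [proximity_expression(name, distance) for name in names]
    lines = [line for line in lines if line]  # skip blank entries
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return lines


if __name__ == "__main__":
    for name in ["Robert Smith", "Robert Edward Smith", "Madonna"]:
        print(proximity_expression(name))
```

Calling write_expression_list(names, "name_list.txt") would then produce a file you can browse to from the "Search for List of Words" dialog.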
Finally, if you are unsure of the spelling of a name, don't forget that you can use the Phonic option to expand the search to similar-sounding names. This option is based on the well-known Soundex algorithm for finding spelling variants of names.
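To give a feel for how Soundex groups similar-sounding names, here is a minimal sketch of the classic American Soundex coding in Python. It illustrates the general algorithm only; the exact rules used by the Phonic option may differ, and the function name is our own.

```python
def soundex(name: str) -> str:
    """Return the four-character classic Soundex code for a name.

    Illustrative sketch of the textbook algorithm, not dtSearch's code.
    """
    # Consonant groups that share a digit; vowels, H, W and Y have no code.
    groups = (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
              ("L", "4"), ("MN", "5"), ("R", "6"))
    codes = {ch: digit for letters, digit in groups for ch in letters}

    letters = [ch for ch in name.upper() if ch.isalpha()]
    if not letters:
        return ""

    result = letters[0]                  # the first letter is kept as-is
    previous = codes.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue                     # H and W do not break a run of equal codes
        code = codes.get(ch, "")
        if code and code != previous:
            result += code               # add the digit unless it repeats the last one
        previous = code                  # a vowel resets the run
    return (result + "000")[:4]          # pad or truncate to four characters


if __name__ == "__main__":
    for name in ["Robert", "Rupert", "Smith", "Smyth"]:
        print(name, soundex(name))
```

Names that share a code, such as Smith and Smyth, are treated as sounding alike, which is why a phonic search can catch spelling variants that an exact search would miss.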
* Free sample files with name variants are included with User Thesaurus Plus.
For more information on the offshore leaks investigation: