"dtSearch's
powerful search engine, rich feature sets, and
professional technical support staff allowed IE
Discovery to develop InfoDox.com in a fraction
of the time normally required for a project of
this scope."
Text Retrieval Engine is Crucial to a Web-based Document Repository
The
evolving technologies of the Internet, along with the
demands of the litigation process, were the driving
factors behind the development of InfoDox.com,
IE Discovery's Web-based document repository. As IE
Discovery developed, offering off-the-shelf litigation
and document management applications and designing custom
databases, the concept of a comprehensive Web-based
document repository began to evolve. To meet the varied
demands of its legal clients, an efficient interface
to the Internet was crucial.
To ensure that InfoDox.com
met its design criteria, IE Discovery needed a search
capability that was fast, secure and flexible enough
to handle warehouse-sized volumes of documents yet sufficiently
cost-effective to support smaller document projects.
It was also important that the system have powerful
searching capabilities, be continuously up to date,
and be accessible over the Internet without burdening the user
with a tremendous amount of hardware and software maintenance.
The
development team realised that the ability to quickly
sift through very large collections of textual and fielded
document data and deliver document images over the Internet
was fundamental. IE Discovery's primary sources of documents
are litigation documents, or documents related to legal
issues. These "cases" tend to evolve over
time and a solution would need to be flexible enough
to adapt to changing circumstances. For example, if
a case concerns injuries that people received due to
a company's alleged negligence with its chemical waste
products, it may be months or even years until all medical
records are obtained from all parties involved. The
document collection could more than double in size.
In contrast, a client may want to have all their case
documents loaded into a database in order to weed through
and remove any documents that are not important to the
case. In this example, the database could shrink
drastically from its original size.
Because
these document collections can be massive, InfoDox.com
also had to provide the ability to jump directly to
the specified text and image, both displaying the
document's page and taking the viewer directly to the
line of text searched for in the retrieved files. Document
collections are gathered from numerous sources, and the
documents arrive in different formats: some are hard copy,
others may be taped video depositions. The system therefore
needed to receive files from a wide variety
of sources. Handling multiple file types was essential,
as was the need to convert these files to formats that
could be displayed and searched from the Web. Through
months of research, the IE Discovery development team
was able to meet these product requirements as well
as provide efficient image delivery over the Web.
Each
InfoDox.com database consists of a document table and a related
"names mentioned" table. For each new database
loaded into InfoDox.com, the customer is asked to answer
the following questions concerning their data needs:
1) What properties (fields) of the documents do you want
to capture? Examples: Document Date, Document
Type, Bates Number, etc.
2) For which of the above fields should pick-lists be made
available for searching and entering data, to ensure
consistency in the data?
3) What values should be available for each of the pick-lists
described in question 2?
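The answers to these three questions map naturally onto a small relational schema. The following is only a minimal sketch, assuming an SQLite database; every table, column and function name in it (documents, names_mentioned, doc_type_picklist, build_database and so on) is hypothetical and chosen for illustration, not taken from InfoDox.com itself.

```python
import sqlite3

# Hypothetical schema: a document table, a related "names mentioned"
# table, and one pick-list table that constrains the Document Type field.
SCHEMA = """
CREATE TABLE doc_type_picklist (
    value TEXT PRIMARY KEY
);
CREATE TABLE documents (
    doc_id       INTEGER PRIMARY KEY,
    bates_number TEXT NOT NULL UNIQUE,
    doc_date     TEXT,
    doc_type     TEXT REFERENCES doc_type_picklist(value)
);
CREATE TABLE names_mentioned (
    doc_id INTEGER NOT NULL REFERENCES documents(doc_id),
    name   TEXT NOT NULL
);
"""


def build_database(picklist_values):
    """Create an in-memory database and load the pick-list values."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO doc_type_picklist (value) VALUES (?)",
        [(value,) for value in picklist_values],
    )
    conn.commit()
    return conn


def add_document(conn, bates_number, doc_date, doc_type, names):
    """Insert one document and the names mentioned in it."""
    cur = conn.execute(
        "INSERT INTO documents (bates_number, doc_date, doc_type) "
        "VALUES (?, ?, ?)",
        (bates_number, doc_date, doc_type),
    )
    doc_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO names_mentioned (doc_id, name) VALUES (?, ?)",
        [(doc_id, name) for name in names],
    )
    conn.commit()
    return doc_id


def documents_mentioning(conn, name):
    """Return the Bates numbers of all documents that mention a name."""
    rows = conn.execute(
        "SELECT d.bates_number FROM documents d "
        "JOIN names_mentioned n ON n.doc_id = d.doc_id "
        "WHERE n.name = ? ORDER BY d.bates_number",
        (name,),
    )
    return [row[0] for row in rows]
```

In this sketch the pick-list is enforced by a foreign key, so a Document Type that is not on the list is rejected at data entry, which is one way to get the consistency that question 2 asks for.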
Once
the customer's requirements have been determined, a database
administrator at IE Discovery will design the appropriate
database structure and build the database as part of
the InfoDox.com service. IE Discovery takes data, or
documents in this case, and breaks them down into fielded
data (data entry either within InfoDox or loaded from
a data entry vendor), images (scanned hard copies of
the documents) and OCR (full text from the images).
All the converted data is then loaded into the InfoDox
application. All document processing, imaging, coding,
OCR, etc., is provided by IE Discovery if needed. At
this stage InfoDox.com utilises the dtSearch Full Text
Retrieval Engine and a relational database search engine,
and delivers the results to the user's Web browser as
standard HTML. The user receives images, produced with
a specialised high-compression scheme, together with
streaming text that is converted to HTML with
hit highlighting and fielded data.
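As a rough illustration of what converting text to HTML with hit highlighting involves, here is a minimal sketch in Python. It is not dtSearch's implementation or API; the function name highlight_hits and the choice of <b> tags as hit markers are assumptions made for this example.

```python
import html
import re


def highlight_hits(text, terms):
    """Return HTML-escaped text with each whole-word hit wrapped in <b> tags.

    `terms` is a list of plain search words; matching ignores case.
    """
    if not terms:
        return html.escape(text)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(term) for term in terms) + r")\b",
        re.IGNORECASE,
    )
    parts = []
    last = 0
    for match in pattern.finditer(text):
        # Escape the text between hits so OCR output containing
        # characters such as < or & cannot break the page.
        parts.append(html.escape(text[last:match.start()]))
        parts.append("<b>" + html.escape(match.group(0)) + "</b>")
        last = match.end()
    parts.append(html.escape(text[last:]))
    return "".join(parts)
```

Escaping each fragment separately, rather than escaping the whole string and then inserting tags, keeps the hit markers as the only live markup in the output.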
dtSearch's
Text Retrieval Engine provided all of InfoDox.com's
search needs including highlighted results in full text
searches. By combining the dtSearch engine with advanced
Internet imaging technology, IE Discovery achieved the
specified speed and performance requirements. dtSearch's powerful
search engine, rich feature sets, and professional technical
support staff allowed IE Discovery to develop InfoDox.com
in a fraction of the time normally required for a project
of this scope. The final product allows clients to search
mountains of textual data instantly using
regular expression, phrase, Boolean, stemming and thesaurus
searching. Users can also take advantage
of the engine's advanced fuzzy searching capabilities
to find documents even if the terms they seek
are misspelled due to typographical or OCR interpretation
errors.
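To show why fuzzy matching helps with OCR errors, the sketch below matches a query against indexed words by edit distance (the number of single-character insertions, deletions and substitutions needed to turn one word into the other). This is a generic technique used here for illustration; the source does not say how dtSearch's fuzzy search works internally, and the names edit_distance and fuzzy_find are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            curr.append(min(
                prev[j] + 1,         # delete char_a
                curr[j - 1] + 1,     # insert char_b
                prev[j - 1] + cost,  # substitute, or keep if equal
            ))
        prev = curr
    return prev[-1]


def fuzzy_find(query, words, max_distance=1):
    """Return the words within max_distance edits of the query."""
    query = query.lower()
    return [
        word for word in words
        if edit_distance(query, word.lower()) <= max_distance
    ]
```

With max_distance=1, a search for "negligence" would also find the OCR misreading "neg1igence", where the letter l was recognised as the digit 1. A linear scan like this is only practical for small word lists; a production engine would need an index-based approach.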
To
ensure that InfoDox.com documents remain secure, IE
Discovery incorporated a protocol called Secure Sockets
Layer (SSL) to identify authorised users, prohibit unauthorised
access, and encrypt data transmissions from the document
repository to the user's Web browser. This is the same
high-end technology used by large financial institutions
to transmit their extremely sensitive data and the Internet
standard for processing credit card transactions.
With
the introduction of InfoDox.com, IE Discovery, Inc.
pioneered the use of the Application Service Provider
(ASP) model in the field of document imaging and litigation
support. While InfoDox.com's success is due to several
technologies that IE Discovery combined to produce a
powerful yet cost-effective solution for its clients,
the performance of the dtSearch Text Retrieval Engine
was an important factor. "The product allows InfoDox.com
users to search millions of documents in seconds instead
of minutes or hours."
For
further information, please call IE Discovery at +1
(512) 833-5588, e-mail: sales@iediscovery.com, or visit
IE Discovery online at www.iediscovery.com