Kriterion
Search Engine and Information Analysis
What is Kriterion?
Kriterion is a fully functional implementation of a document retrieval method.
Work began on November 29, 1999, as part of a suite of tools intended to result in a SIGINT package.
The search engine saw first light in 2000, once all major bugs had been removed.
What are the system requirements?
To run Kriterion, the following requirements must be fulfilled:
What is part of the package?
How does it work exactly?
The search engine works on documents in plain ASCII text format.
All documents are parsed in a special manner and a score for each document is calculated.
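The text does not say how documents are parsed or how the score is computed. Purely as an illustration, the following sketch assumes that a "score" is a vector of relative term frequencies; the class name TermScores and the tokenization rule are hypothetical and not taken from Kriterion.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical illustration; Kriterion's real parsing and scoring are not documented here. */
final class TermScores {
    /** Splits ASCII text on non-alphanumeric characters and returns relative term frequencies. */
    static Map<String, Double> score(String text) {
        Map<String, Double> counts = new HashMap<>();
        int total = 0;
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (token.isEmpty()) {
                continue; // split can yield an empty leading token
            }
            counts.merge(token, 1.0, Double::sum);
            total++;
        }
        final int tokenCount = total;
        // Normalise raw counts so that long and short documents are comparable.
        counts.replaceAll((term, count) -> count / tokenCount);
        return counts;
    }
}
```

Calling TermScores.score on the text of one document yields a map from each term to its share of all tokens in that document. A real system would probably weight terms further, but that is beyond what the description states.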
After processing, the main program listens on a port, which must be passed as an argument when the service is started.
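A minimal sketch of a service that takes its port from the argument list might look as follows. The implementation language of the main program is not stated, so Java is an assumption here, as are the class name QueryServer and the one-line echo reply, which is only a placeholder.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for the main program; not Kriterion's actual code. */
final class QueryServer {
    /** Reads the port from the argument list and rejects values outside 1..65535. */
    static int parsePort(String[] args) {
        if (args.length != 1) {
            throw new IllegalArgumentException("usage: QueryServer <port>");
        }
        int port = Integer.parseInt(args[0]);
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("port out of range: " + port);
        }
        return port;
    }

    public static void main(String[] args) throws IOException {
        int port = parsePort(args);
        try (ServerSocket server = new ServerSocket(port)) {
            while (true) {
                // Handle one client at a time; each client sends a single query line.
                try (Socket client = server.accept();
                     BufferedReader in = new BufferedReader(new InputStreamReader(
                             client.getInputStream(), StandardCharsets.US_ASCII));
                     PrintWriter out = new PrintWriter(client.getOutputStream(), true)) {
                    String query = in.readLine();
                    out.println("received: " + query); // placeholder for the real result list
                }
            }
        }
    }
}
```

Started as "java QueryServer 4711", this sketch would accept connections on port 4711 until the process is stopped.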
A user then opens the web interface, and a Java servlet provided with the UI connects to the port of the main program.
A search query can be typed into an input field and is sent to the main program.
The client communicates with the server over a simple, proprietary protocol and receives the result synchronously.
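The protocol itself is proprietary and not described here. The sketch below only illustrates what a synchronous exchange can look like, under the assumption of a line-based format in which the client sends one query line and the server answers with one result per line, followed by an empty line. The class name QueryClient and that framing are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical client side of an assumed line-based protocol. */
final class QueryClient {
    /** Sends one query line, then blocks until the assumed blank terminator line arrives. */
    static List<String> exchange(String query, Writer toServer, Reader fromServer) {
        try {
            toServer.write(query + "\n");
            toServer.flush();
            BufferedReader in = new BufferedReader(fromServer);
            List<String> results = new ArrayList<>();
            String line;
            // Synchronous: the caller waits here until the whole result list is read.
            while ((line = in.readLine()) != null && !line.isEmpty()) {
                results.add(line);
            }
            return results;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In a servlet, the Writer and Reader would wrap the output and input streams of a java.net.Socket connected to the configured port.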
The documents whose scores best match the score of the query are then returned.
Documents in the result list can be displayed and passages containing words of the search query are highlighted.
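Highlighting of this kind can be sketched with a regular expression. The example assumes that the web interface renders HTML and marks hits with a b element; the class name Highlighter and the choice of tag are hypothetical.

```java
import java.util.regex.Pattern;

/** Hypothetical illustration of query-word highlighting. */
final class Highlighter {
    /** Wraps every whole-word, case-insensitive occurrence of a query word in <b>...</b>. */
    static String highlight(String passage, String query) {
        StringBuilder alternation = new StringBuilder();
        for (String word : query.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // an empty query splits into one empty string
            }
            if (alternation.length() > 0) {
                alternation.append('|');
            }
            alternation.append(Pattern.quote(word)); // treat query words literally
        }
        if (alternation.length() == 0) {
            return passage;
        }
        // One combined pattern, so inserted tags are never matched by a later word.
        Pattern pattern = Pattern.compile("\\b(?:" + alternation + ")\\b", Pattern.CASE_INSENSITIVE);
        return pattern.matcher(passage).replaceAll("<b>$0</b>");
    }
}
```

The sketch does not escape HTML special characters in the passage; a real web interface would need to do that before inserting tags.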
Additionally, after marking a document in the result list, a user can press the "find similar documents" button.
The documents with the highest similarity are then returned.
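How similarity is measured is not stated. One common choice is cosine similarity between term-weight vectors, and the following sketch uses it as a stand-in. The class name Similarity, the map-based index, and the measure itself are assumptions, not a description of Kriterion.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

/** Hypothetical "find similar documents" ranking based on cosine similarity. */
final class Similarity {
    /** Cosine of the angle between two sparse term-weight vectors; 0 if either is empty. */
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0;
        for (Map.Entry<String, Double> entry : a.entrySet()) {
            Double other = b.get(entry.getKey());
            if (other != null) {
                dot += entry.getValue() * other;
            }
        }
        double norms = norm(a) * norm(b);
        return norms == 0.0 ? 0.0 : dot / norms;
    }

    private static double norm(Map<String, Double> vector) {
        double sum = 0.0;
        for (double weight : vector.values()) {
            sum += weight * weight;
        }
        return Math.sqrt(sum);
    }

    /** Returns up to limit document names, most similar to the marked document first. */
    static List<String> mostSimilar(String marked, Map<String, Map<String, Double>> index, int limit) {
        Map<String, Double> reference = index.get(marked);
        if (reference == null) {
            throw new IllegalArgumentException("unknown document: " + marked);
        }
        List<String> names = new ArrayList<>(index.keySet());
        names.remove(marked); // a document is trivially similar to itself
        names.sort(Comparator.comparingDouble(
                (String name) -> cosine(reference, index.get(name))).reversed());
        return names.subList(0, Math.min(limit, names.size()));
    }
}
```

Because only stored vectors are compared, a lookup of this kind needs no access to the document text at query time.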
The results are always returned in real time because the server only has to compare the scores.
More than six years ago, calculating the scores took an enormous amount of time: on December 15, 1999, a 500 MHz Pentium III took 23 hours to index 1582 normal-sized documents.
Fortunately, this became faster over the years: today a Pentium IV needs 28 minutes to index 2127 documents.
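Put on a per-document basis, the two measurements work out roughly as follows. The comparison is only approximate, since the two document sets differ and "normal sized" is not quantified.

```latex
\[
t_{1999} = \frac{23 \times 3600\ \text{s}}{1582} \approx 52.3\ \text{s per document},
\qquad
t_{2006} = \frac{28 \times 60\ \text{s}}{2127} \approx 0.79\ \text{s per document},
\qquad
\frac{t_{1999}}{t_{2006}} \approx 66
\]
```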
Last updated: August 3, 2006