Administration of Indexes
The indexing server manages a set of indexes in which documents can be stored and against which queries can be run.
The index administration screen allows creating new indexes, editing the configuration of existing indexes, or deleting them. To create or edit an index, the following information must be specified:
Index name. Name of the index.
Index path. The path in the file system in which the metainformation and index data will be physically stored.
Analyzer type. Select the analyzer that matches the language expected for the documents. Stop words (words that are very common in that language) will be removed.
Some analyzers use “stemming”: they attempt to remove the most common morphological endings from the words in a document before it is indexed. The goal is to ensure that a search for a specific keyword also returns documents that contain other words with the same lexical root. For example, a search for the word “trade” also returns documents that contain “to trade”, “trades” or “trading”.
Stemming may or may not be suitable, depending on how the application will be used. It is also important to bear in mind that stemming is based on a series of general rules that have exceptions. This means that, in some rare cases, the system can identify the lexical root of a word incorrectly.
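The effect of stop word removal and stemming described above can be illustrated with a minimal sketch. This is a toy analyzer written for illustration only; it is not the implementation used by Denodo Aracne or Lucene. The stop word list, the suffix rules and the names `STOP_WORDS`, `SUFFIXES`, `naive_stem` and `analyze` are all assumptions made for this example.

```python
# Toy analyzer for illustration only; NOT the Denodo Aracne or Lucene code.
# The stop word list and suffix rules below are assumed, simplified values.
STOP_WORDS = {"a", "an", "and", "the", "to", "of", "in", "is"}

# Suffixes are tried in this order; the first match is stripped.
SUFFIXES = ("ing", "es", "s", "e")


def naive_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def analyze(text, stemming=True):
    """Lowercase, drop punctuation and stop words, optionally stem."""
    tokens = [t.strip(".,;:!?\"'()").lower() for t in text.split()]
    tokens = [t for t in tokens if t and t not in STOP_WORDS]
    if stemming:
        tokens = [naive_stem(t) for t in tokens]
    return tokens


# With stemming, the variants of "trade" reduce to the same root.
print(analyze("to trade"), analyze("trades"), analyze("Trading,"))

# Without stemming, only the stop word "to" is removed.
print(analyze("to trade", stemming=False))

# General rules have exceptions: "news" is wrongly reduced to "new".
print(naive_stem("news"))
```

Because the query keyword and the indexed words are reduced to the same root under these assumed rules, a search for “trade” would match all three variants. The last call shows the kind of exception mentioned above, where a general rule produces a wrong root.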
Denodo Aracne includes three analyzers of its own:
- standard. Uses the English stop word list, but does not apply stemming.
- english. Uses the English stop word list and applies stemming for English.
- spanish. Uses the Spanish stop word list and applies stemming for Spanish.
It also includes some built-in analyzers provided by Lucene for the following languages: Arabic, Armenian, Basque, Brazilian, Bulgarian, Catalan, Chinese, Japanese, Korean, Czech, Danish, Dutch, Finnish, French, Galician, German, Greek, Hindi, Hungarian, Indonesian, Irish, Italian, Latvian, Norwegian, Persian, Portuguese, Romanian, Russian, Swedish, Thai, and Turkish. In the Web Administration Tool, these analyzers are named <language> (lucene).
Schema. The index schema specifies which fields will be included in the index. The section Administration of Index Schemas describes how to administer index schemas and explains the configuration of the schema included by default with Denodo Aracne (standard).
The index administration screen also allows searching the content of an index (Search in Index link) and deleting the content of an index (Delete Index Content link).
The Denodo Aracne distribution includes two pre-created indexes:
- default: uses the standard analyzer and the standard schema.
- globalsearch: uses the standard analyzer and the globalsearch schema. This is intended to be used with the Denodo Global Search crawler.