Administration of Index Schemas

Denodo Aracne lets you configure which fields an index contains and how each field is indexed. Although this is rarely necessary, index schema configurations can be created, edited, or deleted through the Aracne administration tool.

To create a new Lucene index schema, specify the following parameters:

  • Schema name. The name used to reference the schema from the index creation and editing screens.
  • Unique key. The name of the schema field that serves as the primary key.
  • Default search field. The name of the schema field that is searched when a query does not specify a field explicitly. Aracne creates this field itself and fills it with the concatenation of the content of all the searchable fields in the schema.
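
As a rough illustration of how a default search field could be assembled, the following Python sketch concatenates the values of the searchable fields of a document. It is not Aracne code: the function name build_default_search_field, the space separator, and the sample fields "title" and "body" are assumptions made for the example.

```python
def build_default_search_field(document, searchable_fields, separator=" "):
    """Concatenate the values of the searchable fields, in the given order.

    The separator is an assumption; the actual one used by Aracne is not
    documented here.
    """
    values = []
    for name in searchable_fields:
        value = document.get(name)
        if value is not None:
            values.append(str(value))
    return separator.join(values)


# Hypothetical document: "title" and "body" are invented field names.
document = {"identifier": "doc-1", "title": "Quarterly report", "body": "Sales grew"}

# "searchableContent" plays the role of the default search field.
document["searchableContent"] = build_default_search_field(document, ["title", "body"])
```

A query that names no field would then be matched against "searchableContent" only, which is why a field left out of the concatenation cannot be found by such a query.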

Additionally, indexing properties can be set for individual schema fields. To do so, add an entry for the field in the Customized fields section, where the following properties can be configured:

  • Name. The name of the field to which the properties below apply.
  • Index. Specifies whether the field can be searched and how its value is indexed. The possible values are:
    • NO. The field is not indexed, so it cannot be searched.
    • ANALYZED. The field value is indexed after being processed by the analyzer configured for the index.
    • NOT_ANALYZED. The field value is indexed without being processed by the analyzer. The field can still be searched.
  • Store. Specifies whether the field value is stored in the index, i.e. whether it is shown as part of the document when the document is returned as a search result. The possible values are:
    • NO. The field value is not stored in the index.
    • YES. The original field value is stored in the index.
  • Boost. Specifies the default relevance of the field in searches. It must be a positive value. Raising it gives more weight to documents that contain the search terms in this field.
  • Search. Specifies whether the content of the field is copied into the Default search field, so that it is included in global searches (those in which the query does not specify a field). The possible values are:
    • NO. The content of the field is not stored in the default search field.
    • YES. The content of the field is stored in the default search field.
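
The properties of a customized field can be pictured as a small record. The Python sketch below is one possible model, not the format Aracne uses internally: the class names Index, Store, Search and CustomizedField and the helper property searchable_by_name are hypothetical, while the allowed values are the ones listed above.

```python
from dataclasses import dataclass
from enum import Enum


class Index(Enum):
    NO = "NO"
    ANALYZED = "ANALYZED"
    NOT_ANALYZED = "NOT_ANALYZED"


class Store(Enum):
    NO = "NO"
    YES = "YES"


class Search(Enum):
    NO = "NO"
    YES = "YES"


@dataclass(frozen=True)
class CustomizedField:
    """Hypothetical model of one entry in the Customized fields section."""
    name: str
    index: Index
    store: Store
    boost: float
    search: Search

    def __post_init__(self):
        if not self.name:
            raise ValueError("name must not be empty")
        if self.boost <= 0:
            raise ValueError("boost must be a positive value")

    @property
    def searchable_by_name(self):
        # Only Index = NO rules out searching by this field.
        return self.index is not Index.NO


path_field = CustomizedField("path", Index.NOT_ANALYZED, Store.YES, 1.0, Search.NO)
```

Keeping Index and Search as separate properties reflects the distinction made above: a field can be searchable by name (Index other than NO) and still be excluded from global searches (Search = NO).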

By default, every non-binary field of a document sent to the indexing server is stored in the index (Store = YES), split into words (Index = ANALYZED), given relevance 1 (Boost = 1), and included in the Default search field (Search = YES).
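
The way these defaults combine with customized entries can be sketched in Python as follows. The function effective_properties and the dictionary layout are hypothetical. Two points are assumptions rather than documented behavior: a customized entry that omits a property falls back to the default, and binary fields without a customized entry return None because their handling is not described here.

```python
# Defaults stated for non-binary fields.
DEFAULTS = {"index": "ANALYZED", "store": "YES", "boost": 1.0, "search": "YES"}


def effective_properties(field_name, customized_fields, is_binary=False):
    """Return the indexing properties that apply to one document field."""
    if field_name in customized_fields:
        # Assumption: properties missing from the entry take the default value.
        return {**DEFAULTS, **customized_fields[field_name]}
    if is_binary:
        # Not covered by the stated defaults; left undefined in this sketch.
        return None
    return dict(DEFAULTS)


customized = {"path": {"index": "NOT_ANALYZED", "search": "NO"}}
title_props = effective_properties("title", customized)
path_props = effective_properties("path", customized)
```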

The default index schema (standard) defines “identifier” as the primary key and “searchableContent” as the default search field. It also defines the fields “path” and “mimetype” as not analyzed (Index = NOT_ANALYZED) and stored in the index (Store = YES); their content is not included in the default search field (Search = NO).

A second default schema (globalsearch) is intended for use with the Denodo Global Search crawler. It is defined like the standard schema, with three additional fields: “name”, “type” and “self”. These fields are not analyzed (Index = NOT_ANALYZED) and stored in the index (Store = YES), and their content is included in the default search field (Search = YES).
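
To make the relationship between the two default schemas concrete, the Python sketch below writes them as plain dictionaries. The dictionary keys and the helper fields_in_default_search are hypothetical and do not reflect how Aracne stores schemas. The field names and property values come from the descriptions above. Boost is omitted because no value is stated for these fields.

```python
import copy

STANDARD = {
    "name": "standard",
    "unique_key": "identifier",
    "default_search_field": "searchableContent",
    "customized_fields": {
        "path": {"index": "NOT_ANALYZED", "store": "YES", "search": "NO"},
        "mimetype": {"index": "NOT_ANALYZED", "store": "YES", "search": "NO"},
    },
}

# globalsearch starts as a copy of standard and adds three fields.
GLOBALSEARCH = copy.deepcopy(STANDARD)
GLOBALSEARCH["name"] = "globalsearch"
for field_name in ("name", "type", "self"):
    GLOBALSEARCH["customized_fields"][field_name] = {
        "index": "NOT_ANALYZED",
        "store": "YES",
        "search": "YES",
    }


def fields_in_default_search(schema):
    """Names of customized fields whose content feeds the default search field."""
    return sorted(
        name
        for name, props in schema["customized_fields"].items()
        if props["search"] == "YES"
    )
```

Under this model, none of the customized fields of the standard schema feed the default search field, whereas in globalsearch the three additional fields do.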