Jobs for Maintaining Aracne Indexes

The ARN-Index maintenance section lets you configure maintenance operations to be performed on one or more Denodo Aracne indexes. One or more of the following operations can be added:

  • CHECKURI. Detects and removes documents that are no longer available on the web. Its parameters are: the query to execute (Query), the list of indexes on which the query is executed (Indexes), the name of the index field that contains the URI of the document (URI field), the name of the primary key field of the document in ARN-Index (Identifier field, with the default value “identifier”), and the number of times that access to a URI may fail before the document is deleted from the index (URI errors threshold). For each document returned by the query, the job checks whether its URI is still accessible and returns no errors. If the number of consecutive executions in which a document returns an error exceeds the threshold configured by the administrator, the document is deleted from the index. Conversely, if a document that previously returned an error is accessible in the next execution of the maintenance job, its error counter is reset to 0.
  • DELETEDOCUMENTS. Deletes the documents that match a query. Its parameters are the query to execute (Query) and one or more indexes on which to execute it (Indexes). The documents returned by the query are deleted from those indexes.
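
The consecutive-error counting that CHECKURI performs can be illustrated with a short Python sketch. This is not Aracne code: the function name run_checkuri, the in-memory error_counts dictionary, and the is_accessible callback are hypothetical names introduced only for illustration, and how Aracne actually stores the counters or tests a URI is not described here. The sketch assumes one call per execution of the maintenance job.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def run_checkuri(
    documents: Iterable[Tuple[str, str]],
    error_counts: Dict[str, int],
    threshold: int,
    is_accessible: Callable[[str], bool],
) -> List[str]:
    """Simulate one execution of a CHECKURI-style job (illustrative only).

    documents     -- (identifier, uri) pairs returned by the query
    error_counts  -- consecutive-error counter per identifier, kept between runs
    threshold     -- stand-in for the "URI errors threshold" parameter
    is_accessible -- hypothetical callback: True if the URI answers without error
    Returns the identifiers that should be deleted from the index.
    """
    to_delete: List[str] = []
    for identifier, uri in documents:
        if is_accessible(uri):
            # A successful access resets the counter to 0.
            error_counts[identifier] = 0
            continue
        # A failed access adds one to the consecutive-error counter.
        error_counts[identifier] = error_counts.get(identifier, 0) + 1
        # The document is deleted only when the counter exceeds the threshold.
        if error_counts[identifier] > threshold:
            to_delete.append(identifier)
            del error_counts[identifier]
    return to_delete
```

Under these assumptions, with a threshold of 2 a document whose URI fails in every execution is kept after the first and second executions and is marked for deletion in the third, when its counter reaches 3. A single successful access in between sets the counter back to 0, so the count starts again.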

Note

The query syntax for both actions is documented in the appendix Apache Lucene Search Syntax of the Aracne Administration Guide.