Log Configuration

Denodo Aracne keeps the log configuration file of the crawling server in <DENODO_HOME>/conf/arn and that of the indexing/search server in <DENODO_HOME>/conf/arn-index, where DENODO_HOME is the base installation path. Both files are based on Apache Log4j. Among other settings, they let you change the path where the log files are stored and the log level of the categories defined in the application. For more information, see the Log4j documentation. The crawling server writes its log to the file arn.log in <DENODO_HOME>/logs/arn. The indexing server writes its log to the file arn-index.log in <DENODO_HOME>/logs/arn-index.

The Web administration tool also has a configuration file, log4j.xml, that sets the log level of the events generated by this application. This file is located in the directory <DENODO_HOME>/resources/apache-tomcat/webapps/webadmin/denodo-aracne-admin/WEB-INF/classes. The administration tool generates two log files:

  • <DENODO_HOME>/logs/arn/arn-admin.log. Contains execution data of the administration tool.

  • <DENODO_HOME>/logs/apache-tomcat/denodo-tomcat.log. Contains data about installing, starting, and stopping the administration tool in the Web server.
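
The locations described so far can be collected in a short script, for example to check that the expected files exist after installation. The following Python sketch is illustrative only: the function name aracne_log_locations and the component labels used as dictionary keys are hypothetical, and the paths are simply the ones listed above, built relative to a base installation path supplied by the caller.

```python
from pathlib import Path


def aracne_log_locations(denodo_home):
    """Map each component to its log configuration directory and log file.

    The component labels used as keys are illustrative, not Denodo names.
    """
    home = Path(denodo_home)
    webapp_classes = home.joinpath(
        "resources", "apache-tomcat", "webapps", "webadmin",
        "denodo-aracne-admin", "WEB-INF", "classes",
    )
    return {
        "crawling server": {
            "config_dir": home / "conf" / "arn",
            "log_file": home / "logs" / "arn" / "arn.log",
        },
        "indexing server": {
            "config_dir": home / "conf" / "arn-index",
            "log_file": home / "logs" / "arn-index" / "arn-index.log",
        },
        "administration tool": {
            "config_dir": webapp_classes,  # directory that holds log4j.xml
            "log_file": home / "logs" / "arn" / "arn-admin.log",
        },
        "administration tool (Tomcat)": {
            # The text does not say which file configures this log.
            "config_dir": None,
            "log_file": home / "logs" / "apache-tomcat" / "denodo-tomcat.log",
        },
    }
```

Calling aracne_log_locations with the value of DENODO_HOME returns pathlib.Path objects, so a check such as locations["crawling server"]["log_file"].exists() works without further string handling.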

The log configuration for crawling processes run with MSIECrawler is located in <DENODO_HOME>/conf/arn/iecrawler. The following log files are created in <DENODO_HOME>/logs/arn/iecrawler (each type of log keeps up to 10 backup files of 10 MB each):

  • name_task.log: Contains the crawling event flow. By default, one such file is created per task. If a file name is specified instead of a directory for the element ROLLINGFILE filedefault in the log configuration file log.xml, a single log file contains the event flow of all the MSIECrawler tasks executed in the system.

  • access_url.log: Contains the list of URLs the crawler has accessed.

  • accept_url.log: Contains the list of URLs the crawler has accepted for processing.

  • reject_url.log: Contains the list of URLs rejected by the crawler, indicating the reason.

  • error_url.log: Contains the list of URLs that have produced an error when accessed (for example, HTTP 404 errors not captured by the server).
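
Because each type of log can accumulate up to 10 backup files of 10 MB, it can be useful to check how much space the MSIECrawler logs occupy. The Python sketch below is a hypothetical helper, not part of Denodo Aracne. It assumes that a backup file keeps the base name of its log as a prefix (for example, access_url.log followed by a suffix), which the description above does not confirm, and it counts every file that matches none of the URL logs as a task log.

```python
from collections import defaultdict
from pathlib import Path

# Base names of the URL logs listed above.
URL_LOGS = ("access_url.log", "accept_url.log", "reject_url.log", "error_url.log")


def summarize_iecrawler_logs(log_dir):
    """Return {log type: (file count, total bytes)} for one log directory.

    Assumption: a backup file name starts with the base name of its log.
    Files that match no URL log are grouped under "task logs".
    """
    totals = defaultdict(lambda: [0, 0])
    for path in Path(log_dir).iterdir():
        if not path.is_file():
            continue  # ignore subdirectories
        kind = next(
            (base for base in URL_LOGS if path.name.startswith(base)),
            "task logs",
        )
        totals[kind][0] += 1
        totals[kind][1] += path.stat().st_size
    return {kind: tuple(values) for kind, values in totals.items()}
```

The function only reads file names and sizes, so it can be run against <DENODO_HOME>/logs/arn/iecrawler while crawling tasks are in progress.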

