Overview

The Denodo Monitor is a tool for logging activity information about the Denodo Platform. This information is useful for feeding AI algorithms (such as summary recommendations), showing activity statistics in the Data Catalog, security auditing, and detecting and analyzing potential issues. However, because monitoring can add overhead, this document quantifies that overhead and provides recommendations.

Recommended Architecture

Denodo Monitor is included in both the VDP server installation and the Solution Manager. In general, we recommend running Denodo Monitor from the Solution Manager for the following reasons:

  • Lower overhead. Running the monitor from the Solution Manager rather than on the Denodo server keeps its work off the VDP server’s CPUs, minimizing the impact on the system.
  • Easier configuration. The most common settings can be configured graphically via the Solution Manager Admin Tool. This includes which elements are monitored and where the output is stored (local disk, cloud object store, or relational database). For the VDP server, configuration is done through properties files.
  • Simpler management. Denodo Monitor can be started and stopped graphically from the Solution Manager, not just for an individual server but for an entire cluster. On the VDP server, it must be started from a shell script.

In addition, the Denodo Monitor can be configured to write only to log files, or to a database as well. Writing to a database has some advantages:

  • Ease and performance of analysis. Logs written to a database can be accessed via JDBC, which can improve performance when computing summary recommendations or conducting other analysis. If you use Splunk to analyze logging information, Splunk DB Connect can obtain it directly from the database.
  • Disk space management. Writing to a DB means you can safely remove old log files to avoid disk space issues on the Denodo Monitor machine. We recommend configuring this to be done automatically.
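
The automatic cleanup mentioned above could be sketched as a small script run from a scheduler. This is only an illustration under assumptions: the function name `remove_old_logs`, the `*.log*` file pattern, and the retention period are hypothetical and are not Denodo settings. It should only be used once you have confirmed that the same information is already stored in the database.

```python
import time
from pathlib import Path


def remove_old_logs(log_dir, max_age_days, pattern="*.log*", now=None):
    """Delete files in log_dir that match pattern and are older than max_age_days.

    Age is judged by the file's last-modification time.
    Returns the sorted list of paths that were removed.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds in a day
    removed = []
    for path in Path(log_dir).glob(pattern):
        # Skip directories and anything modified after the cutoff.
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return sorted(removed)


# Example call (the directory is a placeholder, not a real Denodo path):
# remove_old_logs("/path/to/denodo-monitor/logs", max_age_days=30)
```

Judging age by modification time rather than by file name keeps the sketch independent of any particular log rotation naming scheme.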

Results

To estimate the overhead, we ran tests against TPC-H datasets at two scale factors (1 and 20), served from a PrestoDB source under simulated load. The tests covered three scenarios:

  1. No Denodo Monitor running.
  2. The Denodo Monitor running from the Solution Manager and writing to an Oracle DB.
  3. The Denodo Monitor running from the Solution Manager and writing to log files only.

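Results were as follows:

| Scale factor | Overhead when monitor writing to DB | Overhead when monitor writing to files | Combined overhead |
|--------------|-------------------------------------|----------------------------------------|-------------------|
| TPC-H 20     | 0.65%                               | 1.55%                                  | 1.10%             |
| TPC-H 1      | 0.54%                               | 1.53%                                  | 1.04%             |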

Each figure is an average over multiple iterations. Across 3 iterations of TPC-H 20, the measured overhead ranged from 0.42% to 0.83% when writing to a DB and from 0.52% to 2.24% when writing to flat files. Variance was much higher for TPC-H 1, likely because of the shorter query duration: across 10 iterations, the calculated overhead ranged from -2.8% to 3.4% when writing to a DB and from -2.4% to 4.4% when writing to flat files. In both cases, though, as the table above shows, averaging across multiple iterations produced fairly consistent results.
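
To illustrate why individual iterations can show negative overhead while the average stays small, here is a minimal sketch. It assumes overhead is computed as the relative difference between a monitored run and a baseline run, which the tests above do not state explicitly. The timings are made-up values for illustration, not measurements from these tests.

```python
from statistics import mean


def overhead_pct(baseline_seconds, monitored_seconds):
    """Relative overhead of a monitored run versus a baseline run, in percent."""
    return (monitored_seconds - baseline_seconds) / baseline_seconds * 100.0


def mean_overhead_pct(pairs):
    """Average overhead over (baseline, monitored) timing pairs."""
    return mean(overhead_pct(b, m) for b, m in pairs)


# Hypothetical timings in seconds for short queries: ordinary run-to-run
# noise is large relative to the query duration, so single iterations
# can come out negative.
short_runs = [(10.0, 9.8), (10.0, 10.3), (10.0, 10.1)]

for baseline, monitored in short_runs:
    print(f"{overhead_pct(baseline, monitored):+.1f}%")
print(f"mean: {mean_overhead_pct(short_runs):+.2f}%")
```

With short queries, a fixed amount of timing noise is a larger fraction of the total, which is consistent with the wider range observed for TPC-H 1.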

It is important to note that when the Denodo Monitor writes to a DB, it also continues to produce flat files. Interestingly, at both scale factors the average overhead came out lower when the monitor was writing to a DB in addition to the flat files. We find it unlikely that adding the DB writer reduces overhead in any way, so the difference between the two is most likely caused by factors beyond the control of the tests (e.g. temporary network differences). What can reasonably be taken from these results is that writing to a DB does not appear to produce any perceivable overhead beyond that of writing to a log file. The results also show that using the Denodo Monitor from the Solution Manager produces a small overhead: 1.10% on average in our TPC-H scale factor 20 tests and 1.04% in our scale factor 1 tests.
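
The averages quoted above are consistent with taking the mean of the DB and flat-file figures for each scale factor. The sketch below checks that arithmetic. How the combined figure was actually derived is not stated, so the mean and the half-up rounding to two decimals are assumptions; the helper name `combined_overhead` is also just illustrative.

```python
from decimal import Decimal, ROUND_HALF_UP

# Reported overhead percentages from the results table.
reported = {
    "TPC-H 20": {"db": Decimal("0.65"), "files": Decimal("1.55"), "combined": Decimal("1.10")},
    "TPC-H 1": {"db": Decimal("0.54"), "files": Decimal("1.53"), "combined": Decimal("1.04")},
}


def combined_overhead(db, files):
    """Mean of the two overheads, rounded half-up to two decimal places (assumed)."""
    return ((db + files) / 2).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


for name, row in reported.items():
    computed = combined_overhead(row["db"], row["files"])
    print(name, computed, computed == row["combined"])
```

Decimal arithmetic is used instead of floats so that a value such as 1.035 rounds predictably.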

Conclusions

Based on these results, the overhead added by running the Denodo Monitor from the Solution Manager is small, roughly 1% on average in our tests, in both analytical and operational scenarios. Having the monitor write to a database does not add significant further overhead and carries a number of benefits.

With these results in mind, and since having monitoring information available is crucial in many scenarios (auditing, capacity assessments, debugging, input for AI recommendations), we strongly encourage keeping the Denodo Monitor running at all times, especially on production servers.
