Hello dear Denodo team,
I have a question regarding the performance of a view based on a MongoDB data source.
The base view in question has about 2.7 million records and 256 columns.
Running the query
```
SELECT max(date)
FROM <table>;
```on base view level results in a runtime of about 10 seconds.
On integration view level the same query takes 70 seconds, on business view level more than 2 minutes.
The views only use the data from a single MongoDB table. There are no JOINs, only a few string operations.
I tried several things and consulted a few of the official Denodo articles on optimization.
1. An index was created in MongoDB as well as in Denodo, but did not lead to any improvement.
2. There are few operations on integration views, that dont seem to matter as there is only a few seconds improvement when taking them out.
3. To utilize the Optimizer I already tried setting up PKs and enabling all the optimizations.
4. Enabling View Statistics for the date field barely yielded a 2 second improvement.
We cannot really reduce the column count and a cache is not wanted in this scenario.
Is this performance degradation caused by the sheer size of the table or are there possibilities to improve the performance by further settings?
Best thanks
Update: looking at the execution trace it seems the integration layer and business layer take the to most time. While the base view has an effective time of 456ms, the two following layers (projection) have an ET of >52000 ms. Then there is another group by plan at the end which takes 474ms.
Are there some settings we did not try? The problem seems to be with the two top most layers..