You can translate the question and the replies:

Same query, different results

I have a derived query that joins data from Teradata and AWS Athena sources. Both data sources are updated nightly. The derived view takes about 20minutes to complete. I noticed that the same query returned different results when executed back to back. The Teradata view has about 3K records and the result set of the derived view has a couple million records. To test, I selected two records from the primary Teradata view and repeatedly executed the query. For three of the ~100 columns in the result set, the query returned either the expected values or NULL. Those three values come a lookup table in Teradata (a 6.7M records customer address view). When I looked at the execution plan, it said "completed = true", "memoryLimitReached = true", and "swapping = false" (I'm not sure if this happened to be taken from an execution with or without the correct values). How is it that a query can complete without errors, yet produce different results? I have read all of the available documentation and observed a lot of instructions for changing settings, but little on what those settings actually mean. I don't see anything to explain what I am seeing.
22-11-2021 17:08:34 -0500

7 Answers

Hi, Going through the question, I could see the parameter **memoryLimitReached** parameter is set as **true**. This means that the intermediate results have reached their maximum memory consumption and thus waits until the memory is available to processes the next set of records from the source. In order to resolve the maximum memory reached issue, you could follow the below steps: * Increase the maximum memory to be used for the query or intermediate results by navigating to the 'Server Configuration > Memory Usage' tab in the Virtual DataPort Administration Tool. * Modify the value for 'Maximum size in the memory of intermediate results (MB)' option to be greater than the default value with respect to your query results. * Save the changes and execute the query. Additionally, you could also test enabling the [Automatic Simplification of Queries]( for the query using the [CONTEXT]( clause like CONTEXT( 'simplify' = 'on'). Hope this helps!
Denodo Team
23-11-2021 04:26:01 -0500
Thank you. I have a few more questions. In the execution plan, what is meant by "completed = true"? If an intermediate step stops prematurely due to the memory limit having been reached, I would expect it some kind of an error. There's no error thrown, so how can a data consumer know if the view actually completed or if it blew pass a memory limit? My Teradata sources are views (tables) and there are no statistics available. I have been working on using the "Complete missing statistics executing SELECT queries" and calculating just a few columns at a time because the Teradata server throttles me if I do more. Are statistics needed on every column (I have about 170 columns)? If I get statistics for all of the data sources, might Cost Optimization aid my query by finding a less memory-intensive query plan?
23-11-2021 07:53:33 -0500
Hi, For your scenario, you could enable the [Denodo Monitor]( and see that all the memory utilization information available under the directory <Denodo_Home>\tools\monitor\denodo-monitor\logs. Secondly, if you have statistics available in the underlying data source for a view and if you want Denodo to use those statistics for Cost based Optimization, then it is necessary to obtain the statistics from the underlying table and store them in the metadata of the view by enabling "Enable Statistics" option and clicking on "Gather statistics for selected fields". For more information, you could refer the [Statistics of the View]( and [Gathering the Statistics of Views]( sections of Virtual DataPort Administration Guide. Also, You could refer to the [Best Practices to Maximize Performance IV: Detecting Bottlenecks in a Query]( section of for more information. Hope this helps!
Denodo Team
24-11-2021 06:58:54 -0500
To my original question, how is it possible to execute the same query and get different results without there being an error thrown? Everything that I've read indicates that a “memoryLimitReached = true” condition results in a query pausing- not stopping. How would a client application know that the results are suspect without an error being thrown?
29-11-2021 08:16:16 -0500
Hi, For your scenario, the Execution Trace shows that **swapping = false **and **memoryLimitReached = true**. This indicates that the query had reached the memory limit set in the Virtual DataPort Server Configuration, and also that no swapping to disk is allowed. In this case, the performance of the query will become very slow because the Virtual DataPort server is going to have to manage the processing it has to do with the limited amount of memory that the query has. In order to overcome this performance issue, I would navigate to Administration > Server Configuration > Memory Usage and turn the Swapping Status to “On”. If you are a valid Support user, you could raise a Support case at [Denodo Support Site]( so that our Support Team will assist you further. Hope this helps!
Denodo Team
30-11-2021 07:30:41 -0500
You've said that the query will be slow with when the condition of *swapping = false* and *memoryLimitReached = true*, but that's not what I observed. I ran the same query a dozen times and when that condition occurred, it appears that a lookup table wasn't fully loaded and for some rows, the expected data didn't appear. Below I'm showing the results for a single record and a handful of columns that resulted from multiple runs of the same query. The source data gets updated once a night, so I know that the underlying data isn't changing. What I would expect to happen if the memory limit is reached is that the query would pause, not terminte a pull prematurely and continue on as if it had succeeded. I have since adjusted memory limits and 'swapping', but am stuck trying to prove that Denodo can produce consistent results. | time | city_nme | home_zip | state_cde| |------|------|------|------| |3:02 | <null> | <null> | <null>| |3:19 | REDDING | 960010000 | CA| |4:09 | REDDING | 960010000 | CA| |4:28 | REDDING | 960010000 | CA| |4:40 | REDDING | 960010000 | CA| |4:52 | <null> | <null> | <null>| |5:04 | REDDING | 960010000 | CA|
01-12-2021 14:54:31 -0500
Hi, Going through the scenario, the swapping = false and memoryLimitReached = true indicates that the intermediate results have reached their maximum memory consumption set in the Virtual DataPort server configuration, and also that no swapping to disk is allowed. For this, I would check the <DENODO_HOME>\logs\vdp\vdp.log file for more information regarding the issue. Also, I would examine the memory usage settings using the Virtual DataPort Administration Tool. You could refer to [Configuring the Memory Usage and Swapping Policy]( for more information about memory management and swapping particularly in regards to queries with large datasets. Hope this helps!
Denodo Team
07-12-2021 08:05:10 -0500
You must sign in to add an answer. If you do not have an account, you can register here