Applies to:
Denodo 7.0
Last modified on: 29 Oct 2019
Tags:
AWS
Administration
Cloud
Cluster configuration
Solution Manager
In cloud environments (like AWS), a typical use case is to use auto scaling capabilities in order to allow client applications to increase or decrease server capacity according to different sets of rules (time intervals, CPU, memory usage…). This ability to automatically adjust the capacity of a Denodo cluster in AWS requires taking into account several aspects:
You can find detailed information about the configuration process in the How to store Denodo logs in Amazon S3 document.
Notice that Denodo Monitor Logs are not affected by this problem because Denodo Monitor generates the log files in the <SOLUTION_MANAGER_HOME> folder.
In addition, as explained in the Monitoring Denodo with Amazon CloudWatch document, it is possible to expose Denodo metrics to AWS CloudWatch so they can be used when configuring auto scaling actions and policies. Therefore, this document also provides examples of how to configure an auto scaling group of Denodo instances in two scenarios: using a scheduled action and using a Denodo custom metric.
The following image illustrates the auto scaling lifecycle in AWS when instances are dynamically launched or stopped:
It is possible to add a lifecycle hook when an instance is launched or terminated, in order to execute an operation or run some code using a lambda function.
When a new instance is launched it is necessary to register the new server in the Solution Manager catalog, so the Virtual DataPort server can get a working license and start correctly.
Ideally, the servers could register themselves at startup time to avoid the manual registration. This could be done:
In both cases, the script / code executed will need to:
When an auto scaling group terminates an instance, the instance is killed (it is not a normal shutdown), so the server might not be stopped gracefully and the license usage might not be released. It is necessary to free the license usage for the server and delete the server from the Solution Manager catalog. These operations can be executed in a terminate instance lifecycle hook.
In order to perform these tasks, a lambda function will execute the following operations:
In this section we describe how to configure the scale-in process. The sequence of steps we will follow is:
A deployment package is a ZIP archive that contains function code and dependencies. The deployment package of the lambda function contains:
You can download the deployment package of this lambda function here: Terminate Instance Package
Unzip the deployment package in a folder.
Edit the TerminateInstanceConfiguration.properties file to configure the necessary data to access the Solution Manager. For instance:
com.denodo.sm.host=localhost
com.denodo.sm.port=10090
com.denodo.sm.user=username
com.denodo.sm.password=clearOrEncryptedPassword
com.denodo.sm.sslEnabled=false
Where:
When the configuration is ready, zip all the files inside the deployment package folder.
Now we need to create the lambda function in AWS and import the deployment package.
The documentation regarding lambda functions is available here.
In order to work with lambda functions there are several approaches. To test simple scripts, the easiest way is to use the lambda console:
Once the basic lambda function is created, we will update it with the zip file that we created in the previous step.
In order to import the deployment package of the lambda function, select the option “Upload a .zip file” and then select the zip file in your machine.
Update Handler to terminate_instance.lambda_handler
Press the “Save” button in order to upload the function contained in the zip file.
The script defines a “lambda_handler” function to handle termination events for instances.
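For orientation, a minimal sketch of what such a handler does is shown below, assuming the properties file described above is bundled next to the code. The Solution Manager REST endpoint used here is only a placeholder (and authentication is omitted); the script shipped in the deployment package is what actually releases the license and removes the server from the catalog.

import urllib.request

def load_properties(path):
    # Parse the simple key=value format of TerminateInstanceConfiguration.properties.
    props = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line and not line.startswith('#') and '=' in line:
                key, value = line.split('=', 1)
                props[key] = value
    return props

def lambda_handler(event, context):
    # The CloudWatch event carries the id of the instance being terminated.
    instance_id = event['detail']['EC2InstanceId']
    props = load_properties('TerminateInstanceConfiguration.properties')
    protocol = 'https' if props.get('com.denodo.sm.sslEnabled') == 'true' else 'http'

    # Placeholder Solution Manager call that frees the license usage and deletes
    # the server from the catalog; the real endpoint is implemented in the package.
    url = '{}://{}:{}/servers/{}'.format(
        protocol, props['com.denodo.sm.host'], props['com.denodo.sm.port'], instance_id)
    request = urllib.request.Request(url, method='DELETE')
    with urllib.request.urlopen(request) as response:
        return {'statusCode': response.status, 'instanceId': instance_id}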
You can test that the lambda function works correctly and has access to the Solution Manager server. Paste the following JSON in the event content:
{ "detail": { "EC2InstanceId": "i-xxxxxxxxxxxxx" } } |
We will use CloudWatch events to invoke the lambda function every time the auto-scaling group terminates an instance.
You can read more about the lifecycle hook and notification possibilities here.
In order to create and configure a CloudWatch event to invoke the desired lambda function during instance termination, you can follow the next steps:
Alternatively, we could also create the event by editing the “Event Pattern” textarea with the following JSON (changing the AutoScalingGroupName attribute to the name of the corresponding auto scaling group):
{ "source": [ "aws.autoscaling" ], "detail-type": [ "EC2 Instance Terminate Successful", "EC2 Instance Terminate Unsuccessful", "EC2 Instance-terminate Lifecycle Action" ], "detail": { "AutoScalingGroupName": [ "ASG-LambdaTest" ] } } |
Once we have the lambda function and the CloudWatch event created, we need to add the lifecycle hook to the auto scaling group for the terminating instance phase.
In order to create the Lifecycle Hook, go to the “Lifecycle Hooks” tab inside the auto scaling group and click “Create Lifecycle Hook”.
Fill the information with the configuration you want:
The Heartbeat timeout in a termination hook is the time that the instance remains in the Terminating:Wait state of the lifecycle. We recommend changing it to a lower value than the default, for example 60 seconds (the default is 3600 seconds), so the termination process can proceed sooner.
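If you prefer scripting this step, a sketch of the equivalent boto3 call is shown below (the hook and group names are examples):

import boto3

autoscaling = boto3.client('autoscaling')

# Terminating lifecycle hook with the recommended 60 second heartbeat timeout.
autoscaling.put_lifecycle_hook(
    LifecycleHookName='denodo-terminate-hook',
    AutoScalingGroupName='ASG-LambdaTest',
    LifecycleTransition='autoscaling:EC2_INSTANCE_TERMINATING',
    HeartbeatTimeout=60,
    DefaultResult='CONTINUE',
)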
In this section we describe how to configure the scale-out process. The sequence of steps we will follow is similar to the previous section:
As with the terminate instance lambda function, there is a deployment package for the register lambda function. This package contains:
You can download the deployment package of this lambda function here:
Unzip the deployment package in a folder.
Edit the “ServerData.json” file to configure the necessary data to access the Solution Manager and server default values:
The “register_autoscaling_server.py” script registers a server in a cluster in the Solution Manager catalog. The script receives a configuration file as an argument. This is an example configuration file:
{ "com.denodo.sm.user" : "admin", "com.denodo.sm.password" : "clearOrEncryptedPassword", "com.denodo.sm.host" : "localhost", "com.denodo.sm.port" : 10090, "clusterId" : 2, "defaultDatabase" : "admin", "username" : "admin", "password" : "encryptedPassword", "port" : 9999, "useKerberos" : false, "usePassThrough" : false, "solutionManagerUsesSSL" : false } |
Where:
The script registers the server in the solution manager with:
The example assumes a scenario where the instances run in a private subnet (without public IPs and unreachable from the internet). The Solution Manager server can be located in:
When the configuration is ready, save the changes and zip all the files inside the deployment package folder.
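For orientation, a minimal sketch of the registration logic performed by such a launch handler is shown below: it looks up the private IP of the launched instance and registers it in the cluster defined in ServerData.json. The Solution Manager REST endpoint and payload fields are placeholders (and authentication is omitted); register_autoscaling_server.py in the deployment package is what actually performs the registration.

import json
import urllib.request
import boto3

ec2 = boto3.client('ec2')

def lambda_handler(event, context):
    # The CloudWatch event carries the id of the instance that was launched.
    instance_id = event['detail']['EC2InstanceId']

    # Resolve the private IP of the new instance (requires ec2:DescribeInstances).
    reservations = ec2.describe_instances(InstanceIds=[instance_id])['Reservations']
    private_ip = reservations[0]['Instances'][0]['PrivateIpAddress']

    with open('ServerData.json') as f:
        config = json.load(f)

    # Placeholder Solution Manager call that adds the server to the cluster;
    # the real endpoint is implemented in the packaged script.
    protocol = 'https' if config.get('solutionManagerUsesSSL') else 'http'
    url = '{}://{}:{}/clusters/{}/servers'.format(
        protocol, config['com.denodo.sm.host'], config['com.denodo.sm.port'],
        config['clusterId'])
    payload = json.dumps({'name': instance_id, 'host': private_ip,
                          'port': config['port']}).encode()
    request = urllib.request.Request(url, data=payload, method='POST',
                                     headers={'Content-Type': 'application/json'})
    with urllib.request.urlopen(request) as response:
        return {'statusCode': response.status, 'registeredHost': private_ip}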
Now we need to create the lambda function in AWS and import the deployment package.
Perform the same steps as described in the terminate instances section.
Once the lambda function is created correctly, it is necessary to edit the role automatically created for it, so that the lambda function has permission to execute the DescribeInstances API operation invoked during the script execution.
Open the IAM console (you can access it directly by clicking on the role in the lambda function).
There are two options to add the new permissions:
1. You can create a new policy containing the following JSON:
{ "Version": "2012-10-17", "Statement": [{ "Effect": "Allow", "Action": [ "ec2:DescribeInstances" ], "Resource": "*" }] } |
Then, attach the created policy to the role using the “Attach policies” option.
2. Or you can edit the policy using the “Edit policy” option
Add the following statement in the JSON tab to allow the DescribeInstances operation:
{ "Effect": "Allow", "Action": [ "ec2:DescribeInstances" ], "Resource": "*" } |
Click the “Review policy” option and save the changes.
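The same permission can also be granted from a script; this sketch creates a standalone policy with the statement above and attaches it to the role of the lambda function (the policy and role names are examples):

import json
import boto3

iam = boto3.client('iam')

policy_document = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["ec2:DescribeInstances"],
        "Resource": "*"
    }]
}

# Create the policy and attach it to the role generated for the lambda function.
policy = iam.create_policy(PolicyName='denodo-describe-instances',
                           PolicyDocument=json.dumps(policy_document))
iam.attach_role_policy(RoleName='register-denodo-instance-role',
                       PolicyArn=policy['Policy']['Arn'])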
Import the lambda function as described before for terminate instances lambda function.
In this case, update ‘Handler’ to register_autoscaling_server.lambda_handler and save.
You can test the lambda function by defining a test event and executing the function, as in the terminate instances scenario.
We will use CloudWatch events to invoke the lambda function every time the auto scaling group starts a new instance.
You can read more about the lifecycle hook and notification possibilities here.
In order to create and configure a CloudWatch event to invoke the desired lambda function during instance launch, you can follow the next steps:
In this case, we could also create the event by editing the “Event Pattern” textarea with the following JSON (changing the AutoScalingGroupName attribute to the name of the corresponding auto scaling group):
{ "source": [ "aws.autoscaling" ], "detail-type": [ "EC2 Instance-launch Lifecycle Action" ], "detail": { "AutoScalingGroupName": [ "ASG-LambdaTest" ] } } |
Once we have the lambda function and the CloudWatch event created, we need to add the lifecycle hook to the auto scaling group for the launching instance phase.
In order to create the Lifecycle Hook, go to the “Lifecycle Hooks” tab inside the auto scaling group and click “Create Lifecycle Hook”.
Fill the information with the configuration you want:
The Heartbeat timeout in a launch hook is the time that the instance remains in the Pending:Wait state of the lifecycle. Change it to a lower value, for example 30 seconds (the default is 3600 seconds).
Some scenarios allow you to set your own scaling schedule based on predictable load changes. For example, you can detach and terminate an instance at night, when the cluster workload is lower, and launch a new instance again in the morning.
First of all, take into account the Auto Scaling group termination policy. It determines which instances to terminate when a scale-in event occurs, and it is important because these instances enter the Terminating state and cannot be put back into service.
Let’s imagine that you have an Auto Scaling group in which the number of instances you want to run is 3 and the minimum and maximum number of instances the Auto Scaling group should have at any time is 2 and 3, respectively. After reviewing the considerations mentioned above, you may create two scheduled actions in order to scale your Auto Scaling group on a recurring schedule:
The Desired Capacity field should be set to 2 with the aim of terminating one of the instances, because 2 instances will be sufficient to manage the workload during the night. In order to perform this action every night, the Recurrence must be set to Every day and then the cron expression will be created for you. The Start Time, 23:00 UTC, specifies the earliest time the action is performed.
With this action, every night at 23:00 UTC, one instance will be terminated in the Test-ASG auto scaling group.
When the workload is expected to return to normal levels, we must again raise the Desired Capacity to 3. In order to perform this action every morning, the Recurrence must be set to Every day and then the cron expression will be created for you. The Start Time, 08:00 UTC, specifies the earliest time the action is performed.
With this action, every morning at 08:00 UTC, a new instance will be launched in the Test-ASG auto scaling group.
Note that the desired capacity must be less than or equal to the maximum size of the group. If your new value for Desired is greater than Max, you must update Max.
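For reference, both scheduled actions can also be created with boto3; this sketch assumes the group is named Test-ASG, as in the example:

import boto3

autoscaling = boto3.client('autoscaling')

# Scale in every night at 23:00 UTC: lower the desired capacity to 2.
autoscaling.put_scheduled_update_group_action(
    AutoScalingGroupName='Test-ASG',
    ScheduledActionName='scale-in-at-night',
    Recurrence='0 23 * * *',
    DesiredCapacity=2,
)

# Scale out every morning at 08:00 UTC: raise the desired capacity back to 3.
autoscaling.put_scheduled_update_group_action(
    AutoScalingGroupName='Test-ASG',
    ScheduledActionName='scale-out-in-the-morning',
    Recurrence='0 8 * * *',
    DesiredCapacity=3,
)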
CloudWatch allows you to create alarms based on metrics and define actions to be taken when the alarm changes state. One of these available actions is the Amazon EC2 Auto Scaling action.
Auto Scaling groups are created specifying the number of instances you want to run. In our example, the desired number of instances is 2, and the minimum and maximum number of instances the Auto Scaling group should have at any time are 2 and 3, respectively. In this scenario, we need to configure an Auto Scaling action that increases the number of instances in response to a high workload, and another one that decreases it in response to a low workload. Note that, regardless of these actions, the Auto Scaling group will keep the number of instances between the minimum and the maximum without further configuration. That is why the execution of an action fails when it tries to scale out while the number of running instances in the Auto Scaling group is already 3 (the maximum), or when it tries to scale in while the number of instances is 2 (the minimum).
For example, you can develop a Denodo custom metric in CloudWatch that monitors the number of requests initiated in Denodo per server every minute, and add an alarm that goes to the ALARM state when the number of queries in the last 3 minutes is greater than 100. Take into account that the Auto Scaling action is not available for anomaly detection alarms; therefore, you have to specify a static threshold in order to add this kind of action.
See the ‘Monitoring Denodo Metrics’ and ‘Adding an alarm’ sections of the Monitoring Denodo with Amazon CloudWatch document for more detailed information about creating Denodo custom metrics and adding alarms based on metrics.
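As an illustration of the static threshold requirement, the following sketch creates such an alarm over a hypothetical custom metric; the namespace, metric name and dimension are placeholders and depend on how the metric is actually published:

import boto3

cloudwatch = boto3.client('cloudwatch')

# ALARM when the sum of requests over the last 3 one-minute periods exceeds 100.
# No action is attached yet; the Auto Scaling action is added later.
cloudwatch.put_metric_alarm(
    AlarmName='my_alarm',
    Namespace='Denodo',                    # placeholder namespace
    MetricName='RequestsInitiated',        # placeholder metric name
    Dimensions=[{'Name': 'Server', 'Value': 'denodo-vdp-1'}],  # placeholder dimension
    Statistic='Sum',
    Period=60,
    EvaluationPeriods=3,
    Threshold=100,
    ComparisonOperator='GreaterThanThreshold',
)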
When you are creating or editing an alarm, the second step allows you to configure the actions. Since this alarm is used as an indicator of high workload, we want to launch a new instance in our Auto Scaling group, Test-ASG, whenever the alarm enters the ALARM state. Similarly, as an indicator of low workload, you have to define an alarm that enters the ALARM state when the number of requests initiated in Denodo per server per minute over the last 3 minutes is lower than 40.
On the other hand, a scaling policy associated with the Auto Scaling group is necessary to define an Auto Scaling action. Therefore, you must create the alarm, called my_alarm, without any action.
To create a scaling policy you should:
To create a scaling policy in response to the need to decrease the number of instances, select the alarm created as an indicator of low workload in the Execution policy when field and, in the Take the action section, choose Remove 1 instances.
Now you can add the Auto Scaling action to the my_alarm alarm in CloudWatch by editing it.
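The roughly equivalent API calls for the scale-out case are sketched below: create a simple scaling policy for the group and then reference its ARN as an action of the alarm (names and placeholder metric settings match the previous sketch):

import boto3

autoscaling = boto3.client('autoscaling')
cloudwatch = boto3.client('cloudwatch')

# Simple scaling policy that adds one instance to the group.
policy = autoscaling.put_scaling_policy(
    AutoScalingGroupName='Test-ASG',
    PolicyName='add-1-instance',
    PolicyType='SimpleScaling',
    AdjustmentType='ChangeInCapacity',
    ScalingAdjustment=1,
    Cooldown=300,
)

# put_metric_alarm replaces the alarm, so the metric settings have to be
# repeated together with the new action (placeholders as before).
cloudwatch.put_metric_alarm(
    AlarmName='my_alarm',
    Namespace='Denodo',
    MetricName='RequestsInitiated',
    Dimensions=[{'Name': 'Server', 'Value': 'denodo-vdp-1'}],
    Statistic='Sum',
    Period=60,
    EvaluationPeriods=3,
    Threshold=100,
    ComparisonOperator='GreaterThanThreshold',
    AlarmActions=[policy['PolicyARN']],
)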
Note that after an alarm invokes an Amazon EC2 Auto Scaling action due to a change in state, the alarm continues to invoke the action for every period that the alarm remains in the new state. Nevertheless, in the history of the alarm the action appears only once, between the two state updates: