Auto Scaling Guide¶
This guide provides an overview of the auto scaling capabilities available in Microsoft Azure, focusing on how to identify the Virtual Machine Scale Set associated with your Execution Plane, how to configure the Solution Manager, and how to manage the dynamic scaling of your virtual machines effectively.
In Azure, a Virtual Machine Scale Set (VMSS) allows you to create and manage a group of load-balanced, auto-scaling virtual machines. You define a VMSS, and Azure handles the creation and removal of VM instances based on the rules you configure. Autoscale settings are where you define and manage the scaling behavior of your resources. An autoscale setting includes:
Profiles: A collection of rules and settings that dictate scaling behavior. You can have multiple profiles for different scenarios (e.g., one for weekdays, one for weekends, one for special events).
Rules: Conditions that trigger scale-out (add instances) or scale-in (remove instances) actions. Rules are based on metrics or schedules.
Metrics: Performance counters (e.g., CPU usage, memory usage, HTTP queue length) or custom application metrics collected by Azure Monitor or Application Insights.
Instance Limits: Minimum, maximum, and default number of instances that the resource can scale to.
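The elements above come together in a single autoscale setting resource. As an illustrative sketch (the profile name and values are hypothetical, not taken from an actual deployment), a profile with instance limits and one CPU-based scale-out rule might look like this ARM-style JSON fragment:

```json
{
  "profiles": [
    {
      "name": "Weekday profile",
      "capacity": { "minimum": "1", "maximum": "10", "default": "2" },
      "rules": [
        {
          "metricTrigger": {
            "metricName": "Percentage CPU",
            "timeGrain": "PT1M",
            "statistic": "Average",
            "timeWindow": "PT5M",
            "timeAggregation": "Average",
            "operator": "GreaterThan",
            "threshold": 70
          },
          "scaleAction": {
            "direction": "Increase",
            "type": "ChangeCount",
            "value": "1",
            "cooldown": "PT5M"
          }
        }
      ]
    }
  ]
}
```

This is a fragment, not a complete `Microsoft.Insights/autoscaleSettings` resource; fields such as the target resource URI are omitted.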
How to Deploy a Cluster with Virtual Machine Scale Set¶
There are two modes in which the Agora Execution Plane servers can be deployed using the Solution Manager: automated, using predefined templates, or manual, through the Manage Environments option. When deploying with the PoC and Production templates, Agora automatically creates a VMSS for Virtual DataPort and the Data Marketplace. When deploying manually, the Launch Instances in a Scale Set check box can be selected in the Servers configuration section.
If you elect to deploy the Agora server instances in a VMSS, you can choose an appropriate auto scaling strategy:
Basic: In this case, the Solution Manager will create and manage a VMSS defined with a fixed number of instances. This strategy guarantees that a new instance will be automatically created and added to the group should an instance go down.
Custom Azure Scale Set: The default option. In this case, the Solution Manager creates a basic scale set, which can be configured further to meet specific user requirements in the Azure portal.
How to Identify the Name of the Scale Set¶
In the Overview section of the Solution Manager, the cluster summary can be viewed by selecting the button. The summary includes tabs for the Virtual DataPort (VDP) and Data Marketplace servers, displaying server details including the name of the assigned Scale Set, as in the example below.
Example of Automatic Scaling Configuration¶
The following two examples describe the steps required to configure automatic scaling strategies with the following objectives:
To stop all instances on weekends.
To increase capacity dynamically during peak workloads, keeping the combined CPU usage of the instances in the Virtual DataPort Scale Set at approximately 70%.
Virtual Machine Scale Sets are configured in the Azure portal. The name of the Virtual DataPort VMSS (identified as described in the previous section) can be used to locate the scale set and edit its details there.
The Azure documentation provides a more detailed explanation of how to configure automatic scaling rules.
A. Stopping All Instances on Weekends¶
To save costs, you can configure the scale set to stop all instances on weekends, when no processing is required. A schedule-based scaling strategy is used to stop the Virtual DataPort (VDP) instances and restart them at the appropriate time.
This process involves defining a profile within your Virtual Machine Scale Set autoscale settings that dictates the scaling behavior during specific times or days.
Configuration steps:
Navigate to your Virtual Machine Scale Set in the Azure portal. Find and click on the specific one you want to configure.
In the left-hand menu, under the Availability + scale section, click on Scaling.
If not selected, change the Choose how to scale your resource option to Custom autoscale.
You’ll see a Default scale condition already present. This condition applies when no other scheduled rules are active.
To add a schedule-based rule, click on Add a scale condition. A new profile will appear.
Configure the Schedule Profile:
Give your new condition a descriptive profile name, e.g., Weekend Scale Down.
In Scale mode choose Scale to a specific instance count for a fixed number of VM instances during the schedule.
In Instance Count, enter the desired number of VM instances for this scheduled period: 0.
Schedule is the crucial part of schedule-based scaling. Choose Repeat specific days and check the specific days you want this rule to be active: Saturday and Sunday.
Select the appropriate Timezone. It’s important to match this to your operational hours.
Define the Start time and End time for when this condition should be active on the selected days/dates. Set the Start time to 00:00 on Saturday and the End time to 23:59 on Sunday.
Ensure your Default scale condition is configured correctly. This condition will be active whenever none of your specific scheduled conditions are met.
Make sure your scheduled profiles don’t overlap, as this can lead to unpredictable behavior. Azure will prioritize conditions with more specific schedules.
Once the scale condition is configured, click the Save button at the top of the Scaling blade.
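To make the schedule semantics concrete, the following Python sketch (an illustration of the intended behavior, not Azure's actual implementation) evaluates whether the weekend profile is active for a given local timestamp; the profile and default instance counts are the hypothetical values used in this example:

```python
from datetime import datetime

# Illustrative model of the schedule logic: the weekend profile is
# active from Saturday 00:00 through Sunday 23:59, local time.
WEEKEND_DAYS = {5, 6}  # Monday=0 ... Saturday=5, Sunday=6

def target_instance_count(now: datetime) -> int:
    """Return the target instance count for the given local time.

    0 while the weekend profile is active; otherwise the default
    scale condition's count (assumed to be 2 in this sketch).
    """
    DEFAULT_INSTANCE_COUNT = 2  # hypothetical default condition
    if now.weekday() in WEEKEND_DAYS:
        return 0
    return DEFAULT_INSTANCE_COUNT

# A Saturday scales to zero; a Monday falls back to the default count.
print(target_instance_count(datetime(2024, 6, 1, 12, 0)))  # Saturday
print(target_instance_count(datetime(2024, 6, 3, 9, 0)))   # Monday
```

Because the timestamps are interpreted in local time, the Timezone selection in the portal plays the same role as the local `datetime` values here.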
B. Increasing Capacity Dynamically During Peak Workloads¶
This is a common scenario for applications with fluctuating workloads, ensuring performance is maintained during peak periods: scale out by 1 instance if average CPU utilization across all instances exceeds 70% for 5 minutes. Scale in by 1 instance if average CPU utilization drops below 30% for 10 minutes.
Configuration Steps in the Azure Portal for a VM Scale Set:
Navigate to your Virtual Machine Scale Set in the Azure portal. Find and click on the specific one you want to configure.
In the left-hand menu, under the Availability + scale section, click on Scaling.
If not selected, change the Choose how to scale your resource option to Custom autoscale.
You’ll see a Default scale condition already present. This condition applies when no other scheduled rules are active.
Click on Add a scale condition. A new profile will appear. It is also possible to modify this Default profile.
Configure the profile:
Give your new condition a descriptive profile name.
In Scale mode choose Scale based on a metric.
Set the Instance limits (e.g., Minimum: 1, Maximum: 10, Default: 2).
Add a new scale-out rule:
Metric source: Current scale set
Metric name: Percentage CPU
Operator: Greater than
Metric threshold to trigger scale action: 70
Duration (minutes): 5
Time grain statistic: Average
Time aggregation: Average
Operation: Increase count by
Cool-down (minutes): 5
Instance count: 1
Click the Add button.
Add a new scale-in rule:
Metric source: Current scale set
Metric name: Percentage CPU
Operator: Less than
Metric threshold to trigger scale action: 30
Duration (minutes): 10
Time grain statistic: Average
Time aggregation: Average
Operation: Decrease count by
Cool-down (minutes): 10
Instance count: 1
Click the Add button.
Click the Save button at the top of the Scaling blade.
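The decision the two rules encode can be sketched in a few lines of Python. This is a simplified model for illustration only: it omits the duration window and cool-down timers, and simply applies the 70%/30% thresholds and the instance limits from the example configuration:

```python
from dataclasses import dataclass

@dataclass
class MetricRule:
    """Simplified model of one metric-based autoscale rule."""
    threshold: float  # CPU percentage threshold
    change: int       # number of instances to add or remove

def decide_scaling(avg_cpu: float, current: int, minimum: int, maximum: int,
                   out_rule: MetricRule, in_rule: MetricRule) -> int:
    """Return the new instance count after applying both rules.

    Mirrors the example: scale out above the scale-out threshold,
    scale in below the scale-in threshold, clamped to instance limits.
    """
    if avg_cpu > out_rule.threshold:
        return min(current + out_rule.change, maximum)
    if avg_cpu < in_rule.threshold:
        return max(current - in_rule.change, minimum)
    return current

out_rule = MetricRule(threshold=70, change=1)
in_rule = MetricRule(threshold=30, change=1)

print(decide_scaling(85.0, 2, 1, 10, out_rule, in_rule))  # above 70% -> scale out
print(decide_scaling(20.0, 2, 1, 10, out_rule, in_rule))  # below 30% -> scale in
print(decide_scaling(50.0, 2, 1, 10, out_rule, in_rule))  # in between -> unchanged
```

In the real service, the scale-out branch only fires after the metric has exceeded the threshold for the configured duration (5 minutes here), and the cool-down suppresses further actions for the configured interval.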
Best Practices for Azure Auto Scaling¶
Define clear minimum and maximum instance counts: always set appropriate min and max limits to prevent over-provisioning or under-provisioning, and ensure there is enough margin between them for scaling actions to occur.
Implement both scale-out and scale-in rules: for effective cost management and performance, always have rules for both increasing and decreasing instances.
Use the same metric for scale-out and scale-in: to avoid flapping (rapid, unnecessary scaling), it is generally best to use the same metric (e.g., CPU) for both the scale-out and scale-in rules.
Choose appropriate aggregation methods: for metrics like CPU, Average is often the most suitable aggregation method for triggering scaling actions based on the overall load.
Set realistic cool-down periods: allow enough time for new instances to start and stabilize, or for load to decrease, before triggering subsequent scaling actions.
Test your scaling rules: perform load testing to simulate different traffic patterns and ensure your auto scaling rules behave as expected.
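A quick arithmetic check shows why a wide gap between the scale-out and scale-in thresholds prevents flapping. Assuming (as a simplification) that the total workload spreads evenly across instances, adding an instance lowers the average CPU, but with the 70%/30% thresholds from the earlier example the new average stays well above the scale-in threshold:

```python
def avg_cpu_after_scale_out(avg_cpu: float, instances: int) -> float:
    """Approximate average CPU after adding one instance, assuming the
    total load is redistributed evenly over the new instance count."""
    total_load = avg_cpu * instances
    return total_load / (instances + 1)

# Two instances at 75% average CPU trigger the 70% scale-out rule.
# After scaling to three instances the average drops to 50%, which is
# still above the 30% scale-in threshold, so no immediate scale-in
# follows and the scale set does not flap.
print(avg_cpu_after_scale_out(75.0, 2))  # -> 50.0
```

Had the scale-in threshold been set at, say, 60%, the same scale-out would immediately qualify the scale set for a scale-in, producing exactly the rapid oscillation this practice avoids.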
