Hands on - Azure Scale Sets

on Regards: Azure; Infrastructure;

Why Azure Scale Sets?

To set the stage, let’s first delve into the context and requirements of our scenario. Our clientele primarily utilizes our application during a specific recurring time frame in the afternoon to streamline their business operations. The application was hosted on-premises on a limited number of static machines.

While it’s possible to horizontally scale the application by incorporating additional machines, this approach has its limitations. It requires significant manual intervention and lacks the flexibility to adapt to fluctuating demand. This is where Azure Scale Sets come into play.

Azure Scale set is a group of identical, scaling VMs when using Orchestration mode Uniform (optimized for large scale stateless workloads with identical instances). Which means, you cannot have in your scale set virtual machines with different hardware specifications. There is also the option flexible: achieve high availability at scale with identical or multiple virtual machine types.

Since the on-premise machines should be decommissioned soon, we decided to migrate the application to Azure. The application has some dependencies to the Windows operating system and is not yet ready to be containerized. This was the main reason why we had to choose Azure Scale Sets over azure Kubernetes.


There are different scaling types:

  • Manual scaling
    • define manual how many VMs you want to have in your scale set
    • (Fig 1. manual scaling)
  • Time based scaling
    • You can define a schedule to scale up or down the number of VMs in the scale set. This is useful if you know the demand for your application will increase or decrease at specific times.
    • (Fig 2. time based scaling)
  • Metric based scaling (CPU, Memory, Network, Disk I/O, etc.)
    • This is provided out of the box. Each VM in the scale set will have a metric agent installed that will send the metrics to Azure Monitor.
  • Log based scaling (Application logs, Event logs, etc.)
    • For this you need to customize your app and write specific logs or metrics, which can be than used to scale the VMs. For example, you could track the number of business cases that needs to be processed and scale the VMs based on that.

Considering the substantial costs associated with running multiple VMs, log-based scaling could be an optimal choice. This is because CPU and memory metrics may not always accurately indicate the necessary number of instances. However, implementing log-based scaling could be quite labor-intensive. As we preferred not to modify the application and knew that our application’s demand would fluctuate at specific times, we opted for time-based scaling.

Application deployment within the scale set

Now the question is, how do we get the application to the virtual machine’s instances in the azure scale set? First you need to find a proper base image for the VMs. We use a base image provided by microsoft with 1GB RAM and 1 CPU as this was sufficient for our application. Base images with more RAM and CPU are also available but they are more expensive. Keep in mind, that the azure web portal does not provide all base images. You can find more base images using the azure cli.

The most straightforward method to install your application onto the VM involves storing the application in a blob storage and downloading it during the VM creation. But how do we achieve this?

We have three alternatives:

  • Azure DSC Extension (Desired State Configuration)
    • This seems to be complex, outdated and not recommended anymore.
  • Azure Custom Script Extension
    • This is really straight forward. From the azure portal you can select from an existing blog storage a powershell script which will be executed during the VM creation. This script can be used to download the application from the blob storage and install it on the VM. Bear in mind that the script will be executed only once during the VM creation. If you want to update the application, you need to create a new VM. Also, if you install a lot of software on the VM, the VM creation will take a lot of time. This brings us to the third option.
  • Custom Image
    • You create your own image with all the software you need. Using the custom image for the VMs your application would be much sooner ready compared to the Azure custom Script Extension approach. Building such an image can take a lot of time and you also would need some staging environment to test the images. Also, you probably need to update the image from time to time. This would not be necessary if you use the provided Microsoft base images as they contain always the latest hot patches.

In our scenario the Azure Custom Script Extension was the best option, as it seems to be straight forward, and we didn’t need to install a lot of software on the VMs. The final solution consisted of a resource group with the following resources:

  • Storage account
    • contain a PowerShell script which is downloading the application as zip file from the blob storage and installing it on the VM
    • the application itself as zip file
  • Azure Scale Set as Managed Identity which can access the storage account using its system assigned identity
  • Application Insights (which is used by the application)

Final Picture

(Fig 3. Architecture Azure Scale Set)
(Fig 3. Architecture Azure Scale Set)

The application which should be hosted in the azure scale set already had a CICD pipeline. The pipeline was extended and modified in a way with the following steps:

  1. build the application
  2. create a zip file containing the application (see image above 2)
  3. create an additional artifact containing the powershell script which is used to setup the VMs (see image above 2)
  4. create and deploy the required azure infrastructure (Azure Scale Set storage account) using IaC (see image above 3)
  5. upload the artifacts from step 2 and 3 to the storage account

IaC (Infrastructure as Code)

Our Azure infrastructure was successfully deployed using a YAML pipeline, specifically utilizing the AzureResourceManagerTemplateDeployment task. This task allows us to define a template, using either an ARM or Bicep file, to create the necessary Azure resources.

As part of this process, we deployed a new subnet within an existing virtual network. This virtual network establishes a connection between Azure and our on-premise network. This newly created subnet is then utilized within the Azure scale set.

Each instance generated by the Azure scale set will automatically adopt an IP address from our defined subnet. Therefore, it’s essential to consider the number of instances you’ll need for your resources upfront, as this will influence the subnet definition.

To enhance flexibility during the Infrastructure as Code (IaC) deployment, we incorporated parameters into the template. These parameters allow for easy adjustments to the Azure resources. For instance, within the parameter section, one can define the subnet range or the base image for the Azure scale set.

Additionally, the IaC template outputs certain values, such as the created resource group, storage account, the URL of blob storage, the container names of the storage account and other data which is required later for uploading for example the artifact to the storage account. Also these outputs can be particularly useful for debugging during the template creation process.

CustomScriptExtension & Storage Account

The deployment of IaC templates is designed to be incremental. This implies that if you modify the IaC template and redeploy it, only the changes will be implemented. However, this process doesn’t always go as planned, and you might encounter a deployment failure with an uninformative error message. In such scenarios, you’ll need to delete the resource group and redeploy the IaC template.

This issue often arises when changes are made to the commandToExecute within the CustomScriptExtension. This commandToExecute is a PowerShell script that is executed during the VM’s creation. Furthermore, alterations to the storage account permissions or Azure scale set properties like osProfile.windowsConfiguration.timezone may require the removal of Azure resources and a redeployment of the IaC template. If these changes are made, the deployment may fail, returning an error message: {"code":"DeploymentFailed","message":"At least one resource deployment operation failed. Please list deployment operations for details.","details":[{"code":"PropertyChangeNotAllowed","target":"windowsConfiguration.timeZone","message":"Changing property 'windowsConfiguration.timeZone' is not allowed."}]}.

Now, let’s return to the discussion on how the CustomScriptExtension and the storage account interact.

To begin with, we’ve configured the storage account and its container to restrict access solely to Azure Managed Identities with the Storage Blob Data Reader role. This ensures that the hosted files are secure and inaccessible to unauthorized entities.

The Azure scale set, equipped with a system-assigned identity, is granted access to the storage account via the Storage Blob Data Reader role. With this setup in place, we can attempt to download the artifacts from the storage account using a VM within the Azure scale set for testing purposes.

Our ultimate goal is to download these artifacts, which include the PowerShell install script, during the VM creation process.

In theory, the CustomScriptExtension should support the direct downloading of artifacts from a storage account. An example of this can be found here. However, I suspect that our storage account’s restriction policy might have hindered this functionality. As a result, I implemented a workaround, which involves acquiring an access token from the Azure scale set to access the storage account.

For guidance on how to do this, refer to “Acquire an access token”. This process is similar to the token acquisition for App Registrations. However, I tried to simplify the process of acquiring the access token, downloading the install script, and executing it on the VM into a single command. This was necessary because the commandToExecute in the CustomScriptExtension is designed to execute only one command.

Configuring the Azure Scale Set Instances

Each time a new instance is initiated, a virtual machine (VM) is created using the base image specified in the Azure scale set (for instance, a datacenter-azure-edition-core-smalldisk image). Thanks to the CustomScriptExtension, the Windows image includes a Windows service installed in (C:\Packages\Plugins\Microsoft.Compute.CustomScriptExtension\). This service executes the commandToExecute of the CustomScriptExtension. In our case we should find the script at C:\Packages\Plugins\Microsoft.Compute.CustomScriptExtension\1.9.5\Downloads\1\{YourPowershellInstallScriptOnVmCreation}.ps1. You can verify this by connecting to the VM via Remote Desktop Protocol (RDP).

The 1.status file, located in C:\Packages\Plugins\Microsoft.Compute.CustomScriptExtension\1.9.5\Status, should contain the output of the {YourPowershellInstallScriptOnVmCreation}.ps1 script when executed.

The logs of the CustomScriptExtension can be found in C:\WindowsAzure\Logs\Plugins\Microsoft.Compute.CustomScriptExtension\1.9.5\.

The powershell script YourPowershellInstallScriptOnVmCreation.ps1 itself is downloading the application from the storage account and installing it on the VM.

Please check blog for more information about the CustomScriptExtension.

Uploading the artifacts to the storage account

Once the Infrastructure as Code (IaC) template is deployed, the next step is to upload the artifacts (the application and PowerShell script) to the storage account. To do this, we first need the storage account’s URL. Fortunately, we’ve already included the resource group name and storage account name, among other details, in the output of the IaC template deployment.

In the Azure pipeline, we can access the output of the IaC template deployment using the $(ArmOut_rgName) syntax. This can be accomplished using the Azure CLI. The script below demonstrates how to upload the artifacts to the storage account:

Job: DeployIaC

Job: UploadScripts

Upgrade policy

The upgrade policy defines how the VMs in the scale set are updated. There are three options:

  • Manual
  • Automatic
  • Rolling

At present, we employ the Manual upgrade policy as our VMs are created daily and operate within a specific timeframe. After this timeframe, all VMs are removed. Consequently, new VMs are recreated, and the application is reinstalled the following day. This approach guarantees that the application remains updated.

If your application operates continuously and frequently receives updates, you might want to consider an upgrade policy other than Manual. In the azure scale set select Instances to check the state of the VMs.

(Fig 4. instances)
(Fig 4. instances)

If you see a warning sign in the column Latest model you would need to reimage the selected instance to have the VM in the latest state with the latest application version.

However when uploading a new version of your application to the storage account, within the azure pipeline you would need to reimage all VMs in the scale set. This can be done using the azure cli: az vmss update-instances


In conclusion, Azure Scale Sets provide a powerful and flexible solution for managing and scaling applications. By leveraging features like time-based scaling, Custom Script Extensions, and Managed Identities, we were able to create a system that adapts to our application’s specific needs and usage patterns. Remember, the choice of upgrade policy and scaling type should align with your application’s requirements and operational patterns. I’ll be sure to update this article as we continue to explore Azure Scale Sets and their capabilities.