Avoiding Azure Functions Downtime with Slots
// Published on: March 27, 2020
This week I was building a Terraform module to deploy an Azure Functions infrastructure (Storage Account, Service Plan and Function App) so that the development team can later use the func CLI to deploy their functions themselves.
In the middle of my experiments, I thought about the Function update process: what happens when I change the source code and deploy a new version?
Here I am only exploring the downtime issue. Please refer to this article to learn how to deploy Azure Functions with Terraform.
Test #01 – Terraform Does Everything ™️
In this scenario, the Function App deployed by Terraform points directly to a zip file in an Azure Storage container, like the code below:
resource "azurerm_function_app" "fn_app" {
  name                      = "my-name"
  location                  = "westeurope"
  resource_group_name       = "my-rg"
  app_service_plan_id       = "my-plan"
  storage_connection_string = "my-sa-connection-string"
  version                   = "~3"
  # https_only is an argument of the resource itself, not an app setting
  https_only                = true

  app_settings = {
    FUNCTIONS_WORKER_RUNTIME     = "node"
    WEBSITE_NODE_DEFAULT_VERSION = "~12"
    # the secret sauce!
    # points to a zip file in a container
    WEBSITE_USE_ZIP = "https://${azurerm_storage_account.mysa.name}.blob.core.windows.net/${azurerm_storage_container.my-ct.name}/my-code-1.0.1.zip${data.azurerm_storage_account_sas.my-sas.sas}"
  }
}
From the developer’s perspective, the Function release process looks like this:
1. Change code
2. Create a zip file with the source code, with a specific name like my-code-1.0.1.zip
3. Upload the zip to an Azure Storage container
4. Change the Terraform code to point to the uploaded file
5. Run terraform apply
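Steps 2 to 5 could be scripted roughly like this. It is only a sketch under a few assumptions: the storage account mysa and the container my-ct mirror the Terraform names above, package_version is a hypothetical Terraform variable that the module would have to interpolate into the zip URL, and the az CLI is assumed to be already authenticated against the storage account (for example through AZURE_STORAGE_CONNECTION_STRING).

```shell
#!/usr/bin/env bash
# Sketch of release steps 2-5. Assumed names: storage account "mysa",
# container "my-ct", hypothetical Terraform variable "package_version".

# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

release() {
  local version="$1"
  local package="my-code-${version}.zip"
  # Write the archive outside the Function folder so it never zips itself.
  local package_path="${TMPDIR:-/tmp}/${package}"

  # Step 2: zip the Function folder (the current directory).
  run zip -r "$package_path" . -x '.git/*' || return 1

  # Step 3: upload the zip to the storage container.
  run az storage blob upload --account-name mysa --container-name my-ct \
    --name "$package" --file "$package_path" || return 1

  # Steps 4-5: point Terraform at the new file and apply.
  run terraform apply -var "package_version=${version}"
}
```

Calling `DRY_RUN=1 release 1.0.2` from the Function folder only prints the three commands, which is a cheap way to check the names before running it for real.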
Let’s assume the Function exposes a simple HTTP endpoint. If before step 4 we start sending traffic to the function – one request every half second – this is what we get:
This is my code v1.0.1! - 15:49:30
This is my code v1.0.1! - 15:49:31
This is my code v1.0.1! - 15:49:32
This is my code v1.0.1! - 15:49:32
This is my code v1.0.1! - 15:49:33
The service is unavailable. - 15:49:34
This is my code v1.0.2! - 15:49:54
This is my code v1.0.2! - 15:49:55
This is my code v1.0.2! - 15:49:56
This is my code v1.0.2! - 15:49:56
This is my code v1.0.2! - 15:49:57
During the terraform apply execution we have ~20 seconds of downtime! Not good!
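For reference, the kind of polling that produces output like the above can be sketched as a small shell function. This is not the exact script I used, just a minimal version of the idea: the URL is whatever HTTP endpoint your Function exposes, and the defaults (20 requests, half a second apart, 5 second timeout) are assumptions.

```shell
#!/usr/bin/env bash
# poll_endpoint URL [COUNT] [INTERVAL_SECONDS]
# Prints "<response body> - <HH:MM:SS>" for every request.
poll_endpoint() {
  local url="$1" count="${2:-20}" interval="${3:-0.5}"
  local i=0 body
  while [ "$i" -lt "$count" ]; do
    # --max-time keeps a hanging request from stalling the loop forever;
    # if curl itself fails (timeout, connection refused) print a marker.
    body="$(curl --silent --max-time 5 "$url")" || body="(request failed)"
    printf '%s - %s\n' "$body" "$(date +%H:%M:%S)"
    sleep "$interval"
    i=$((i + 1))
  done
}
```

Start something like `poll_endpoint "https://<your-app>.azurewebsites.net/api/<your-function>" 200` in one terminal, then run the deployment in another.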
Test #02 – Terraform + func ™️
Alright, the whole zip name changing dance must be what is causing the downtime… func will solve this! That’s what I thought.
In the second scenario, we still create the whole Function App infrastructure with Terraform, but we omit the WEBSITE_USE_ZIP setting: instead, the developer will use the func CLI to deploy the Function itself, pointing to the Function App created by Terraform.
Again, the flow would look like this:
1. Deploy infrastructure with Terraform
2. Output the Function App name, let’s say fn-app
3. Within the Function repository, use the func CLI to deploy, like func azure functionapp publish fn-app
4. Update the source code
5. Deploy again with func
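Steps 2 and 3 can be glued together so nobody has to copy the app name by hand. A minimal sketch, assuming the Terraform code lives in a separate directory and exposes an output called function_app_name (a hypothetical name, use whatever your module outputs), and a Terraform version recent enough to support -chdir and output -raw:

```shell
#!/usr/bin/env bash
# publish_function INFRA_DIR
# Run from the Function repository; INFRA_DIR is where the Terraform code lives.
# "function_app_name" is a hypothetical Terraform output name.
publish_function() {
  local infra_dir="$1" app_name
  # Read the Function App name from the Terraform state.
  app_name="$(terraform -chdir="$infra_dir" output -raw function_app_name)" || return 1
  # Deploy the Function in the current directory to that app.
  func azure functionapp publish "$app_name"
}
```

Steps 3 and 5 are then the same call, for example `publish_function ../infra`.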
And, once again, during step 5 we got almost 20 seconds of downtime :(
This is my code v1.0.1! - 15:59:20
This is my code v1.0.1! - 15:59:21
This is my code v1.0.1! - 15:59:22
This is my code v1.0.1! - 15:59:22
This is my code v1.0.1! - 15:59:23
The service is unavailable. - 15:59:24
The service is unavailable. - 15:59:25
This is my code v1.0.2! - 15:59:44
This is my code v1.0.2! - 15:59:45
This is my code v1.0.2! - 15:59:46
This is my code v1.0.2! - 15:59:46
This is my code v1.0.2! - 15:59:47
The standard way of developing and deploying Azure Functions – with the func CLI – will cause downtime D:
Although twenty seconds may be OK for you, it’s something to be aware of.
Slots to the Rescue
There is an Azure way to solve it: Slots! Basically, it allows a user to deploy different versions of a Function and hot swap them, without downtime. So one might have a Slot called prod with a Function on version 1.0.1 and another called staging with a Function on version 1.0.2. Once staging is tested and ready to go, you swap them, so prod becomes 1.0.2.
Test #03 - Using slots
1. Create the two previously mentioned slots (you need a Resource Group with a Function App already created):
    az functionapp deployment slot create --name fn-app --resource-group rg --slot staging
    az functionapp deployment slot create --name fn-app --resource-group rg --slot prod
2. Zip your Function folder and deploy it to prod (as far as I could tell at the time of writing, you can only deploy Functions to Slots with a zip file and the az CLI):
    az functionapp deployment source config-zip -g rg -n fn-app --src my-code-1.0.1.zip --slot prod
3. Change the source code and deploy it to staging, like deploying a new version:
    az functionapp deployment source config-zip -g rg -n fn-app --src my-code-1.0.2.zip --slot staging
4. Once you are happy with the staging version, swap it with prod. Think about it like promoting the new version:
    az webapp deployment slot swap -g rg -n fn-app --slot staging --target-slot prod
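Steps 3 and 4 can be chained so the swap only happens when staging actually answers. The following is a sketch, not something taken from my tests: it assumes the slot is reachable at the usual <app>-<slot>.azurewebsites.net hostname, that the Function has an anonymous HTTP route (/api/hello is a hypothetical path), and it reuses the two az commands from the steps above.

```shell
#!/usr/bin/env bash
# promote RESOURCE_GROUP APP_NAME PACKAGE_ZIP
# Assumptions: slot hostname pattern <app>-<slot>.azurewebsites.net and a
# hypothetical anonymous route /api/hello used as a smoke test.
promote() {
  local rg="$1" app="$2" package="$3"
  local staging_url="https://${app}-staging.azurewebsites.net/api/hello"

  # Step 3: deploy the new version to the staging slot.
  az functionapp deployment source config-zip -g "$rg" -n "$app" \
    --src "$package" --slot staging || return 1

  # Smoke test: --fail makes curl exit non-zero on HTTP 4xx/5xx responses.
  if ! curl --silent --fail --max-time 10 "$staging_url" > /dev/null; then
    echo "staging check failed, not swapping" >&2
    return 1
  fi

  # Step 4: promote staging to prod.
  az webapp deployment slot swap -g "$rg" -n "$app" --slot staging --target-slot prod
}
```

With the names used above, that would be `promote rg fn-app my-code-1.0.2.zip`. A single request right after deployment may hit a cold start, so a real pipeline would probably retry a few times before giving up.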
If you continuously send requests to the prod Function during the swap, you’ll see something like this:
This is my code v1.0.1! - 17:10:22
This is my code v1.0.1! - 17:10:23
This is my code v1.0.1! - 17:10:23
This is my code v1.0.1! - 17:10:24
This is my code v1.0.2! - 17:10:24
This is my code v1.0.2! - 17:10:25
This is my code v1.0.1! - 17:10:25
This is my code v1.0.1! - 17:10:25
This is my code v1.0.1! - 17:10:25
This is my code v1.0.2! - 17:10:26
This is my code v1.0.2! - 17:10:27
This is my code v1.0.2! - 17:10:27
This is my code v1.0.1! - 17:10:28
This is my code v1.0.2! - 17:10:28
This is my code v1.0.2! - 17:10:29
This is my code v1.0.1! - 17:10:30
This is my code v1.0.1! - 17:10:31
This is my code v1.0.1! - 17:10:31
This is my code v1.0.2! - 17:10:32
This is my code v1.0.2! - 17:10:32
This is my code v1.0.2! - 17:10:33
This is my code v1.0.2! - 17:10:34
...
The Function is being hot swapped, and traffic is hitting both Functions until the swap is over. DONE! You’ve migrated your function without any downtime ✨
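If you save the polling output to a file, a quick way to confirm that nothing failed during the swap is to count the responses per version. A small sketch, assuming the log lines look exactly like the ones shown in this post; any non-empty line that is not a "This is my code" response is counted as a failure.

```shell
#!/usr/bin/env bash
# summarize_poll_log: reads polling output on stdin and prints one line per
# version seen plus a "failed" counter.
summarize_poll_log() {
  awk '
    NF == 0 { next }                       # skip empty lines
    /^This is my code v/ {
      split($5, v, "!")                    # "v1.0.1!" -> "v1.0.1"
      versions[v[1]]++
      next
    }
    { failures++ }                         # e.g. "The service is unavailable."
    END {
      for (name in versions) printf "%s: %d\n", name, versions[name]
      printf "failed: %d\n", failures + 0
    }
  ' | sort
}
```

Run it as `summarize_poll_log < poll.log`. For the first two tests the failed counter would be above zero; for the slot swap it should stay at zero while both versions show up.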
Let’s use Slots everywhere, right? Well, if you can afford it, yes. As for now, Slots are only available for the Standard and Premium tiers. The only issue is the price: Functions in the Premium plan are charged by the hour instead of per execution.
So, yeah, as far as I know, Azure Functions with zero downtime is only for those who can afford it.
@edit: Changed to inform that Slots are only available in the Standard and Premium tiers.