Channel: Andrew Lock | .NET Escapades

Using Azure Storage Queue messages with Azure Functions and [QueueTrigger]

In this post I discuss the Azure Storage Queue message queue service and how it compares to Azure Service Bus in terms of functionality. After comparing the services, I describe when you might want to choose the simpler Storage Queue.

In the second half of the post I show how you can read messages from an Azure Storage Queue using Azure Functions and the [QueueTrigger] queue extension. I describe the various types you can bind to, why I favour QueueMessage, and how to configure a connection string. Finally I describe how the Functions app processes your messages, and how you can customize that behaviour.

Before we get to the Azure Functions, we'll start by looking at Azure Storage Queue: what is it, how does it compare to Azure Service Bus, and when should you use it?

Azure Storage Queue vs. Azure Service Bus

Azure Storage Queue is an Azure service for storing large numbers of messages, which can then be retrieved over HTTP/HTTPS and processed asynchronously. Azure Storage Queue is focused on storing relatively small messages (a maximum of 64 KB) but millions of them (up to 500 TB!). You can think of this as a simple queue where senders can write messages, and receivers can compete to pull messages off the queue for processing.

Diagram of Azure Storage Queue

The main "competing" service in Azure that performs a similar role is Azure Service Bus, which is an enterprise message broker with message queues (for point-to-point communication, like Azure Storage Queues) and publish-subscribe (pub-sub) topics (for multiple subscribers). The following shows how the pub-sub behaviour works, where each subscriber can see and process all the messages that are sent to a topic:

Diagram of Azure Service Bus with topics and subscriptions

Azure Service Bus also has features such as automatic dead-lettering, duplicate detection, and strict First-In-First-Out (FIFO) ordering guarantees.

One of the big struggles whenever you have a green-field application is deciding which of the many similar services to choose. For example, Azure has many different messaging services: Storage Queues, Service Bus, Event Grid, Event Hubs, the list goes on! Thankfully the Azure documentation does include comparison pages to help you understand the differences between the various options!

Azure Storage Queue and Azure Service Bus are the main "traditional" message queue products in Azure, so it's worth taking some time to decide which is best for your application. The documentation breaks down all the differences between the two services, but that's a 3,000-word document to read and try to understand 😅

For me, I tried to just consider it from a high level. There are some obvious key differences between the two technologies:

Feature                           | Storage Queue | Service Bus
Communication                     | HTTP/HTTPS    | AMQP
Pub-Sub support                   | No            | Yes (Topics/Subscriptions)
Ordering guarantees               | No            | Yes (FIFO)
Maximum queue size                | 500 TB        | 80 GB
Atomic multi-message transactions | No            | Yes
Duplicate detection               | No            | Yes

Even from this massively simplified comparison, you can tell that Azure Service Bus has many more capabilities than Azure Storage Queue. Atomic transactions (so you can mark a message as handled and send a new message atomically) and pub-sub support stand out to me; if you need those features from your queue, then the choice is obvious.

However, if you don't need all those "message broker" features, then Azure Storage Queue may be a better option. One thing in Azure Storage Queue's favour is the pricing. For storage queue it's relatively simple (for Azure):

  • You pay for the storage you use (e.g. $0.0462 per GB per month)
  • You pay for the number of API operations you make ($0.04 per million)

For Azure Service Bus, you have to opt in to tiers depending on which features you want, and then how you're billed varies per tier. The "basic" tier is the most similar to Azure Storage Queue (in that it doesn't support transactions/topics etc), but even that likely works out more expensive generally ($0.05 per million operations). So in that case, you might be better off just using Storage Queues for their simplicity.

An obvious Storage Queue use case: processing email status messages

I had a use case that involved handling and processing email status notifications. The result of an "email send" operation would be added to a queue, and I needed my application to process the message and act accordingly (e.g. mark a subscriber as "unsubscribed" if the email bounced, was marked as spam, or couldn't be sent).

This example seemed like a perfect use case for Azure Storage Queue:

  • The order that the messages are processed doesn't matter.
  • Each message only needs to be processed by a single consumer, so no need for pub-sub features.
  • No specific requirements around TTL or other advanced features.

For this use case, the basic features of the Azure Storage Queue are perfectly adequate. Additionally, creating and managing a storage queue is significantly easier and cheaper than Azure Service Bus, so the choice is obvious.

Reading messages from Azure Storage Queue with Azure Functions

I've spent a lot of time talking about Azure Storage Queues in general. In this section we'll look at some code for reading messages from a queue using Azure Functions' storage queue trigger.

All of the examples I show in this post use the isolated-process model instead of the in-process model, as support for the in-process model ends in 2026.

As with most Azure Functions trigger types, you can register a trigger by creating a function, referencing the required NuGet packages, adding a specific attribute to one of your function parameters, and configuring your connection string.

I walk through each of those steps in the following section. I don't discuss creating the Azure Functions project, so if you don't yet have an app, follow the instructions to create one in the documentation using Visual Studio, Visual Studio Code, or the Azure Functions Core Tools command line.

1. Reference the required NuGet packages

To access the queue storage trigger attribute, add a reference to the Microsoft.Azure.Functions.Worker.Extensions.Storage.Queues package, for example using:

dotnet add package Microsoft.Azure.Functions.Worker.Extensions.Storage.Queues

This adds the package to your .csproj project file, which should look something like this:

<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net8.0</TargetFramework>
    <AzureFunctionsVersion>v4</AzureFunctionsVersion>
    <Nullable>enable</Nullable>
    <LangVersion>latest</LangVersion>
  </PropertyGroup>

  <ItemGroup>
    <!-- Required packages for the isolated-process functions -->
    <PackageReference Include="Microsoft.Azure.Functions.Worker" Version="1.22.0" />
    <PackageReference Include="Microsoft.Azure.Functions.Worker.Sdk" Version="1.17.2" />
    <!-- 👇Add this to enable storage queue triggers-->
    <PackageReference Include="Microsoft.Azure.Functions.Worker.Extensions.Storage.Queues" Version="5.4.0" />
  </ItemGroup>

  <ItemGroup>
    <None Update="host.json">
      <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
    </None>
    <None Update="local.settings.json">
      <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
      <CopyToPublishDirectory>Never</CopyToPublishDirectory>
    </None>
  </ItemGroup>
</Project>

With this package added, the QueueTrigger attribute becomes available.

2. Create a function and use the [QueueTrigger] attribute

Create an Azure Function, and include a parameter with the [QueueTrigger] attribute. That parameter can be one of several types, each of which exposes a different view of the message:

  • string. The message content as a string. Use when the message is simple text.
  • byte[]. The raw bytes of the message.
  • BinaryData. The raw bytes of the message, wrapped in a helper type.
  • QueueMessage. The "full" message from the queue, including metadata about the message.
  • Any serializable POCO object. When a queue message contains JSON data, Functions tries to deserialize the JSON data into a plain-old CLR object (POCO) type.

Depending on what the message contains and what you want to do with it, the final two options make the most sense to me. I chose to use the high level QueueMessage, as it includes all the additional metadata for the queue message, but more importantly, it gives more control over the deserialization.

I found this particularly important, as I had multiple messages being sent to the same queue, so I couldn't use a single POCO object and rely on the framework to deserialize it for me.

The following shows an example signature of an Azure Function that receives QueueMessage instances, by decorating the message parameter with the [QueueTrigger] attribute.

[Function("handle-email-report")]  // 👈 The name of the function
public async Task HandleEmailReport([QueueTrigger("my-queue")] QueueMessage message)
                //  The name of the queue to read ☝
{
    // ...
}

In the above example I create an Azure Function called handle-email-report, which reads from the Azure storage queue called my-queue using the "default" Azure connection.
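
Because several message types share the same queue, the function body has to work out which type it received before deserializing. The following is a minimal sketch of one way to do that with System.Text.Json. The "type" discriminator property, the record shapes, and the EmailReportParser helper are all assumptions for illustration; the real message schema isn't shown in this post.

```csharp
using System;
using System.Text.Json;

// Hypothetical message shapes, invented for this sketch.
public interface IEmailReport { string Email { get; } }
public sealed record Bounced(string Email, string Reason) : IEmailReport;
public sealed record MarkedAsSpam(string Email) : IEmailReport;

public static class EmailReportParser
{
    // Allow camelCase JSON ("email") to bind to PascalCase members (Email).
    private static readonly JsonSerializerOptions Options =
        new() { PropertyNameCaseInsensitive = true };

    // Reads the assumed "type" property, then deserializes the same JSON
    // into the matching concrete record.
    public static IEmailReport Parse(string json)
    {
        using JsonDocument doc = JsonDocument.Parse(json);
        JsonElement root = doc.RootElement;

        if (!root.TryGetProperty("type", out JsonElement typeElement)
            || typeElement.ValueKind != JsonValueKind.String)
        {
            throw new JsonException("Message has no string 'type' property.");
        }

        string type = typeElement.GetString()!;
        return type switch
        {
            "bounced" => root.Deserialize<Bounced>(Options)!,
            "spam" => root.Deserialize<MarkedAsSpam>(Options)!,
            _ => throw new JsonException($"Unknown message type '{type}'."),
        };
    }
}
```

Inside the function you could then call something like EmailReportParser.Parse(message.Body.ToString()), assuming the message body is UTF-8 JSON. Throwing for an unknown type is a design choice: it makes the function fail, so the message goes through the normal retry handling rather than being silently dropped.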

3. Configure the connection string

If you use the [QueueTrigger] as I have above, the Functions app tries to read the environment configuration value called AzureWebJobsStorage. Locally, that means it uses the value stored in local.settings.json at Values.AzureWebJobsStorage. For example, for local development you might have this:

{
  "IsEncrypted": false,
  "Values": {
    "AzureWebJobsStorage": "UseDevelopmentStorage=true"
  }
}

If you want to use a different connection, you can specify that using the Connection property of the [QueueTrigger] attribute. For example, if you use this:

[Function("handle-email-report")]
public async Task HandleEmailReport(
    [QueueTrigger("my-queue", Connection = "MyConnection")] QueueMessage message)
                              // ☝ Specify alternative connection
{
    // ...
}

The Functions app will look for a configuration value called MyConnection (and also AzureWebJobsMyConnection; I'm not sure what the order of preference is there). For example:

{
  "IsEncrypted": false,
  "Values": {
    "AzureWebJobsStorage": "UseDevelopmentStorage=true",
    "MyConnection": "my-connection-string"
  }
}

In recent versions of the Storage Queue extensions NuGet package, you can alternatively use a Microsoft Entra ID identity instead of storing a connection string in the configuration. Unfortunately, I've read the documentation about that several times and I still have no idea what it's talking about, so I'll leave someone else to explain that 😅

Clearly using RBAC is the "right" thing to do rather than storing arbitrary secrets in each of your apps. But my gosh it's so much harder to get working than dropping a connection string into a config variable 🙈

Assuming you have configured your connection correctly, your Azure Function should start receiving messages!

How does the function process your messages?

When it starts up, your Functions app starts polling for and receiving messages, as described in the polling section of the documentation. The Azure Function checks for messages using an exponential back-off algorithm, ranging from 100ms up to 1 minute.
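
To get a feel for how quickly an idle queue reaches the maximum interval, here is a simplified sketch of such a back-off schedule. It is not the Functions runtime's actual code: the real algorithm is randomized, and the plain doubling factor and the PollingBackoff helper are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class PollingBackoff
{
    // Yields the wait before each successive poll of an empty queue:
    // starts at `minimum`, doubles each time, and never exceeds `maximum`.
    public static IEnumerable<TimeSpan> Delays(TimeSpan minimum, TimeSpan maximum, int polls)
    {
        TimeSpan current = minimum;
        for (int i = 0; i < polls; i++)
        {
            yield return current;
            long doubled = Math.Min(current.Ticks * 2, maximum.Ticks);
            current = TimeSpan.FromTicks(doubled);
        }
    }
}
```

With a 100ms minimum and a 1 minute maximum, this doubling schedule hits the cap on the eleventh poll (100ms, 200ms, 400ms, and so on up to 51.2s, then 1 minute), so under these assumptions an idle queue settles into once-a-minute polling within a couple of minutes.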

If the Functions app finds multiple messages in the queue, it retrieves a batch of 16 messages, and runs multiple instances of your function concurrently to process them. Once 8 of the functions have completed, the Functions app polls and tries to receive another batch of 16, so a single instance of the app can process up to 24 (16 + 8) messages concurrently.

The Functions app uses a "peek-lock" pattern to retrieve messages. This automatically marks messages as "invisible" when they're retrieved, so they can't be seen by other queue consumers. Then, one of 3 things happens:

  • The function completes successfully. In this case, the message is deleted from the queue.
  • The function execution fails (throws an exception). The message is not deleted, and its visibility is updated to make it visible to consumers again. You can optionally add a delay to this, so that failed messages remain invisible for 30s, for example.
  • The Functions app crashes. The message cannot be updated (because your app crashed!) but the message automatically becomes visible again after 10 minutes. This behaviour is built into Azure Storage Queues and can't be changed by your app.

So if the function completes successfully, the message is deemed to have been "handled", and is deleted from the queue. If your function throws an exception, the message is made visible in the queue again (with an optional delay). The Functions app will retry processing the message a further 4 times (i.e. 5 attempts in total).

If a message fails processing 5 times it's moved to a "poison messages" queue, called <queuename>-poison. You can process the messages added to this queue as you would any other queue. You might want to simply log and discard them for example, or you might want to allow moving them back to the original queue after you've fixed bugs in the original function.
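
As a sketch of the "log and discard" option, a second function can bind to the poison queue in the same way as the original. The class name, the function name, and the log message are hypothetical; the queue name assumes the my-queue example from earlier, and the usings assume the packages referenced in step 1.

```csharp
using Azure.Storage.Queues.Models;
using Microsoft.Azure.Functions.Worker;
using Microsoft.Extensions.Logging;

public class PoisonEmailReports
{
    private readonly ILogger<PoisonEmailReports> _logger;

    // The logger is supplied by the Functions host via dependency injection.
    public PoisonEmailReports(ILogger<PoisonEmailReports> logger) => _logger = logger;

    [Function("handle-email-report-poison")] // hypothetical function name
    public void Run([QueueTrigger("my-queue-poison")] QueueMessage message)
    {
        // Record enough detail to investigate the failure later.
        _logger.LogWarning(
            "Discarding poison message {MessageId}, inserted {InsertedOn}: {Body}",
            message.MessageId, message.InsertedOn, message.Body.ToString());

        // Returning without throwing marks the message as handled,
        // so it is deleted from my-queue-poison.
    }
}
```

If you would rather keep failed messages around so you can replay them after fixing a bug, you could skip this function entirely and leave them sitting in the poison queue.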

Changing the configuration of the Storage Queue trigger with host.json

Some of the behaviour of the Functions app can be customised by changing properties in your app's host.json file, as described in the documentation. Not everything can be customised, though. For example, the 10 minute visibility timeout that applies when your app crashes is enforced by the storage queue itself, so you can't customise it for your app.

The following host.json file shows the settings you can use to customize the storage queue trigger, each set to its default value:

{
  "version": "2.0",
  "extensions": {
    "queues": {
      "maxPollingInterval": "00:01:00",
      "visibilityTimeout" : "00:00:00",
      "maxDequeueCount": 5,
      "batchSize": 16,
      "newBatchThreshold": 8
    }
  }
}

These values control various aspects of the [QueueTrigger] behaviour:

  • maxPollingInterval: The maximum polling interval the Functions app will use. The minimum interval is always 100ms (and maxPollingInterval cannot be shorter than this), but there's no upper limit on the value you can set.
  • visibilityTimeout: Controls how long a message should remain invisible after a failed execution.
  • maxDequeueCount: The maximum number of times to try to handle a message before it is placed on the poison queue.
  • batchSize: The maximum number of messages the Functions app should try to retrieve and process in parallel. The maximum size is 32.
  • newBatchThreshold: When the number of remaining messages being processed from a batch reaches this value, a new batch is retrieved.

These are all the settings you can change to control how you consume Storage Queue messages, but there are many other settings you can change to control overall execution of your Azure Functions app.

Summary

In this post I discussed the Azure Storage Queue message queue service and how it compares to Azure Service Bus in terms of functionality. After comparing the services, I described an example of processing email bounce notifications as a good candidate for choosing the simpler Storage Queue service.

In the second half of the post I showed how to read messages from an Azure Storage Queue using Azure Functions and the [QueueTrigger] queue extension. Finally I described how the Functions app processes your messages, and how you can customize that behaviour.

