In the serverless world, we often get the impression that our applications can scale without limits. With the right design (and enough money), this is theoretically possible. But in reality, many components of our serverless applications DO have limits. Whether these are physical limits, like network throughput or CPU capacity, or soft limits, like AWS Account Limits or third-party API quotas, our serverless applications still need to be able to handle periods of high load. And more importantly, our end users should experience minimal, if any, negative effects when we reach these thresholds.
There are many ways to add resiliency to our serverless applications, but this post is going to focus on dealing specifically with quotas in third-party APIs. We’ll look at how we can use a combination of SQS, CloudWatch Events, and Lambda functions to implement a precisely controlled throttling system. We’ll also discuss how you can implement (almost) guaranteed ordering, state management (for multi-tiered quotas), and how to plan for failure. Let’s get started!
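To make the throttling idea concrete, here is a minimal sketch of one building block: a function that a scheduled Lambda (for example, one triggered by a CloudWatch Events rule) could run on each tick to process no more than a fixed number of queued messages. The names `drain_with_quota`, `quota`, and `call_api` are hypothetical, and `sqs` is assumed to be a boto3 SQS client (or any object with the same `receive_message` and `delete_message` methods). It illustrates the approach under those assumptions and is not the exact implementation discussed in this post.

```python
from typing import Any, Callable


def drain_with_quota(sqs: Any, queue_url: str, quota: int,
                     call_api: Callable[[str], None]) -> int:
    """Process at most `quota` messages from the queue, then stop.

    `sqs` is assumed to behave like boto3.client("sqs").
    `call_api` is a hypothetical wrapper around the third-party API call.
    """
    processed = 0
    while processed < quota:
        # SQS returns at most 10 messages per receive call.
        batch_size = min(10, quota - processed)
        response = sqs.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=batch_size,
        )
        messages = response.get("Messages", [])
        if not messages:
            break  # queue is empty; nothing left to do on this tick
        for message in messages:
            call_api(message["Body"])
            # Delete only after the call succeeds, so that a failure
            # leaves the message on the queue for a later attempt.
            sqs.delete_message(
                QueueUrl=queue_url,
                ReceiptHandle=message["ReceiptHandle"],
            )
            processed += 1
    return processed
```

Running this once per schedule interval caps the third-party call rate at roughly `quota` calls per interval, regardless of how quickly messages arrive on the queue.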
We can now use Amazon Simple Queue Service (SQS) to trigger AWS Lambda functions! This is a stellar update with some key functionality that I’ve personally been looking forward to for more than 4 years. I know our customers are excited to take it for a spin, so feel free to skip to the walkthrough section below if you don’t want a trip down memory lane.
SQS was the first service we ever launched with AWS back in 2004, 14 years ago. For some perspective, the largest commercial hard drives in 2004 were around 60GB, PHP 5 came out, Facebook had just launched, the TV show Friends ended, GMail was brand new, and I was still in high school. Looking back, I can see some of the tenets that make AWS what it is today were present even very early on in the development of SQS: fully managed, network accessible, pay-as-you-go, and no minimum commitments. Today, SQS is one of our most popular services used by hundreds of thousands of customers at absolutely massive scales as one of the fundamental building blocks of many applications.
AWS Lambda, by comparison, is a relatively new kid on the block, having been released at AWS re:Invent in 2014 (I was in the crowd that day!). Lambda is a compute service that lets you run code without provisioning or managing servers, and its launch kicked off the serverless revolution. It saw immediate adoption across a wide array of use cases, from web and mobile backends to IT policy engines to data processing pipelines. Today, Lambda supports Node.js, Java, Go, C#, and Python runtimes, letting customers minimize changes to existing codebases and giving them flexibility to build new ones. Over the past 4 years we’ve added a large number of features and event sources for Lambda, making it easier for customers to just get things done. By adding support for SQS to Lambda, we’re removing a lot of the undifferentiated heavy lifting of running a polling service or creating an SQS to SNS mapping.
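As a rough illustration of what an SQS-triggered function looks like, the sketch below shows a Python handler that reads each message from the `Records` list Lambda passes in. The `process` function and the `order_id` field are hypothetical placeholders for whatever your messages contain.

```python
import json
from typing import Any, Dict


def process(payload: Dict[str, Any]) -> None:
    """Hypothetical business logic; replace with real work."""
    print("processing order", payload.get("order_id"))


def handler(event: Dict[str, Any], context: Any) -> Dict[str, int]:
    """Entry point for a Lambda function triggered by an SQS queue."""
    records = event.get("Records", [])
    for record in records:
        # Each record carries the raw message text in its "body" field.
        payload = json.loads(record["body"])
        process(payload)
    # Returning normally signals success for the whole batch; raising an
    # exception instead makes the messages visible on the queue again.
    return {"processed": len(records)}
```

Lambda polls the queue on your behalf and hands messages to the handler in batches, so the function itself contains no polling code.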
Let’s take a look at how this all works.
I’ve been helping out on a project recently where we’re doing a number of integrations with third-party services. The integration platform is built on AWS Lambda and the Serverless framework.
Aside from the data hygiene questions that you might expect in an integration project like this, one of the first things we’ve run into is a fundamental constraint in productionizing Lambda-based systems. As of today, AWS Lambda has the following limits (among others):
- max of 1000 concurrent executions per region (a soft limit that can be increased), and
- max duration of 5 minutes for a single execution
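One common way to live within the duration limit is to check how much time the invocation has left and stop early, handing back whatever was not processed. The sketch below assumes the standard Lambda `context` object, which exposes `get_remaining_time_in_millis()`. The function name, the `work` callback, and the 10-second safety margin are hypothetical choices for illustration.

```python
from typing import Any, Callable, List, Sequence

# Hypothetical safety margin: stop when under 10 seconds remain.
SAFETY_MARGIN_MS = 10_000


def process_within_time_limit(items: Sequence[Any], context: Any,
                              work: Callable[[Any], None],
                              margin_ms: int = SAFETY_MARGIN_MS) -> List[Any]:
    """Run `work` on each item until the invocation is nearly out of time.

    Returns the items that were NOT processed, so the caller can
    re-queue them or pass them to a follow-up invocation.
    """
    for index, item in enumerate(items):
        if context.get_remaining_time_in_millis() < margin_ms:
            return list(items[index:])
        work(item)
    return []
```

The leftover items could, for example, be written back to a queue so that a later invocation picks them up.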
In the last couple of years, we have observed the evolution of several message brokers and queuing services, all of which are fast, reliable, and scalable. While the list is long, in this blog I will limit the discussion to SQS, Kinesis, and Kafka. Simple Queue Service (SQS) is a fully managed and scalable queuing service on AWS. Kinesis is another service offered by AWS that makes it easy to load and analyze streaming data and also provides the ability to build custom streaming data applications for special requirements. Apache Kafka is a fast, scalable, durable, and fault-tolerant publish-subscribe messaging system that is often used in place of traditional message brokers like JMS and AMQP because of its higher throughput, reliability, and replication.
While making decisions about which messaging system is right for you, it is important to understand not only the technical differences but also the implications of operational costs both in terms of running them at scale as well as monitoring them. In this blog, I will touch upon our experiences and learning at OpsClarity, based on our evaluation of messaging systems and our migration from SQS to Kafka.
The Gold Standard for modern cloud-native applications is a serverless architecture. AWS Lambda allows you to implement scalable and fault tolerant applications without the need of a single virtual machine.
A serverless infrastructure based on AWS Lambda has two key benefits:
- You don’t need to manage a fleet of virtual machines anymore.
- Deploying new versions of your code can be entirely controlled by API calls.
This article shows you how to process asynchronous tasks in a serverless way. Possible use cases are sending out massive amounts of emails, transcoding video files after upload, or analyzing user behavior. An SQS queue will be used to decouple your microservice from other parts of your system. You’ll learn how to implement the microservice with AWS Lambda.
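The decoupling step might look something like the following sketch, in which the producing side of the system only places a task description on the queue and returns immediately. The `enqueue_task` helper and the task fields are hypothetical, and `sqs` is assumed to be a boto3 SQS client created elsewhere, for example with `boto3.client("sqs")`.

```python
import json
from typing import Any, Dict


def enqueue_task(sqs: Any, queue_url: str, task: Dict[str, Any]) -> str:
    """Serialize a task as JSON and place it on the queue.

    `sqs` is assumed to behave like boto3.client("sqs").
    Returns the message ID that SQS assigns to the new message.
    """
    response = sqs.send_message(
        QueueUrl=queue_url,
        MessageBody=json.dumps(task),
    )
    return response["MessageId"]
```

The caller does not wait for the task to finish, so a slow or failing worker does not block the part of the system that produces tasks.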
The advantages of a serverless architecture are, at this point, not really a matter of debate. The question for every application or component becomes, “How can I avoid having to manage servers?” Sometimes you come across a roadblock: Perhaps you need a GPU; it takes 60 seconds just to load a machine learning model; maybe your task takes longer than the 300 seconds Amazon gives you for a Lambda process and you can’t figure out how to chop it up. The excuses never end.
Perhaps you want to push events into a browser or app through a WebSocket to create something similar to a chat or email application. You could use Nginx and Redis to create topics and have applications subscribe to them via a push stream; however, that means managing some long-running processes and servers. You can fake it and pound your backend once a second, but Amazon SQS and Cognito offer an easier way. Each user session can be paired with a Cognito identity and an SQS queue, meaning applications can use SQS long-polling to receive events in real time. At Reuters, we use this in production to support messaging in event-driven web applications and have open-sourced the underlying Serverless stack.
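To show what long-polling means in practice, here is a minimal Python sketch of a single poll against a per-session queue. It is not the open-sourced stack mentioned above. The `poll_events` helper is hypothetical, and `sqs` is assumed to be a boto3 SQS client created with credentials that are allowed to read the session's queue.

```python
from typing import Any, List


def poll_events(sqs: Any, queue_url: str, wait_seconds: int = 20) -> List[str]:
    """Perform one long poll and return the bodies of any messages received.

    `sqs` is assumed to behave like boto3.client("sqs").
    """
    response = sqs.receive_message(
        QueueUrl=queue_url,
        # The call waits up to this long for a message to arrive
        # (SQS allows at most 20 seconds) instead of returning at once.
        WaitTimeSeconds=wait_seconds,
        MaxNumberOfMessages=10,
    )
    bodies = []
    for message in response.get("Messages", []):
        bodies.append(message["Body"])
        # Remove the message so it is not delivered to this session again.
        sqs.delete_message(
            QueueUrl=queue_url,
            ReceiptHandle=message["ReceiptHandle"],
        )
    return bodies
```

Calling this in a loop yields events shortly after they are published, while an idle session makes only one request every `wait_seconds` rather than one per second.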
By default, a failed Lambda function invoked asynchronously is retried twice, and then the event is discarded. Using Dead Letter Queues (DLQ), you can indicate to Lambda that unprocessed events should be sent to an Amazon SQS queue or Amazon SNS topic instead, where you can take further action.
You configure a DLQ by setting a Lambda function’s DeadLetterConfig parameter to the Amazon Resource Name (ARN) of the Amazon SNS topic or Amazon SQS queue where you want the event payload delivered, as shown in the following code. For more information about creating an Amazon SNS topic, see Create an SNS Topic. For more information about creating an Amazon SQS queue, see Tutorial: Creating an Amazon SQS Queue.
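A minimal sketch of setting this parameter from Python follows, assuming `lambda_client` is a boto3 Lambda client created elsewhere, for example with `boto3.client("lambda")`. The `configure_dlq` helper, the function name, and the ARN in the usage note are hypothetical placeholders.

```python
from typing import Any, Dict


def configure_dlq(lambda_client: Any, function_name: str,
                  target_arn: str) -> Dict[str, Any]:
    """Point a function's dead letter queue at an SQS queue or SNS topic.

    `lambda_client` is assumed to behave like boto3.client("lambda").
    `target_arn` is the ARN of the queue or topic that should receive
    events the function failed to process.
    """
    return lambda_client.update_function_configuration(
        FunctionName=function_name,
        DeadLetterConfig={"TargetArn": target_arn},
    )
```

For example, `configure_dlq(client, "my-function", "arn:aws:sqs:us-east-1:123456789012:my-dlq")` would route failed asynchronous events to that queue, provided the function’s execution role is allowed to send messages to it.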