The premise behind autoscaling in AWS is simple: you can maximize your ability to handle load spikes and minimize costs if you automatically scale your application out based on metrics like CPU or memory utilization. If you need 100 Docker containers to support your load during the day but only 10 when load is lower at night, running 100 containers at all times means that you’re using 900% more capacity than you need every night. With a constant container count, you’re either spending more money than you need to most of the time or your service will likely fall over during a load spike.
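The over-provisioning figure above can be checked with a few lines of arithmetic. This is only a sketch of the calculation; the function name `overprovision_percent` is made up for illustration and is not part of any AWS tooling.

```python
def overprovision_percent(provisioned: int, needed: int) -> float:
    """Return the extra capacity running, as a percentage of what is needed."""
    if needed <= 0:
        raise ValueError("needed must be positive")
    return (provisioned - needed) / needed * 100


# Figures from the example above: 100 containers running, 10 needed at night.
night_waste = overprovision_percent(provisioned=100, needed=10)
print(night_waste)  # 900.0
```

With a fixed fleet of 100 containers and a nightly need of 10, the 90 idle containers are nine times the capacity actually required, which is where the 900% comes from.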
In the serverless world, we often get the impression that our applications can scale without limits. With the right design (and enough money), this is theoretically possible. But in reality, many components of our serverless applications DO have limits. Whether these are physical limits, like network throughput or CPU capacity, or soft limits, like AWS Account Limits or third-party API quotas, our serverless applications still need to be able to handle periods of high load. And more importantly, our end users should experience minimal, if any, negative effects when we reach these thresholds.
There are many ways to add resiliency to our serverless applications, but this post is going to focus on dealing specifically with quotas in third-party APIs. We’ll look at how we can use a combination of SQS, CloudWatch Events, and Lambda functions to implement a precisely controlled throttling system. We’ll also discuss how you can implement (almost) guaranteed ordering, state management (for multi-tiered quotas), and how to plan for failure. Let’s get started!
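As a rough illustration of the throttling idea, the sketch below models it entirely in memory. It assumes a pattern where a scheduled trigger fires at a fixed interval and each firing takes at most a fixed number of messages off a queue; that is an assumption made for illustration, not necessarily the design this post arrives at. The `deque` stands in for an SQS queue, each call to the hypothetical `drain_tick` stands in for one scheduled Lambda invocation, and no AWS API is called.

```python
from collections import deque
from typing import Deque, List


def drain_tick(queue: Deque[str], quota_per_tick: int) -> List[str]:
    """Remove and return at most quota_per_tick messages, oldest first."""
    batch: List[str] = []
    while queue and len(batch) < quota_per_tick:
        batch.append(queue.popleft())
    return batch


# Five pending requests, and a made-up third-party quota of 2 calls per tick.
pending = deque(f"msg-{i}" for i in range(5))
first_batch = drain_tick(pending, quota_per_tick=2)
print(first_batch)   # ['msg-0', 'msg-1']
print(len(pending))  # 3
```

Whatever is left in the queue simply waits for the next tick, so the downstream API never sees more than the quota in any one interval.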
As we approach the end of 2018, I’m incredibly excited to announce that we at Serverless have a small gift for you: You can work with Amazon API Gateway WebSockets in your Serverless Framework applications starting right now.
But before we dive into the how-to, there are some interesting caveats that I want you to be aware of.
First, this is not supported in AWS CloudFormation just yet, though AWS has publicly stated it will be early next year! As such, we decided to implement our initial support as a plugin and keep it out of core until the official AWS CloudFormation support is added.
Second, the configuration syntax should be pretty close to what core support will use, but we make no promises that anything implemented with this plugin will carry forward after core support arrives. Once core support is added with AWS CloudFormation, you will need to recreate your API Gateway resources so that they are managed by CloudFormation. This means that any clients using your WebSocket application will need to be repointed, or you will need DNS already in place, to facilitate the cutover.
I recommend you check out my original post for a basic understanding of how WebSockets work at a technical level via connections and callbacks to the Amazon API Gateway connections management API.
With all that out of the way, go play with your new present!
More and more, AWS customers want to make their applications available to globally dispersed users by deploying their application in multiple AWS Regions. These global users expect fast application performance.
In this post, I describe how to use Amazon DynamoDB to power the database of a global backend deployed in multiple AWS Regions. I use DynamoDB global tables, which provide a fully managed, multiregion, and multimaster database so that you can deliver low-latency data access to your users no matter where they are located on the globe.
Why use a multiregion architecture?
AWS customers typically want a multiregion architecture for two reasons:
- To provide low latency and improve their app experience.
- To facilitate disaster recovery.
One of the most common pains for users of AWS Lambda is cold starts. Cold starts add unwanted delays to Lambda invocations, and in cases where a Lambda is used inside of a Virtual Private Cloud (VPC), the latency can be as high as several seconds. This practically negates the speed benefits of Lambda functions.
Fortunately, the Lambda team announced at AWS re:Invent 2018 that they are changing the architecture of Lambdas running in a VPC in order to reduce this latency and make Lambdas start much faster.
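To see when a cold start happens, a Python handler can record whether it is the first invocation in its execution environment. The sketch below relies on the fact that module-level code runs once when the environment is initialized and is skipped on warm invocations. The names `_is_cold_start` and `handler` are illustrative, and the returned dictionary shape is an assumption rather than anything Lambda requires.

```python
import time

# Module-level code runs once per execution environment, during a cold start.
_INIT_TIME = time.time()
_is_cold_start = True


def handler(event, context):
    """Report whether this invocation paid the cold-start cost."""
    global _is_cold_start
    was_cold = _is_cold_start
    _is_cold_start = False  # Later invocations in this environment are warm.
    return {
        "cold_start": was_cold,
        "seconds_since_init": time.time() - _INIT_TIME,
    }
```

Logging a flag like this per invocation makes it possible to separate cold-start latency from ordinary request latency when reading metrics.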
Have you considered introducing anomaly detection technology to your business? Anomaly detection is a technique used to identify rare items, events, or observations which raise suspicion by differing significantly from the majority of the data you are analyzing. The applications of anomaly detection are wide-ranging, including the detection of abnormal purchases or cyber intrusions in banking, spotting a malignant tumor in an MRI scan, identifying fraudulent insurance claims, finding unusual machine behavior in manufacturing, and even detecting strange patterns in network traffic that could signal an intrusion.
There are many commercial products to do this, but you can easily implement an anomaly detection system by using Amazon SageMaker, AWS Glue, and AWS Lambda. Amazon SageMaker is a fully-managed platform to help you quickly build, train, and deploy machine learning models at any scale. AWS Glue is a fully-managed ETL service that makes it easy for you to prepare your data and models for analytics. AWS Lambda is a well-known serverless real-time platform. Using these services, your model can be automatically updated with new data, and the new model can be used to alert for anomalies in real time with better accuracy.
In this blog post I’ll describe how you can use AWS Glue to prepare your data and train an anomaly detection model using Amazon SageMaker. For this exercise, I’ll store a sample of the NAB NYC Taxi data in Amazon DynamoDB to be streamed in real time using an AWS Lambda function.
The solution that I describe provides the following benefits:
- You can make the best use of existing resources for anomaly detection. For example, if you have been using Amazon DynamoDB Streams for disaster recovery (DR) or other purposes, you can use the data in that stream for anomaly detection. In addition, stand-by storage usually has low utilization, so the data sitting in it can be put to work as training data.
- You can automatically retrain the model with new data on a regular basis with no user intervention.
- You can easily use Random Cut Forest, a built-in Amazon SageMaker algorithm. Amazon SageMaker offers flexible distributed training options that adjust to your specific workflows in a secure and scalable environment.
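To make the notion of an anomaly concrete, here is a minimal sketch that flags values lying far from the mean. It uses a plain z-score test as a simple stand-in; it is not Random Cut Forest and not the SageMaker API. The function name, the threshold of 2.0, and the sample numbers are all made up for illustration.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def find_anomalies(values: Sequence[float], threshold: float = 2.0) -> List[float]:
    """Return values whose distance from the mean exceeds threshold std devs."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # All values are identical, so nothing stands out.
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Made-up ride counts with one obvious spike.
rides = [10, 11, 9, 10, 12, 10, 11, 50]
print(find_anomalies(rides))  # [50]
```

A z-score test like this assumes roughly bell-shaped data and is easily distorted by the outliers it is looking for, which is one reason to prefer a purpose-built algorithm for real workloads.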
Serverless architecture is the new kid on the block, and according to a recent survey by Serverless, Inc., a vast majority of developers will start using it by the end of the year. The serverless paradigm involves running code in the cloud without managing any servers, allowing you to build business logic and create value without ever thinking about the infrastructure or underlying software. Essentially, it lets you focus on your code.
Serverless covers not only AWS Lambda and other FaaS providers but basically everything you can use to run code, host files, and store images and data. This means that you, as an engineer, don’t need to manage, scale, or operate any servers whatsoever. And here’s the icing on the cake: you only pay for the time your code is running!
Although serverless offers many benefits, there are still some pitfalls, such as latency. In this article, we’ll discuss how to minimize latency in AWS Lambda. This dreaded phenomenon is caused by cold starts, which are, by definition, slower initial responses from your serverless APIs.
Before we begin, let’s dig deeper into what FaaS is and how it works.