Reverse engineering AWS Lambda

So I have been spending some time jamming my hands into AWS Lambda’s greasy internals, and I’d like to share all the wonderful details I’ve discovered.

Why though?

I’ve used AWS Lambda quite extensively at work, and I wanted to get a better understanding of its inner workings. What prompted this, you might ask?

Unofficial Native Go Runtime for Google Cloud Functions

There was an offhand comment by the author about the “Lambda API being a bit more complex.”

Well, I aim to find out just how complex it is, with the end goal of writing a custom runtime, similar to the one above.

Probably in Python, just because it’s quick to prototype with.

Let’s get started, shall we?

For the impatient among you, if you just want to see the results, feel free to look at the code here.

A Decade of Dynamo: Powering the next wave of high-performance, internet-scale applications

Today marks the 10-year anniversary of Amazon’s Dynamo whitepaper, a milestone that made me reflect on how much innovation has occurred in the area of databases over the last decade. It is also a good reminder of why taking a customer-obsessed approach to solving hard problems can have a lasting impact beyond your original expectations.

“Dynamo: Amazon’s Highly Available Key-value Store” received the ACM SIGOPS 2017 Hall of Fame Award

S3 Bucket Security: More Than ACLs and Policies

Many companies are suffering data breaches because attackers gain access to data in AWS S3 buckets. I don’t want to repeat all the news articles outlining all the S3 data breaches. A Google search will give many examples, and it seems like by the time I write this another one will be in the news. Instead, I’d like to jump to why these S3 bucket breaches are happening and how to securely store data in an S3 bucket.

How Reuters Replaced WebSockets with Amazon Cognito and SQS

The advantages of a serverless architecture are, at this point, not really a matter of debate. The question for every application or component becomes, “How can I avoid having to manage servers?” Sometimes you come across a roadblock: Perhaps you need a GPU; it takes 60 seconds just to load a machine learning model; maybe your task takes longer than the 300 seconds Amazon gives you for a Lambda process and you can’t figure out how to chop it up. The excuses never end.

Perhaps you want to push events into a browser or app through a WebSocket to create something similar to a chat or email application. You could use Nginx and Redis to create topics and have applications subscribe to them via a push stream; however, that means managing some long-running processes and servers. You can fake it and pound your backend once a second, but Amazon SQS and Cognito offer an easier way. Each user session can be paired with a Cognito identity and an SQS queue, meaning applications can use SQS long-polling to receive events in real time. At Reuters, we use this in production to support messaging in event-driven web applications and have open-sourced the underlying Serverless stack.
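To make the long-polling idea concrete, here is a minimal sketch, not the Reuters implementation. It assumes `sqs` is an SQS client such as the one returned by `boto3.client("sqs")`, created with the temporary credentials of the user's Cognito identity; `poll_events`, `handle`, and `max_polls` are hypothetical names chosen for illustration.

```python
def poll_events(sqs, queue_url, handle, max_polls=1, wait_seconds=20):
    """Long-poll one user's queue and pass each message body to handle().

    `sqs` is assumed to expose receive_message/delete_message the way a
    boto3 SQS client does. Returns the number of messages delivered.
    """
    delivered = 0
    for _ in range(max_polls):
        # A WaitTimeSeconds above 0 turns on long polling: the call blocks
        # until a message arrives or the wait (at most 20 seconds) expires.
        response = sqs.receive_message(
            QueueUrl=queue_url,
            WaitTimeSeconds=wait_seconds,
            MaxNumberOfMessages=10,
        )
        # The "Messages" key is absent when the wait timed out empty.
        for message in response.get("Messages", []):
            handle(message["Body"])
            # Delete after handling so the message is not redelivered.
            sqs.delete_message(
                QueueUrl=queue_url,
                ReceiptHandle=message["ReceiptHandle"],
            )
            delivered += 1
    return delivered
```

Because each call blocks for up to 20 seconds, a client that simply calls this in a loop makes a handful of requests per minute when idle instead of one per second, yet still sees new events almost as soon as they are queued.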

Amazon DynamoDB Accelerator (DAX) – In-Memory Caching for Read-Intensive Workloads

I’m fairly sure that you already know about Amazon DynamoDB. As you probably know, it is a managed NoSQL database that scales to accommodate as much table space, read capacity, and write capacity as you need. With response times measured in single-digit milliseconds, our customers are using DynamoDB for many types of applications including adtech, IoT, gaming, media, online learning, travel, e-commerce, and finance. Some of these customers store more than 100 terabytes in a single DynamoDB table and make millions of read or write requests per second. The Amazon retail site relies on DynamoDB and uses it to withstand the traffic surges associated with brief, high-intensity events such as Black Friday, Cyber Monday, and Prime Day.

While DynamoDB’s ability to deliver fast, consistent performance benefits just about any application and workload, there’s always room to do even better. The business value of some workloads (gaming and adtech come to mind, but there are many others) is driven by low-latency, high-performance database reads. The ability to pull data from DynamoDB as quickly as possible leads to faster & more responsive games or ads that drive the highest click-through rates.

Drawing lessons from the “Bezos Way”

The Amazon CEO’s annual letter to his shareholders is a must-read. Customer focus, decision-making or the importance of writing down important things… Here are my takeaways from Jeff’s latest.

Whatever we think of its founder and CEO, Amazon remains a remarkable example of great management. Since its 1994 start, the company has enjoyed steady growth, relentlessly conquering new markets and sectors, coupled with the exceptional resilience it showed in weathering two market crashes (2000 and 2008). In addition, Bezos has demonstrated a consistent ability to convince his board and shareholders to let expansion take precedence over profits and dividends. (No one can complain: a thousand dollars invested in Amazon’s 1997 IPO is now worth more than half a million, a 500x multiple).

This didn’t happen without damage. By some measures, Amazon isn’t an enviable place to work and the pressure it applies to its suppliers rivals the iron fist of Walmart’s purchasing department. All things considered, Amazon’s level of corporate toxicity remains reasonable compared to Uber, as an example.

Jeff Bezos is also able to project an ultra-long-term vision with his space exploration project, in which he personally invests about a billion dollars per year.

Closer to our concerns, he has boosted a respected but doomed news institution — The Washington Post — thanks to a combined investment in journalistic excellence and in technology, two areas left fallow by most publishers.

That is why I thought Bezos’ written addresses to his shareholders (here) are worth some exegesis.

Let’s start with last week’s letter. (Emphasis mine, and while quotes are lifted from the original documents, some paragraphs have been rearranged for clarity and brevity).

Bezos starts his 2016 missive with a question asked by staffers at all-hands meetings:

“Jeff, what does Day 2 look like? (…) [Bezos reply:] Day 2 is stasis. Followed by irrelevance. Followed by excruciating, painful decline. Followed by death. And that is why it is always Day 1.”

Then he enumerates the three obsessions that make Amazon what it is today:…

My AWS Wishlist for 2017

As a developer working on a 100% serverless application, I find myself wanting more so I can do with less…

Amazon Web Services (AWS) is well known for listening to customer feedback. This has been evident in the features they have delivered for their Serverless platform.

But as a developer working on a 100% serverless application, I find myself wanting more. Unfortunately I can’t fit all my requests into 140 characters. So I decided to write a blog post instead.

New AWS Encryption SDK for Python Simplifies Multiple Master Key Encryption

The AWS Cryptography team is happy to announce a Python implementation of the AWS Encryption SDK. This new SDK helps manage data keys for you, and it simplifies the process of encrypting data under multiple master keys. As a result, this new SDK allows you to focus on the code that drives your business forward. It also provides a framework you can easily extend to ensure that you have a cryptographic library that is configured to match and enforce your standards. The SDK also includes ready-to-use examples. If you are a Java developer, you can refer to this blog post to see specific Java examples for the SDK.

In this blog post, I show you how you can use the AWS Encryption SDK to simplify the process of encrypting data and how to protect your encryption keys in ways that help improve application availability by not tying you to a single region or key management solution.

How does the AWS Encryption SDK help me?

Developers using encryption often face three problems:

  1. How do I correctly generate and use a data key to encrypt data?
  2. How do I protect the data key after it has been used?
  3. How do I store the data key and ciphertext in a portable manner?

The library provided in the AWS Encryption SDK addresses the first problem by implementing the low-level envelope encryption details transparently using the cryptographic provider available in your development environment. The library helps address the second problem by providing intuitive interfaces to let you choose how you want to generate data keys and the master keys or key-encrypting keys that will protect data keys. Developers can then focus on the core of the application they are building instead of on the complexities of encryption. The ciphertext addresses the third problem, as described later in this post.

The AWS Encryption SDK defines a carefully designed and reviewed ciphertext data format that supports multiple secure algorithm combinations (with room for future expansion) and has no limits on the types or algorithms of the master keys. The ciphertext output of clients (created with the SDK) is a single binary blob that contains your encrypted message and one or more copies of the data key, as encrypted by each master key referenced in the encryption request. This single ciphertext data format for envelope-encrypted data makes it easier to ensure the data key has the same durability and availability properties as the encrypted message itself.
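The envelope structure described above can be illustrated with a small standard-library sketch. This is not the AWS Encryption SDK API or its real ciphertext format: `_toy_xor` is a deliberately simple, insecure stand-in for a real cipher, the dict stands in for the SDK's single binary blob, and all function and field names are hypothetical. The only point is the shape: one data key encrypts the message, and a copy of that data key is stored encrypted under each master key.

```python
import hashlib
import os


def _toy_xor(key: bytes, data: bytes) -> bytes:
    """Toy SHA-256 keystream cipher. NOT secure; illustration only.

    XOR with the same key both encrypts and decrypts.
    """
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def envelope_encrypt(plaintext: bytes, master_keys: dict) -> dict:
    """Encrypt once under a fresh data key, then wrap that key per master key."""
    data_key = os.urandom(32)
    return {
        # One encrypted copy of the data key for every master key supplied.
        "encrypted_data_keys": {
            key_id: _toy_xor(master_key, data_key)
            for key_id, master_key in master_keys.items()
        },
        "ciphertext": _toy_xor(data_key, plaintext),
    }


def envelope_decrypt(message: dict, key_id: str, master_key: bytes) -> bytes:
    """Unwrap the data key with any one master key, then decrypt the message."""
    data_key = _toy_xor(master_key, message["encrypted_data_keys"][key_id])
    return _toy_xor(data_key, message["ciphertext"])
```

Any single master key is enough to decrypt, which is why encrypting under master keys in two regions keeps the data readable if one region or key management solution is unavailable.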

The AWS Encryption SDK provides production-ready reference implementations in Java and Python with direct support for key providers such as AWS Key Management Service (KMS). The Java implementation also supports the Java Cryptography Architecture (JCA/JCE) natively, which includes support for AWS CloudHSM and other PKCS #11 devices. The standard ciphertext data format the AWS Encryption SDK defines means that you can use combinations of the Java and Python clients for encryption and decryption as long as they each have access to the key provider that manages the correct master key used to encrypt the data key.

Let’s look at how you can use the AWS Encryption SDK to simplify the process of encrypting data and how to protect your data keys in ways that help improve application availability by not tying you to a single region or key management solution.

CloudWatch Events Now Supports AWS Step Functions as a Target

The Amazon CloudWatch Events service now supports AWS Step Functions state machines as event targets. Amazon CloudWatch Events enables you to respond quickly to application availability issues or configuration changes that might impact performance or security by notifying you of AWS resource changes in near-real-time. You simply write rules to indicate which events are of interest to your application and what automated action to take when a rule matches an event. You can, for example, invoke AWS Lambda functions or notify an Amazon Simple Notification Service (SNS) topic. Now, you can also send the matching events to an AWS Step Functions state machine to start a workflow responding to the event of interest, such as managing copies of Amazon Elastic Block Store (EBS) snapshots upon snapshot completion.

You may also schedule execution of AWS Step Functions state machines at intervals down to 1-minute to automate processes such as synchronizing S3 buckets nightly.
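As a rough sketch of the scheduled case, the function below creates a rule and attaches a state machine as its target. It assumes `events` is a CloudWatch Events client such as `boto3.client("events")`; the function name, rule name, and ARNs are hypothetical placeholders, and the IAM role is assumed to already allow CloudWatch Events to start executions of the state machine.

```python
def schedule_state_machine(events, rule_name, schedule, state_machine_arn, role_arn):
    """Create a scheduled rule and point it at a Step Functions state machine.

    `events` is assumed to expose put_rule/put_targets the way a boto3
    CloudWatch Events client does.
    """
    # The schedule is a rate or cron expression, for example
    # "rate(1 minute)" or "cron(0 2 * * ? *)" for 02:00 UTC every night.
    events.put_rule(
        Name=rule_name,
        ScheduleExpression=schedule,
        State="ENABLED",
    )
    # The target is the state machine ARN; RoleArn names the role that
    # CloudWatch Events assumes to start the execution.
    events.put_targets(
        Rule=rule_name,
        Targets=[
            {
                "Id": "state-machine-target",
                "Arn": state_machine_arn,
                "RoleArn": role_arn,
            }
        ],
    )
```

For the event-driven case, such as reacting to snapshot completion, the rule would be created with an `EventPattern` instead of a `ScheduleExpression`; the target definition stays the same.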

AWS Step Functions is available in the US East (N. Virginia), US East (Ohio), US West (Oregon), EU (Ireland), EU (Frankfurt), and Asia Pacific (Tokyo) regions.

Please visit our website for more information on Amazon CloudWatch Events and AWS Step Functions: