Writing a cron job microservice with Serverless and AWS Lambda

We recently had a situation where we needed to create a new cron job to fetch all users from our database who are coming to the end of their trial and insert them into our customer.io database. Cron jobs are easy to write, but difficult to setup. You can edit /etc/crontab on the server; if you’re using heroku you can use their Scheduler; or you can use some implementation of cron in your programming language of choice (e.g. Node.js).

The cron job that we needed to write was unrelated to our application code, so whilst we could have put the functionality in there it seemed like the wrong place. Alternatively we could have put the code onto a new server. This would mean provisioning a new box for something that is only going to run once a day for 10 seconds. This seems very wasteful and expensive.


The Algorithms Behind Probabilistic Programming

We recently introduced our report on probabilistic programming. The accompanying prototype allows you to explore the past and future of the New York residential real estate market.

This post gives a feel for the content in our report by introducing the algorithms and technology that make probabilistic programming possible. We’ll dive even deeper into these algorithms in conversation with the Stan Group Tuesday, February 7 at 1 pm ET/10am PT. Please join us!


How both TCP and Ethernet checksums fail

At Twitter, a team had a unusual failure where corrupt data ended up in memcache. The root cause appears to have been a switch that was corrupting packets. Most packets were being dropped and the throughput was much lower than normal, but some were still making it through. The hypothesis is that occasionally the corrupt packets had valid TCP and Ethernet checksums. One “lucky” packet stored corrupt data in memcache. Even after the switch was replaced, the errors continued until the cache was cleared. [Update 2016-02-12: Root cause found: this also involved a kernel bug!]

I was very excited to hear about this error, because it is a real-world example of something I wrote about seven years ago: The TCP checksum is weak. However, the Ethernet CRC is strong, so how could a corrupt packet pass both checks? The answer is that the Ethernet CRC is recalculated by switches. If the switch corrupts the packet and it has the same TCP checksum, the hardware blindly recalculates a new, valid Ethernet CRC when it goes out.

As Mark Callaghan pointed out, this is a very rare scenario and you should never blame the network without strong evidence. However, it isn’t impossible and others have written about similar incidents. My conclusion is that if you are creating a new network protocol, please append a 4 byte CRC (I suggest CRC32C, implemented in hardware on recent Intel, AMD, and ARM CPUs). An alternative is to use an encryption protocol (e.g. TLS), since they include cryptographic hashes (which fixed a similar incident).

The rest of this article describes the details about how this is possible, mostly so I don’t forget them.


Serverless at re:Invent 2016 – Wrap-up

The re:Invent 2016 conference was an exciting week to be working on serverless at AWS. We announced new features like support for C# and dead letter queues, and launched new application constructs with Lambda such as Lambda@Edge, AWS Greengrass, Amazon Lex, and AWS Step Functions. In addition we also added support for surfacing services built using API Gateway in the AWS marketplace, expanded the capabilities for custom authorizers, and launched a reference developer portal for managing APIs. Catch up on all the great re:Invent launches here.

In addition to the serverless mini-con with deep dive talks and best practices, we also had deep customer talks by folks from Thomson Reuters, Vevo, Expedia, and FINRA. If you weren’t able to attend the mini-con or missed a specific session, here is a quick link to the entire Serverless Mini Conference Playlist. Other interesting sessions from other tracks are listed below.

Individual Sessions from the Mini Conference

Other Interesting Sessions