Automatically Archive Items to S3 Using DynamoDB Time to Live (TTL) with AWS Lambda and Amazon Kinesis Firehose

Earlier this year, Amazon DynamoDB released Time to Live (TTL) functionality, which automatically deletes expired items from your tables, at no additional cost. TTL eliminates the complexity and cost of scanning tables and deleting items that you don’t want to retain, saving you money on provisioned throughput and storage. One AWS customer, TUNE, purged 85 terabytes of stale data and reduced their costs by over $200K per year.

Today, DynamoDB made TTL better with the release of a new CloudWatch metric for tracking the number of items deleted by TTL, which is also viewable for no additional charge. This new metric helps you monitor the rate of TTL deletions to validate that TTL is working as expected. For example, you could set a CloudWatch alarm to fire if too many or too few automated deletions occur, which might indicate an issue in how you set expiration timestamps for your items.
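As a rough illustration of that alarm idea, here is a minimal Python sketch that builds the arguments for CloudWatch's `put_metric_alarm` call against the TTL deletion metric (`TimeToLiveDeletedItemCount` in the `AWS/DynamoDB` namespace). The table name, thresholds, one-hour period, and helper names are assumptions made for the example, not recommendations; pick values that match your own expected deletion rate.

```python
def ttl_deletion_alarm_params(table_name, threshold, too_many=True, period_seconds=3600):
    """Build keyword arguments for CloudWatch put_metric_alarm.

    too_many=True  -> alarm when deletions in one period exceed the threshold.
    too_many=False -> alarm when deletions in one period fall below it.
    """
    suffix = "high" if too_many else "low"
    return {
        "AlarmName": f"{table_name}-ttl-deletes-{suffix}",  # hypothetical naming scheme
        "Namespace": "AWS/DynamoDB",
        "MetricName": "TimeToLiveDeletedItemCount",
        "Dimensions": [{"Name": "TableName", "Value": table_name}],
        "Statistic": "Sum",               # total deletions per period
        "Period": period_seconds,         # assumed evaluation window
        "EvaluationPeriods": 1,
        "Threshold": float(threshold),
        "ComparisonOperator": (
            "GreaterThanThreshold" if too_many else "LessThanThreshold"
        ),
    }


def create_ttl_alarm(cloudwatch_client, table_name, threshold, too_many=True):
    """Create the alarm using a boto3 CloudWatch client passed in by the caller."""
    params = ttl_deletion_alarm_params(table_name, threshold, too_many)
    cloudwatch_client.put_metric_alarm(**params)
    return params["AlarmName"]
```

With boto3 installed and credentials configured, `create_ttl_alarm(boto3.client("cloudwatch"), "Orders", 100000)` would create the "too many deletions" variant for a hypothetical `Orders` table. You would normally also pass `AlarmActions` so the alarm notifies someone.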

In this post, I’ll walk through an example of a serverless application using TTL to automate a common database management task: moving old data from your database into archival storage automatically. Archiving old data helps reduce costs and meet regulatory requirements governing data retention or deletion policies. I’ll show how TTL, combined with DynamoDB Streams, AWS Lambda, and Amazon Kinesis Firehose, facilitates archiving data to a low-cost storage service like Amazon S3, a data warehouse like Amazon Redshift, or to Amazon Elasticsearch Service.
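To make the data flow concrete, here is a minimal Python sketch of the Lambda piece, not the exact function from the full walkthrough. It assumes the table's stream includes old item images, and it relies on the fact that TTL deletions appear in DynamoDB Streams as `REMOVE` events whose `userIdentity` has type `Service` and principal `dynamodb.amazonaws.com`. The delivery stream name `ttl-archive` and the helper names are hypothetical.

```python
import json

FIREHOSE_STREAM = "ttl-archive"  # hypothetical delivery stream name
MAX_BATCH = 500                  # PutRecordBatch accepts at most 500 records per call


def is_ttl_delete(record):
    """True if a stream record is a deletion performed by the TTL service."""
    identity = record.get("userIdentity") or {}
    return (
        record.get("eventName") == "REMOVE"
        and identity.get("type") == "Service"
        and identity.get("principalId") == "dynamodb.amazonaws.com"
    )


def to_firehose_records(event):
    """Turn the old images of TTL-deleted items into newline-delimited JSON records."""
    out = []
    for record in event.get("Records", []):
        if not is_ttl_delete(record):
            continue  # skip inserts, updates, and user-initiated deletes
        old_image = record.get("dynamodb", {}).get("OldImage")
        if old_image is None:
            continue  # stream view type does not include old images
        out.append({"Data": (json.dumps(old_image) + "\n").encode("utf-8")})
    return out


def archive(event, firehose_client, stream_name=FIREHOSE_STREAM):
    """Send expired items to Firehose in batches; returns how many were sent."""
    records = to_firehose_records(event)
    for start in range(0, len(records), MAX_BATCH):
        firehose_client.put_record_batch(
            DeliveryStreamName=stream_name,
            Records=records[start:start + MAX_BATCH],
        )
    return len(records)


def handler(event, context):
    """Lambda entry point; boto3 is imported here so the helpers run without AWS."""
    import boto3
    return archive(event, boto3.client("firehose"))
```

The items are forwarded in DynamoDB's attribute-value JSON format (for example `{"id": {"S": "1"}}`), so a downstream consumer may want to flatten them. A production version should also inspect `FailedPutCount` in the `put_record_batch` response and retry any records that were rejected.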


Evaluating Message Brokers: Kafka vs. Kinesis vs. SQS

In the last couple of years, we have observed the evolution of several message brokers and queuing services that are all fast, reliable, and scalable. While the list is long, in this blog I will limit the discussion to SQS, Kinesis, and Kafka. Simple Queue Service (SQS) is a fully managed and scalable queuing service on AWS. Kinesis is another service offered by AWS that makes it easy to load and analyze streaming data, and it also lets you build custom streaming data applications for special requirements. Apache Kafka is a fast, scalable, durable, and fault-tolerant publish-subscribe messaging system that is often used in place of traditional JMS- and AMQP-based message brokers because of its higher throughput, reliability, and replication.

When deciding which messaging system is right for you, it is important to understand not only the technical differences but also the operational costs, both of running these systems at scale and of monitoring them. In this blog, I will touch upon our experiences and lessons learned at OpsClarity, based on our evaluation of messaging systems and our migration from SQS to Kafka.