Time-series data shows a pattern of change over time. For example, you might have a fleet of Internet of Things (IoT) devices that record environmental data through their sensors, as shown in the following example graph. This data could include temperature, pressure, humidity, and other environmental variables. Because each IoT device tracks these values over regular periods, your backend needs to ingest up to hundreds, thousands, or millions of events every second.
In this blog post, I explain how to optimize Amazon DynamoDB for high-volume, time-series data scenarios. I do this by using a design pattern powered by automation and serverless computing.
More and more, AWS customers want to make their applications available to globally dispersed users by deploying their application in multiple AWS Regions. These global users expect fast application performance.
In this post, I describe how to use Amazon DynamoDB to power the database of a global backend deployed in multiple AWS Regions. I use DynamoDB global tables, which provide a fully managed, multiregion, and multimaster database so that you can deliver low-latency data access to your users no matter where they are located on the globe.
Why use a multiregion architecture?
AWS customers typically want a multiregion architecture for two reasons:
- To provide low latency and improve their app experience.
- To facilitate disaster recovery.
Over the years, customers have used Amazon DynamoDB for lots of different use cases, from building microservices and mobile backends to implementing gaming and Internet of Things (IoT) solutions. For example, Capital One uses DynamoDB to reduce the latency of their mobile applications by moving their mainframe transactions to a serverless architecture. Tinder migrated user data to DynamoDB with zero downtime, to get the scalability they need to support their global user base.
Developers sometimes need to implement business logic that requires multiple, all-or-nothing operations across one or more tables. This requirement can add unnecessary complexity to their implementation. Today, we are making these use cases easier to build on DynamoDB with native support for transactions.
This blog post will cover important considerations and strategies for choosing the right partition key while migrating from a relational database to DynamoDB. This is an important step in the design and building of scalable and reliable applications on top of DynamoDB.
What is a partition key?
DynamoDB supports two types of primary keys:
- Partition key: Also known as a hash key, the partition key is composed of a single attribute. Attributes in DynamoDB are similar in many ways to fields or columns in other database systems.
- Partition key and sort key: Referred to as a composite primary key or hash-range key, this type of key is composed of two attributes. The first attribute is the partition key, and the second attribute is the sort key
This repository provides the
dqlite Go package, which can be used to replicate a SQLite database across a cluster, using the Raft algorithm.
- No external processes needed: dqlite is just a Go library, you link it it to your application exactly like you would with SQLite.
- Replication needs a SQLite patch which is not yet included upstream.
- The Go Raft package from Hashicorp is used internally for replicating the write-ahead log frames of SQLite across all nodes.
How does it compare to rqlite?
The main differences from rqlite are:
- Full support for transactions
- No need for statements to be deterministic (e.g. you can use
- Frame-based replication instead of statement-based replication, this means in dqlite there’s more data flowing between nodes, so expect lower performance. Should not really matter for most use cases.
The Amazon DynamoDB team is back with another useful feature hot on the heels of encryption at rest. At AWS re:Invent 2017 we launched global tables and on-demand backup and restore of your DynamoDB tables and today we’re launching continuous backups with point-in-time recovery (PITR).
You can enable continuous backups with a single click in the AWS Management Console, a simple API call, or with the AWS Command Line Interface (CLI). DynamoDB can back up your data with per-second granularity and restore to any single second from the time PITR was enabled up to the prior 35 days. We built this feature to protect against accidental writes or deletes. If a developer runs a script against production instead of staging or if someone fat-fingers a DeleteItem call, PITR has you covered. We also built it for the scenarios you can’t normally predict. You can still keep your on-demand backups for as long as needed for archival purposes but PITR works as additional insurance against accidental loss of data. Let’s see how this works.
Today marks the 10 year anniversary of Amazon’s Dynamo whitepaper, a milestone that made me reflect on how much innovation has occurred in the area of databases over the last decade and a good reminder on why taking a customer obsessed approach to solving hard problems can have lasting impact beyond your original expectations.
“Dynamo: Amazon’s Highly Available Key-value Store” received the ACM SIGOPS 2017 Hall of Fame Award