A Decade of Dynamo: Powering the next wave of high-performance, internet-scale applications

Today marks the 10-year anniversary of Amazon’s Dynamo whitepaper, a milestone that made me reflect on how much innovation has occurred in databases over the last decade, and a good reminder of why taking a customer-obsessed approach to solving hard problems can have a lasting impact beyond your original expectations.

“Dynamo: Amazon’s Highly Available Key-value Store” received the ACM SIGOPS 2017 Hall of Fame Award

http://www.allthingsdistributed.com/2017/10/a-decade-of-dynamo.html


Introducing Cloud Firestore: Our New Document Database for Apps

Today we’re excited to launch Cloud Firestore, a fully-managed NoSQL document database for mobile and web app development. It’s designed to easily store and sync app data at global scale, and it’s now available in beta.

Key features of Cloud Firestore include:

  • Documents and collections with powerful querying
  • iOS, Android, and Web SDKs with offline data access
  • Real-time data synchronization
  • Automatic, multi-region data replication with strong consistency
  • Node, Python, Go, and Java server SDKs

And of course, we’ve aimed for the simplicity and ease-of-use that is always top priority for Firebase, while still making sure that Cloud Firestore can scale to power even the largest apps.

https://firebase.googleblog.com/2017/10/introducing-cloud-firestore.html
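
A minimal sketch of the document-and-collection model using the google-cloud-firestore Python client (the collection, fields, and project setup below are illustrative assumptions, not taken from the announcement):

```python
# pip install google-cloud-firestore
from google.cloud import firestore

# Assumes GOOGLE_APPLICATION_CREDENTIALS points at a project with Firestore enabled.
db = firestore.Client()

# Documents live in collections; writing a document creates both implicitly.
db.collection("users").document("ada").set({
    "first": "Ada",
    "last": "Lovelace",
    "born": 1815,
})

# Query a collection instead of fetching documents one by one.
for doc in db.collection("users").where("born", "<", 1900).stream():
    print(doc.id, doc.to_dict())
```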

Pandas and Python Real World Project (GPS data)

“To celebrate my sub-30-minute finish on the Manitou Springs Incline, I figured I’d drop some Python and pandas knowledge while analyzing the data using BeautifulSoup, pandas, python-dateutil, the googlemaps and geopy packages, and the standard library.

I try to look at things in a non-trivial way that reflects the actual problems you encounter when analyzing real-world data. The data file is here: https://gist.github.com/jrjames83/4de9d124e5f43a61be9cb2a… come code along!”

https://www.youtube.com/playlist?list=PLImyDqSBQbdmicMPRW0Yo5QHbeOIwC765
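
In the same spirit, here is a tiny pandas sketch of working with timestamped GPS samples (the column names and values are made up, not the gist’s actual data):

```python
import pandas as pd

# Hypothetical GPS samples; the real project scrapes and parses its data
# with BeautifulSoup and dateutil before loading it into pandas.
raw = pd.DataFrame({
    "timestamp": ["2017-08-12 07:01:00", "2017-08-12 07:16:00", "2017-08-12 07:29:30"],
    "elevation_ft": [6500, 7300, 8590],
})

df = raw.assign(timestamp=pd.to_datetime(raw["timestamp"])).set_index("timestamp")

# Elapsed time since the start, and climb rate between consecutive samples.
df["elapsed_min"] = (df.index - df.index[0]).total_seconds() / 60
df["climb_ft_per_min"] = df["elevation_ft"].diff() / df["elapsed_min"].diff()
print(df)
```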

Use all the Databases

Ever wanted to use a few different databases to build your app? Different types of databases are meant for different purposes, so it often makes sense to combine them. You might be hesitant because of the maintenance and coding complexity, but it can be easy if you combine Compose and GraphQL: instead of writing a number of complex REST endpoints, each querying multiple databases, you set up a single GraphQL endpoint that serves whatever data the client asks for through your simple data-fetching functions.

This tutorial is meant for anyone who provides or fetches data, whether it’s a backend dev writing an API (in any language) or a frontend web or mobile dev fetching data from the server. We’ll learn about the GraphQL specification, set up a GraphQL server, and fetch data from five different data sources. The code is in JavaScript, but you’ll still get a good idea of GraphQL without knowing the language.

https://www.compose.com/articles/use-all-the-databases-part-1/
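
The tutorial’s code is JavaScript; as a rough Python analogue of the same idea (using graphene, with the per-database fetch functions stubbed out as assumptions), a single endpoint fronting two stores might look like this:

```python
# pip install graphene
import graphene

# Stand-ins for the data-fetching functions; each could talk to a different database.
def fetch_user_from_postgres(user_id):
    return {"id": user_id, "name": "Ada"}

def fetch_recent_posts_from_mongo(user_id):
    return ["Notes on the Analytical Engine"]

class User(graphene.ObjectType):
    id = graphene.ID()
    name = graphene.String()
    recent_posts = graphene.List(graphene.String)

    def resolve_recent_posts(self, info):
        # The second backend is queried only if the client asked for this field.
        return fetch_recent_posts_from_mongo(self.id)

class Query(graphene.ObjectType):
    user = graphene.Field(User, id=graphene.ID(required=True))

    def resolve_user(self, info, id):
        row = fetch_user_from_postgres(id)
        return User(id=row["id"], name=row["name"])

schema = graphene.Schema(query=Query)
result = schema.execute('{ user(id: "1") { name recentPosts } }')
print(result.data)
```

The client asks for exactly the fields it needs in one request, and each resolver decides which store to hit; that is the pattern the article builds out across its five data sources.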

Amazon DynamoDB Accelerator (DAX) – In-Memory Caching for Read-Intensive Workloads

I’m fairly sure that you already know about Amazon DynamoDB: a managed NoSQL database that scales to accommodate as much table space, read capacity, and write capacity as you need. With response times measured in single-digit milliseconds, our customers are using DynamoDB for many types of applications including adtech, IoT, gaming, media, online learning, travel, e-commerce, and finance. Some of these customers store more than 100 terabytes in a single DynamoDB table and make millions of read or write requests per second. The Amazon retail site relies on DynamoDB to withstand the traffic surges associated with brief, high-intensity events such as Black Friday, Cyber Monday, and Prime Day.

While DynamoDB’s ability to deliver fast, consistent performance benefits just about any application and workload, there’s always room to do even better. The business value of some workloads (gaming and adtech come to mind, but there are many others) is driven by low-latency, high-performance database reads. The ability to pull data from DynamoDB as quickly as possible leads to faster, more responsive games and to ads that drive higher click-through rates.

https://aws.amazon.com/blogs/aws/amazon-dynamodb-accelerator-dax-in-memory-caching-for-read-intensive-workloads
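
DAX is a read-through/write-through caching layer that is API-compatible with DynamoDB, so application code keeps making ordinary data-plane calls. As a rough illustration of the read-through pattern it automates, here is a hand-rolled sketch with boto3 (the table name, key, and TTL are made up; this is not the DAX client itself):

```python
import time
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("GameScores")  # hypothetical table

_cache = {}          # item cache keyed by partition key
_TTL_SECONDS = 300   # DAX's item cache is likewise TTL-based

def get_score(player_id):
    """Read-through: serve from memory while fresh, otherwise fall back to DynamoDB."""
    hit = _cache.get(player_id)
    if hit and time.time() - hit[0] < _TTL_SECONDS:
        return hit[1]

    response = table.get_item(Key={"PlayerId": player_id})
    item = response.get("Item")  # may be None if the key does not exist
    _cache[player_id] = (time.time(), item)
    return item
```

With DAX the cluster does this for you behind a drop-in client; cached reads are eventually consistent, while strongly consistent reads still go to the table.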

Cache eviction: when are randomized algorithms better than LRU?

Once upon a time, my computer architecture professor mentioned that using a random eviction policy for caches really isn’t so bad. That random eviction isn’t bad can be surprising: if your cache fills up and you have to get rid of something, evicting the least recently used (LRU) entry is the obvious choice, since you’re more likely to use something again if you’ve used it recently. If you have a tight loop, LRU is going to be perfect as long as the loop fits in cache, but it’s going to cause a miss every time if the loop doesn’t fit. A random eviction policy degrades gracefully as the loop gets too big.

In practice, on real workloads, random tends to do worse than other algorithms. But what if we take two random choices and just use LRU between those two choices?

http://danluu.com/2choices-eviction/
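
As a quick way to see the effect the post describes, here is a small simulation sketch comparing LRU, pure random, and two-random-choice eviction on a loop that is slightly bigger than the cache (the trace and sizes are made up):

```python
import random
from collections import OrderedDict

def miss_rate(trace, size, policy):
    """Simulate one eviction policy over an access trace and return its miss rate."""
    cache = OrderedDict()   # least recently used entries sit at the front
    misses = 0
    for key in trace:
        if key in cache:
            cache.move_to_end(key)              # refresh recency on a hit
            continue
        misses += 1
        if len(cache) >= size:
            if policy == "lru":
                cache.popitem(last=False)       # evict the least recently used entry
            elif policy == "random":
                cache.pop(random.choice(list(cache)))
            else:  # "2-random": of two random candidates, evict the one used longer ago
                a, b = random.sample(list(cache), 2)
                keys = list(cache)
                cache.pop(a if keys.index(a) < keys.index(b) else b)
        cache[key] = None
    return misses / len(trace)

random.seed(0)
trace = list(range(40)) * 100   # a tight loop slightly larger than the cache
for policy in ("lru", "random", "2-random"):
    print(policy, round(miss_rate(trace, size=32, policy=policy), 3))
```

On a looping trace like this, LRU misses on every access while both randomized policies degrade much more gracefully; the post works through when the two-choice variant actually wins on real workloads.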