Cassandra anti-patterns: Queues and queue-like datasets

Deletes in Cassandra

Cassandra uses a log-structured storage engine. Because of this, deletes do not remove the rows and columns immediately and in-place. Instead, Cassandra writes a special marker, called a tombstone, indicating that a row, column, or range of columns was deleted. These tombstones are kept for at least the period of time defined by the gc_grace_seconds per-table setting. Only then a tombstone can be permanently discarded by compaction.

This scheme allows for very fast deletes (and writes in general), but it’s not free: aside from the obvious RAM/disk overhead of tombstones, you might have to pay a certain price when reading data back if you haven’t modelled your data well.

Specifically, tombstones will bite you if you do lots of deletes (especially column-level deletes) and later perform slice queries on rows with a lot of tombstones.

Symptoms of a wrong data model

To illustrate this scenario, let’s consider the most extreme case – using Cassandra as a durable queue, a known anti-pattern, e.g.

CREATE TABLE queues ( name text, enqueued_at timeuuid, payload blob, PRIMARY KEY (name, enqueued_at) );

Having enqueued 10000 10-byte messages and then dequeued 9999 of them, one by one, let’s peek at the last remaining message using cqlsh with TRACING ON:

SELECT enqueued_at, payload FROM queues WHERE name = 'queue-1' LIMIT 1;