Cuckoo filters and their analysis

Michael Mitzenmacher has described cuckoo filters in an earlier blog post (and, of course, in the published paper about them). The basic idea is to use a cuckoo hash table, cut down in size by storing only a short fingerprint of each key rather than a whole key-value pair. As in a normal cuckoo hash table, keys (or rather their fingerprints) get moved around to make room for other keys, and that leads to a small complication: when you’re moving a fingerprint, you don’t know which key it came from, so the location to move it to must be computable from only its current location and its value. More specifically, the other location for any fingerprint is the xor of its current location with a hash of its value.
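To make the xor trick concrete, here is a minimal sketch of how the two candidate locations might be computed. It is only an illustration, not the code from the paper or from any actual implementation: the function names, the 8-bit fingerprint, the 256-bucket table, and the use of SHA-256 as a stand-in hash function are all assumptions of mine. The one thing the sketch does rely on is that the number of buckets is a power of two, so that the xor of two valid locations is again a valid location.

```python
import hashlib

NUM_BUCKETS = 1 << 8  # assumed table size; a power of two keeps xor results in range
FP_BITS = 8           # assumed fingerprint length in bits


def _hash(data: bytes) -> int:
    """Stand-in hash function (an assumption): first 8 bytes of SHA-256."""
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def fingerprint(key: str) -> int:
    """Short fingerprint stored in the table in place of the key."""
    return _hash(b"fp:" + key.encode()) % (1 << FP_BITS)


def primary_bucket(key: str) -> int:
    """First candidate location; this one needs the full key."""
    return _hash(b"loc:" + key.encode()) % NUM_BUCKETS


def other_bucket(bucket: int, fp: int) -> int:
    """Other candidate location, computed from the current location and
    the fingerprint alone, with no access to the original key."""
    return bucket ^ (_hash(b"alt:" + fp.to_bytes(2, "big")) % NUM_BUCKETS)
```

Because xor is its own inverse, applying other_bucket twice with the same fingerprint returns to the starting location. So a fingerprint evicted from either of its two locations can always find its way to the other one, whichever of the two it is currently sitting in.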

Although cuckoo filters have been implemented (see first link) and work well in practice, one drawback is that we didn’t know whether they also work well in theory. Conversely, an earlier data structure of Pagh, Pagh, and Rao (SODA 2005) has all the same advantages over Bloom filters that cuckoo filters have, but for it, as far as I am aware, there was no implementation, only theory. In contrast, Bloom filters work well both in practice and in theory: for them, there is no major gap between the two.