You Can’t Always Hash Pointers in C

Occasionally I’ve needed to key a hash table with C pointers. I don’t care about the contents of the object itself — especially if it might change — just its pointer identity. For example, suppose I’m using null-terminated strings as keys and I know these strings will always be interned in a common table. These strings can be compared directly by their pointer values (str_a == str_b) rather than, more slowly, by their contents (strcmp(str_a, str_b) == 0). The intern table ensures that these expressions both have the same result.

As a key in a hash table, or other efficient map/dictionary data structure, I’ll need to turn pointers into numerical values. However, C pointers aren’t integers. Following certain rules it’s permitted to cast pointers to integers and back, but doing so will reduce the program’s portability. The most important consideration is that the integer form isn’t guaranteed to have any meaningful or stable value. In other words, even in a conforming implementation, the same pointer might cast to two different integer values. This would break any algorithm that isn’t keenly aware of the implementation details.

http://nullprogram.com/blog/2016/05/30/

Advertisements