thoughts…

rants and bookmarks about programming stuff…


Common Pitfalls in Writing Lock-Free Algorithms

“Formally, a multi-threaded algorithm is considered to be lock-free if there is an upper bound on the total number of steps it must perform between successive completions of operations.
The statement is simple, but its implications are deep – at every stage, a lock-free algorithm guarantees forward progress in some finite number of operations. Deadlock is impossible…”

http://developers.memsql.com/blog/common-pitfalls-in-writing-lock-free-algorithms/

 


This Hash Table Is Faster Than a Judy Array

“In game development, we use associative maps for many things. Dynamic loading, object reflection, rendering, shader management.

A lot of them fit the following usage pattern:

  • The keys and values have simple integer or pointer types.
  • The keys are effectively random.
  • The basic operation is lookup-or-insert: Look for an existing item first, and add one if not found. Usually, the item is found.
  • Deletes happen infrequently, and when they do, they tend to happen in bulk…”

http://preshing.com/20130107/this-hash-table-is-faster-than-a-judy-array


Five Myths about Hash Tables

“A hash table is data structure that is used to search for exact matches to a search key. For searching, they work like this:

  1. Take a search key (example: the word “cat”)
  2. Hash the search key: pass the key to a function that returns an integer value (example: the hash function returns 47 when the input key is “cat”)
  3. Use the integer value to inspect a slot to see if it contains the search key (example: look in an array at element 47)
  4. If the key matches, great, you’ve found what you’re looking for and you can retrieve whatever metadata you’re looking for (example: in array element 47, we’re found the word “cat”. We retrieve metadata that tells us “cat” is an “animal”)
  5. If the key doesn’t match and the slot contains something besides the key, carry out a secondary search process to make sure the search key really isn’t stored in the hash table; the secondary search process varies depending on the type of hashing algorithm used (example: in array element 47, we found the word “dog”. So, let’s look in slot 48 and see if “cat” is there. Slot 48 is empty, so “cat” isn’t in the hash table. This is called linear probing, it’s one kind of secondary search)…”

 

http://hughewilliams.com/2012/10/01/five-myths-about-hash-tables/


Leave a comment

Engineering a List Merge Sort

“Back in November 2011, Takeru Ohta submitted a very nice patch to replace our (SBCL’s) in-place stable merge sort on linked lists with a simpler, much more efficient implementation. It took me until last May to whip myself into running a bunch of tests to estimate the performance improvements and make sure there weren’t any serious regression, and finally commit the patch. This post summarises what happened as I tried to find further improvements. The result is an implementation that’s linear-time on nearly sorted or reverse-sorted lists, around 4 times as fast on slightly shuffled lists, and up to 30% faster on completely shuffled lists, thanks to design choices guided by statistically significant effects on performance (… on one computer, my dual 2.8 GHz X5660)…”

http://www.pvk.ca/Blog/2012/08/13/engineering-a-list-merge-sort/


Leave a comment

Binary Search Is a Pathological Case for Caches

“Programmers tend to like round numbers, i.e. powers of two. So do hardware designers. Sadly, this shared value doesn’t always work to our advantage. One common issue is that of cache line aliasing induced by alignment.

Binary search suffers from a related ailment when executed on medium or large vectors of almost power-of-two size (in bytes), but it can be cured. Once that is done, searching a sorted vector can be as fast as searches with a well-tuned hash table, for a few realistic access patterns.

The task is interesting to me because I regularly work with static, or almost static, sets: sets for which there’s a majority of lookups, while updates are either rare or batchable. For such sets, the improved performance of explicit balanced search trees on insertions is rarely worth the slowdown on lookups, nor the additional space usage. Replacing binary search with slightly off-center binary or quaternary (four-way) searches only adds a bit more code to provide even quicker, more consistent lookup times…”

http://www.pvk.ca/Blog/2012/07/30/binary-search-is-a-pathological-case-for-caches/


Leave a comment

Dissecting SSL handshake

“Not everyone knows that the SSL handshake is not encrypted. When you think about it – there isn’t other way, before the keys are exchanged the communication must be unencrypted. But I doubt many people think about it.

Not only the SSL handshake is plain-text, but also it contains rather interesting data. I decided to find out how much information can be retrieved from it…”

https://idea.popcount.org/2012-06-16-dissecting-ssl-handshake/


Leave a comment

Plain English explanation of Big O

“The simplest definition I can give for Big-O notation is this:

Big-O notation is a relative representation of the complexity of an algorithm.

There are some important and deliberately chosen words in that sentence:

  • relative: you can only compare apples to apples. You can’t compare an algorithm to do arithmetic multiplication to an algorithm that sorts a list of integers. But two algorithms that do arithmetic operations (one multiplication, one addition) will tell you something meaningful;
  • representation: Big-O (in its simplest form) reduces the comparison between algorithms to a single variable. That variable is chosen based on observations or assumptions. For example, sorting algorithms are typically compared based on comparison operations (comparing two nodes to determine their relative ordering). This assumes that comparison is expensive. But what if comparison is cheap but swapping is expensive? It changes the comparison; and
  • complexity: if it takes me one second to sort 10,000 elements how long will it take me to sort one million? Complexity in this instance is a relative measure to something else.

Come back and reread the above when you’ve read the rest…”

http://stackoverflow.com/a/487278/379580


Leave a comment

An Introduction to Lock-Free Programming

“Lock-free programming is a challenge, not just because of the complexity of the task itself, but because of how difficult it can be to penetrate the subject in the first place.

I was fortunate in that my first introduction to lock-free (also known as lockless) programming was Bruce Dawson’s excellent and comprehensive white paper, Lockless Programming Considerations. And like many, I’ve had the occasion to put Bruce’s advice into practice developing and debugging lock-free code on the Xbox 360…”

http://preshing.com/20120612/an-introduction-to-lock-free-programming


Leave a comment

Why you should never use hash functions for message authentication

“All the information in this post I learned from Stanford’s introduction to cryptography, which is an excellent six-week primer on the topic. It’s a little mathsy but not nearly as much as it could be, meaning it’s fairly accessible and focuses on practical problems and intuition, and is really useful to anyone handling user data for a living. I’d go so far as to say it should be required reading for any web developer. It starts up again on Monday June 11: go sign up and make sure you’re not the next victim of the password leaks we’ve seen this week…”

blog.jcoglan.com/2012/06/09/why-you-should-never-use-hash-functions-for-message-authentication/

Follow

Get every new post delivered to your Inbox.

Join 511 other followers