The hazard pointers implementation was bit of frivolous with memory
usage allocating memory based on maximum constants rather than on the
usage.
Make the retired list bit use exactly the memory needed for specified
number of hazard pointers. This reduced the memory used by hazard
pointers to one quarter in our specific case because we only use single
HP in the queue implementation (as opposed to allocating memory for
HP_MAX_HPS = 4).
Previously, the alignment to prevent false sharing was double the
cacheline size. This was copied from the ConcurrencyFreaks
implementation, but one cacheline size is enough to prevent false
sharing, so we are using this now to save few bits of memory.
The top level hazard pointers and retired list arrays are now not
aligned to the cacheline size - they are read-only for the whole
life-time of the isc_hp object. Only hp (hazard pointer) and
rl (retired list) array members are allocated aligned to the cacheline
size to avoid false sharing between threads.
Cleanup HP_MAX_HPS and HP_THRESHOLD_R constants from the paper, because
we don't use them in the code. HP_THRESHOLD_R was 0, so the check
whether the retired list size was smaller than the value was basically a
dead code.