[PATCH] [Core] Update references in parallel
chisophugis at gmail.com
Thu Mar 19 18:33:42 PDT 2015
I really don't like this approach. It has large memory contention in the inner loop even though this computation doesn't inherently have any. Also it is using an extremely cache-unfriendly design, along with putting a `new` into the inner loop, causing contention inside the memory allocator besides the usual allocation slowness.
We could just store a bit inside the Atom that marks it as dead. That avoids a bunch of hash table lookups in the _deadAtoms map anyway. We have a ton of space in the Atom base class in the _definition field, which is only using 2 bits of a pointer-aligned field (aligned to the vptr). We already have the _definition field loaded (and in cache) due to the dyn_cast above (which checks _definition). So it's just an OR and a store to mark it: no extra loads, no extra cache misses, no synchronization.
More information about the llvm-commits