[PATCH] D28909: [InstCombineCalls] Unfold element atomic memcpy instruction

Tue Jan 24 09:25:08 PST 2017

igor-laevsky added inline comments.

================
Comment at: lib/Transforms/InstCombine/InstCombineCalls.cpp:131
+  uint64_t NumElements = NumElementsCI->getZExtValue();
+  if (NumElements >= UnfoldElementAtomicMemcpyThreshold)
+    return false;
----------------
reames wrote:
> Why 16 elements as opposed to 8 bytes?  (i.e. what the non-atomic form uses)
This transformation is not quite the same as the non-atomic unfolding. The non-atomic version only unfolds into a single load/store pair, whereas here we can generate several operations. A threshold of 16 means that no more than 16 load/store pairs will be generated. I picked it as an arbitrary small integer, but we can easily tune it later.
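
To make the cost concrete, here is a minimal standalone sketch of the threshold check. It is not the patch code: `kUnfoldThreshold`, `shouldUnfold` and `numMemoryOps` are hypothetical names, and 16 is simply the value discussed in this thread. Note that with the `>=` comparison shown in the hunk above, a count equal to the threshold is rejected.

```cpp
#include <cstdint>

// Hypothetical stand-in for UnfoldElementAtomicMemcpyThreshold;
// 16 is the value discussed in this thread.
constexpr uint64_t kUnfoldThreshold = 16;

// Mirrors the check in the hunk above: give up when
// NumElements >= Threshold, otherwise unfold.
constexpr bool shouldUnfold(uint64_t NumElements,
                            uint64_t Threshold = kUnfoldThreshold) {
  return NumElements < Threshold;
}

// Each unfolded element becomes one load and one store.
constexpr uint64_t numMemoryOps(uint64_t NumElements) {
  return 2 * NumElements;
}

static_assert(shouldUnfold(1), "a single element is unfolded");
static_assert(shouldUnfold(15), "15 elements are below the threshold");
static_assert(!shouldUnfold(16), "a count equal to the threshold is rejected");
static_assert(numMemoryOps(15) == 30, "15 load/store pairs are 30 operations");
```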

================
Comment at: lib/Transforms/InstCombine/InstCombineCalls.cpp:167
+    Load->setOrdering(AtomicOrdering::Unordered);
+    Load->setAlignment(AMI->getSrcAlignment());
+    Load->setDebugLoc(AMI->getDebugLoc());
----------------
reames wrote:
> The source alignment may not hold for all elements.  i.e. your alignment might be 1024 with your element size being 8.
Yes, thanks for the catch!
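
To illustrate the problem: element i sits at byte offset i * ElementSize from the base pointer, so the alignment that can safely be assumed for it is the largest power of two dividing both the base alignment and that offset. Below is a minimal standalone sketch of that arithmetic. The helper names are hypothetical and are not taken from the patch, base alignments are assumed to be non-zero powers of two, and the actual fix may well be different (for instance, clamping every access to one common alignment).

```cpp
#include <cstdint>

// Largest power of two dividing both A and B: the lowest set bit of (A | B).
// An offset of 0 imposes no constraint, so minAlign(A, 0) == A when A is a
// power of two.
constexpr uint64_t minAlign(uint64_t A, uint64_t B) {
  return (A | B) & (~(A | B) + 1);
}

// Alignment that can be assumed for element Index, given the alignment of
// the base pointer and the element size in bytes.
constexpr uint64_t elementAlign(uint64_t BaseAlign, uint64_t ElementSize,
                                uint64_t Index) {
  return minAlign(BaseAlign, Index * ElementSize);
}

// Conservative single alignment that holds for every element.
constexpr uint64_t commonAlign(uint64_t BaseAlign, uint64_t ElementSize) {
  return minAlign(BaseAlign, ElementSize);
}

// The case from the review comment: alignment 1024, element size 8.
static_assert(elementAlign(1024, 8, 0) == 1024, "first element keeps 1024");
static_assert(elementAlign(1024, 8, 1) == 8, "second element is only 8-aligned");
static_assert(commonAlign(1024, 8) == 8, "8 is safe for all elements");
```

Only the first element (and every 128th after it) keeps the full 1024-byte alignment here, so applying the source alignment to every load would overstate it.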

https://reviews.llvm.org/D28909