[PATCH] D28909: [InstCombineCalls] Unfold element atomic memcpy instruction
Igor Laevsky via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jan 24 09:25:08 PST 2017
igor-laevsky added inline comments.
================
Comment at: lib/Transforms/InstCombine/InstCombineCalls.cpp:131
+ uint64_t NumElements = NumElementsCI->getZExtValue();
+ if (NumElements >= UnfoldElementAtomicMemcpyThreshold)
+ return false;
----------------
reames wrote:
> Why 16 elements as opposed to 8 bytes? (i.e. what the non-atomic form uses)
This transformation is not quite the same as the non-atomic unfolding. The non-atomic version only unfolds into a single load/store pair. Here we can generate more operations. The number 16 means that no more than 16 load/store pairs will be generated. I picked it as an arbitrary small integer, but we can easily tune it afterwards.
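
To make the bound concrete, here is a minimal standalone sketch of the check quoted above. It does not use the LLVM API: `kUnfoldThreshold` and `loadStorePairsToEmit` are hypothetical names, and the value 16 is assumed from this discussion.

```cpp
#include <cstdint>

// Hypothetical stand-in for UnfoldElementAtomicMemcpyThreshold.
// The value 16 is an assumption taken from the review discussion.
constexpr uint64_t kUnfoldThreshold = 16;

// Returns how many load/store pairs the unfolding would emit for a copy of
// NumElements elements, or 0 when the call is left untouched.
constexpr uint64_t loadStorePairsToEmit(uint64_t NumElements) {
  // Same comparison as in the quoted hunk: at or above the threshold we bail.
  if (NumElements >= kUnfoldThreshold)
    return 0;
  // One unordered atomic load plus one store per element.
  return NumElements;
}
```

Because the comparison is `>=`, a copy of exactly 16 elements would not be unfolded under this assumption, so the sketch never emits more than 15 pairs.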
================
Comment at: lib/Transforms/InstCombine/InstCombineCalls.cpp:167
+ Load->setOrdering(AtomicOrdering::Unordered);
+ Load->setAlignment(AMI->getSrcAlignment());
+ Load->setDebugLoc(AMI->getDebugLoc());
----------------
reames wrote:
> The source alignment may not hold for all elements. i.e. your alignment might be 1024 with your element size being 8.
Yes, thanks for the catch!
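
For illustration only, a minimal standalone sketch of the issue, not the change that went into the patch. It assumes element `Index` lives at byte offset `Index * ElementSize` from the base pointer. `minAlign` and `elementAlignment` are hypothetical helpers written for this sketch; `minAlign` follows the same idea as LLVM's `MinAlign`.

```cpp
#include <cstdint>

// Largest power of two that divides both A and B
// (the lowest set bit of A | B).
constexpr uint64_t minAlign(uint64_t A, uint64_t B) {
  return (A | B) & (1 + ~(A | B));
}

// Alignment that can safely be assumed for the access to element Index,
// given the alignment of the base pointer and the element size in bytes.
constexpr uint64_t elementAlignment(uint64_t BaseAlign, uint64_t ElementSize,
                                    uint64_t Index) {
  return minAlign(BaseAlign, Index * ElementSize);
}
```

With a base alignment of 1024 and an element size of 8, only element 0 (and every 128th element after it) keeps the full 1024-byte alignment. Element 1 sits at offset 8 and can only be assumed to be 8-byte aligned.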
https://reviews.llvm.org/D28909