[PATCH] D28909: [InstCombineCalls] Unfold element atomic memcpy instruction
Philip Reames via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jan 23 20:05:13 PST 2017
reames requested changes to this revision.
reames added inline comments.
This revision now requires changes to proceed.
================
Comment at: include/llvm/IR/IntrinsicInst.h:155
+ /// This class handles atomic memcpy intrinsic
+ /// TODO: Integrate this class into MemIntrinsic hierarchy.
----------------
minor: instead of "class handles" use "represents"
================
Comment at: lib/Transforms/InstCombine/InstCombineCalls.cpp:63
+static cl::opt<uint64_t> UnfoldElementAtomicMemcpyThreshold(
+ "element-atomic-memcpy-threshold",
----------------
Units? The name should ideally reflect.
================
Comment at: lib/Transforms/InstCombine/InstCombineCalls.cpp:119
+/// Return true if changes were made and false otherwise.
+/// All newly inserted instructions will be added to the worklist.
+bool
----------------
Please use the same convention used for the normal memcpy. Lower the atomic memcopy, set the length of the original to zero, then remove the zero length one. This also requires you to implement a currently missed optimization for zero length copies.
================
Comment at: lib/Transforms/InstCombine/InstCombineCalls.cpp:131
+ uint64_t NumElements = NumElementsCI->getZExtValue();
+ if (NumElements >= UnfoldElementAtomicMemcpyThreshold)
+ return false;
----------------
Why 16 elements as opposed to 8 bytes? (i.e. what the non-atomic form uses)
================
Comment at: lib/Transforms/InstCombine/InstCombineCalls.cpp:167
+ Load->setOrdering(AtomicOrdering::Unordered);
+ Load->setAlignment(AMI->getSrcAlignment());
+ Load->setDebugLoc(AMI->getDebugLoc());
----------------
The source alignment may not hold for all elements. i.e. your alignment might be 1024 with your element size being 8.
https://reviews.llvm.org/D28909
More information about the llvm-commits
mailing list