[PATCH] D11220: Correct lowering of memmove in NVPTX
Eli Bendersky
eliben at google.com
Wed Jul 15 12:33:10 PDT 2015
eliben marked 4 inline comments as done.
eliben added a comment.
In http://reviews.llvm.org/D11220#205531, @okwank wrote:
> Hi,
>
> My name is Okwan Kwon, and I have two comments.
>
> 1. Use getRawDest() and getRawSource() instead. getDest() and getSource() will strip off any casts to give the original pointer. When the original pointer type is not i8 *, then it will get an assertion.
Thanks, done. Also added test with casts.
> 2. Can you find a way to make use of the alignment information from the memmove intrinsic? It might be possible to generate more efficient load/store than byte copying.
There's a TODO in the code now about this. I'll keep it as a TODO for now
================
Comment at: lib/Target/NVPTX/NVPTXLowerAggrCopies.cpp:125
@@ +124,3 @@
+// }
+void convertMemMoveToLoop(Instruction *splitAt, Value *srcAddr, Value *dstAddr,
+ Value *len, bool srcVolatile, bool dstVolatile,
----------------
jingyue wrote:
> Argument names start with upper case.
True. For now I'm keeping consistent style with other functions in this file - which isn't great in terms of LLVM style. I will do a later refactoring to bring everything closer to the LLVM style
================
Comment at: lib/Target/NVPTX/NVPTXLowerAggrCopies.cpp:168
@@ +167,3 @@
+ Value *Element = LoopBuilder.CreateLoad(
+ LoopBuilder.CreateInBoundsGEP(srcAddr, {IndexPtr}), "element");
+ LoopBuilder.CreateStore(Element,
----------------
jingyue wrote:
> CreateInBoundsGEP(srcAddr, IndexPtr) should work now.
Nice!
================
Comment at: lib/Target/NVPTX/NVPTXTargetMachine.cpp:69
@@ -67,2 +68,3 @@
+ PassRegistry &PR = *PassRegistry::getPassRegistry();
initializeNVVMReflectPass(*PassRegistry::getPassRegistry());
initializeGenericToNVVMPass(*PassRegistry::getPassRegistry());
----------------
jingyue wrote:
> Do you intend to use `PR` instead?
Good catch, thanks.
================
Comment at: test/CodeGen/NVPTX/lower-aggr-copies.ll:21
@@ +20,3 @@
+; IR: loadstoreloop:
+; IR: [[LOADPTR:%[0-9]+]] = getelementptr i8, i8* %src, i64
+; IR-NEXT: [[VAL:%[0-9]+]] = load i8, i8* [[LOADPTR]]
----------------
jingyue wrote:
> How come we don't see inbounds here? You created them using `CreateInBoundsGEP`.
So no, I didn't. This is still memcpy, not memmove. It didn't use inbounds. I will refactor it to use it in the future, because it makes sense here I think
Repository:
rL LLVM
http://reviews.llvm.org/D11220
More information about the llvm-commits
mailing list