[PATCH] D38191: [NVPTX] added match.{any, all}.sync instructions, intrinsics & builtins.

Artem Belevich via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Sep 25 11:30:54 PDT 2017


tra added inline comments.


================
Comment at: clang/lib/CodeGen/CGBuiltin.cpp:9603
+    Value *Pred = Builder.CreateSExt(Builder.CreateExtractValue(ResultPair, 1),
+                                     PredOutPtr.getElementType());
+    Builder.CreateStore(Pred, PredOutPtr);
----------------
jlebar wrote:
> Doing sext i1 -> i32 is going to cause us to store 0 or -1 in the pred (right?).  The CUDA docs say
> 
> > Predicate pred is set to true if all threads in mask have the same value of value; otherwise the predicate is set to false.
> 
> I'd guess that "true" probably means 1 (i.e. zext i1 -> i32) rather than -1, although, I guess we have to check.
Right. It should've been ZExt: sign-extending an i1 'true' gives -1, while zero-extending gives 1. In similar places CUDA headers use "selp %r1, 1, 0, %p".
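
To make the difference concrete, here is a minimal standalone sketch in plain C++ (no LLVM needed). The helper names sext_i1_to_i32 and zext_i1_to_i32 are hypothetical and only model what the two IR casts do to a 1-bit predicate; they are not part of the patch or of any LLVM API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper modelling LLVM's `sext i1 %p to i32`: the single bit
// is copied into every bit of the result, so true becomes 0xFFFFFFFF (-1).
constexpr int32_t sext_i1_to_i32(bool p) { return p ? -1 : 0; }

// Hypothetical helper modelling LLVM's `zext i1 %p to i32`: the upper bits
// are filled with zeros, so true becomes 1.
constexpr int32_t zext_i1_to_i32(bool p) { return p ? 1 : 0; }

// Both casts agree on false; they differ only on true.
static_assert(sext_i1_to_i32(false) == 0, "sext of false is 0");
static_assert(zext_i1_to_i32(false) == 0, "zext of false is 0");
static_assert(sext_i1_to_i32(true) == -1, "sext of true is -1");
static_assert(zext_i1_to_i32(true) == 1, "zext of true is 1");
```

Assuming "true" in the CUDA docs means 1, the zext behaviour is the one the stored predicate needs, and it matches what "selp %r1, 1, 0, %p" produces.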


https://reviews.llvm.org/D38191