[PATCH] D104049: [AMDGPU] [CodeGen] Fold negate llvm.amdgcn.class into test mask

Thu Jun 17 01:38:34 PDT 2021

foad added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp:23
 #include "llvm/IR/Dominators.h"
+#include "llvm/IR/IRBuilder.h"
 #include "llvm/IR/InstVisitor.h"
----------------
You don't need this include because you don't use the IRBuilder for anything.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp:26
 #include "llvm/IR/IntrinsicsAMDGPU.h"
-#include "llvm/IR/IRBuilder.h"
+#include "llvm/IR/PatternMatch.h"
 #include "llvm/InitializePasses.h"
----------------
Don't need this unless you actually do some pattern matching

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp:29
 #include "llvm/Pass.h"
+#include "llvm/Support/Casting.h"
 #include "llvm/Support/KnownBits.h"
----------------
Don't need this; it's already pulled in transitively by the other LLVM headers.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp:828
+  // "Not" the second argument of the intrinsic call
+  IRBuilder<> Builder(IntrinsicCall);
+  ConstantInt *Arg = dyn_cast<ConstantInt>(IntrinsicCall->getOperand(1));
----------------
Don't need the IRBuilder; you don't use it for anything.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp:832-834
+  int Mask = (1ull << (64 - __builtin_clz(Arg->getZExtValue()))) - 1;
+  IntrinsicCall->setOperand(
+      1, ConstantInt::get(Arg->getType(), ~Arg->getZExtValue() & Mask));
----------------
This is wrong. If the value was 4, you would xor it with 7, giving 3. But you need to flip all the bits that amdgcn_class cares about, i.e. the 10 low-order bits. You should either xor with a fixed value of 0x3ff, or perhaps move this enum from AMDGPUInstCombineIntrinsic.cpp to a common header (maybe SIDefines.h?) and add an "ALL" value to it:
```
  case Intrinsic::amdgcn_class: {
    enum {
      S_NAN = 1 << 0,       // Signaling NaN
      Q_NAN = 1 << 1,       // Quiet NaN
      N_INFINITY = 1 << 2,  // Negative infinity
      N_NORMAL = 1 << 3,    // Negative normal
      N_SUBNORMAL = 1 << 4, // Negative subnormal
      N_ZERO = 1 << 5,      // Negative zero
      P_ZERO = 1 << 6,      // Positive zero
      P_SUBNORMAL = 1 << 7, // Positive subnormal
      P_NORMAL = 1 << 8,    // Positive normal
      P_INFINITY = 1 << 9   // Positive infinity
    };
```
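
To make the difference concrete, here is a minimal standalone sketch (plain C++, no LLVM dependencies) of the two ways of negating the test mask. The names FP_CLASS_ALL, invertClassMask, invertUpToHighestSetBit and checkExamples are hypothetical and exist only for this illustration; the narrow-mask helper models the behaviour described above (4 xor 7 giving 3), not necessarily what the patch computes on every input.

```cpp
#include <cassert>
#include <cstdint>

// Bit layout of the amdgcn_class test mask, taken from the enum quoted above.
// FP_CLASS_ALL is a hypothetical name for the suggested "ALL" value.
enum : uint32_t {
  S_NAN = 1 << 0,       // Signaling NaN
  Q_NAN = 1 << 1,       // Quiet NaN
  N_INFINITY = 1 << 2,  // Negative infinity
  N_NORMAL = 1 << 3,    // Negative normal
  N_SUBNORMAL = 1 << 4, // Negative subnormal
  N_ZERO = 1 << 5,      // Negative zero
  P_ZERO = 1 << 6,      // Positive zero
  P_SUBNORMAL = 1 << 7, // Positive subnormal
  P_NORMAL = 1 << 8,    // Positive normal
  P_INFINITY = 1 << 9,  // Positive infinity
  FP_CLASS_ALL = (1 << 10) - 1 // All ten class bits, i.e. 0x3ff
};

// Hypothetical helper: flip only the bits up to the highest set bit.
// This is the too-narrow behaviour the review comment objects to.
inline uint32_t invertUpToHighestSetBit(uint32_t Mask) {
  uint32_t Cover = Mask;
  // Smear the highest set bit into every lower position.
  Cover |= Cover >> 1;
  Cover |= Cover >> 2;
  Cover |= Cover >> 4;
  Cover |= Cover >> 8;
  Cover |= Cover >> 16;
  return ~Mask & Cover;
}

// Hypothetical helper: flip all ten class bits, as the review suggests.
inline uint32_t invertClassMask(uint32_t Mask) {
  return ~Mask & FP_CLASS_ALL;
}

// Small self-check covering the example from the review comment.
inline void checkExamples() {
  // Narrow mask: 4 xor 7 == 3, so only the two NaN classes would be tested.
  assert(invertUpToHighestSetBit(N_INFINITY) == (S_NAN | Q_NAN));
  // Full mask: every class except negative infinity is tested.
  assert(invertClassMask(N_INFINITY) == 0x3fb);
  // Negating twice gives back the original mask.
  assert(invertClassMask(invertClassMask(S_NAN | Q_NAN)) == (S_NAN | Q_NAN));
  // Negating "all classes" leaves nothing to test.
  assert(invertClassMask(FP_CLASS_ALL) == 0);
}
```

With the full mask, "not class(x, N_INFINITY)" becomes a test for the other nine classes, which is the intended meaning of the fold. In the pass itself the same xor would presumably be applied to the ConstantInt operand rather than to a raw integer.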

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D104049