[PATCH] D104049: [AMDGPU] [CodeGen] Fold negate llvm.amdgcn.class into test mask

Tue Jun 15 10:32:09 PDT 2021

arsenm added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp:23
 #include "llvm/Analysis/ValueTracking.h"
+#include "llvm/CodeGen/GlobalISel/InstructionSelector.h"
 #include "llvm/CodeGen/TargetPassConfig.h"
----------------
Definitely should not include this here

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp:831-839
+    if (isa<CallInst>(LHS) && isa<ConstantInt>(RHS) &&
+        cast<ConstantInt>(RHS)->getSExtValue() == -1)
+      Call = cast<CallInst>(LHS);
+    else if (isa<CallInst>(RHS) && isa<ConstantInt>(LHS) &&
+             cast<ConstantInt>(LHS)->getSExtValue() == -1)
+      Call = cast<CallInst>(RHS);
+    else {
----------------
I believe PatternMatch has a nicer way to check for a not

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp:847
+      IntrinsicCall->getIntrinsicID() != Intrinsic::amdgcn_class ||
+      !isa<Constant>(IntrinsicCall->getOperand(1)))
+    return false;
----------------
I'd prefer to just use the dyn_cast below and return on that rather than checking isa first

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp:854
+  int64_t BitWidth = Arg->getBitWidth();
+  APInt NewArg(BitWidth, ~Arg->getZExtValue());
+
----------------
You don't need this intermediate APInt (plus this is required to be an i32)

================
Comment at: llvm/test/CodeGen/AMDGPU/amdgpu-codegenprepare-foldnegate.ll:54
+  %2 = xor i1 %1, -1
+  %3 = xor i1 %2, -1
+  %4 = xor i1 %2, -1
----------------
It would be a better test to have a non-identical use (i.e. just add a store of the value)

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104049/new/

https://reviews.llvm.org/D104049