[PATCH] D104049: [AMDGPU] [CodeGen] Fold negate llvm.amdgcn.class into test mask

Wed Jun 16 02:52:10 PDT 2021

foad added a comment.

Please run all of check-llvm-codegen-amdgpu. I tried your patch and it looks like a couple more tests need updating.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp:17-39
+#include "llvm-c/Core.h"
+#include "llvm/ADT/SmallVector.h"
 #include "llvm/Analysis/AssumptionCache.h"
 #include "llvm/Analysis/ConstantFolding.h"
 #include "llvm/Analysis/LegacyDivergenceAnalysis.h"
 #include "llvm/Analysis/ValueTracking.h"
 #include "llvm/CodeGen/TargetPassConfig.h"
----------------
I don't think you need //any// of these changes to the #includes.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp:825
+  IntrinsicInst *IntrinsicCall = nullptr;
+  if (!I.hasOneUse())
+    return false;
----------------
This check is wrong. It's OK for there to be multiple uses of the Xor, since you replace all of them. It is not OK for there to be other uses of the intrinsic call, since you modify it in-place.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104049/new/

https://reviews.llvm.org/D104049