[PATCH] D65088: [AMDGPU][RFC] New llvm.amdgcn.ballot intrinsic

Nicolai Hähnle via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jul 24 01:12:11 PDT 2019


nhaehnle added a comment.

Thanks for doing this. For the codegen quality question, I wonder if something like the following could be done:

1. Remove what's currently in SITargetLowering.
2. Add a custom DAG combine which combines `(ballot (ISD::SETCC ...))` to `(AMDGPUISD::SETCC ...)`
3. Do code generation for the remaining ballot cases either in AMDGPUISelDAGToDAG or via a TableGen pattern (the most generic case would have to map to some combination of COPY_TO_REGCLASS and S_AND_B32/B64 <https://reviews.llvm.org/B64>).



================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:4181-4182
+    if (ConstSrc->isNullValue())
+      // Optimization: ballot(0) returns all zeroes.
+      return DAG.getConstant(0, DL, VT);
+    if (ConstSrc->isOne()) {
----------------
foad wrote:
> arsenm wrote:
> > Should also do this in instcombine, as icmp/fcmp already do
> I've had a go, but is it really necessary to implement the same optimization cases in two places (especially the one that just replaces a ballot intrinsic with a read_register intrinsic), and how do I test them independently?
Independent tests as opt tests should go into test/Transforms/InstCombine. You can find existing tests for the icmp/fcmp intrinsics there.

Nitpick: please add braces for the if-body (multiple lines guarded by if).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65088/new/

https://reviews.llvm.org/D65088





More information about the llvm-commits mailing list