[PATCH] D152826: [AArch64] Improve shuffles of i1 vectors (WIP)

Thu Jun 29 03:49:38 PDT 2023

dmgreen added a comment.

Hello. Sorry for the delay. This seems to work OK for AArch64, but I'm not sure how well it would suit other architectures. It might be worth adding it to AArch64ISelLowering instead, where it is easier to make assumptions about which types would be better.

I tried it with the motivating example, and whilst it did reduce the instruction count by a lot, it wasn't enough to make the vector code profitable compared to scalar. There just seems to be too much shuffling going on. I am attempting to see if I can stop the vectorization in that case instead. We've seen this come up a few times though, so it would be good to get an improvement like this in if we can.

================
Comment at: llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp:24742
+  if (N0.getOpcode() == ISD::SETCC)
+    NewN = DAG.getSetCC(SDLoc(SV), N0.getOperand(0).getValueType(),
+                        N0.getOperand(0), N0.getOperand(1),
----------------
This creates a setcc whose result type is set to the input type? What happens for targets that have native predicate registers and so want to use a native vxi1 datatype?

There is a getSetCCResultType method through which the target specifies what the setcc result type should be. There is also TargetLowering::BooleanContent, which specifies whether the bits of the predicate produced by a setcc are 0/1, 0/-1, or have undefined top bits.
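
To make the three boolean conventions concrete, here is a minimal stand-alone sketch of what a single "true"/"false" lane looks like once it is widened to 8 bits. The enum names loosely mirror LLVM's, but `BooleanContent`, `widenBool` and the 0xAA filler are hypothetical and only for illustration; this is a toy model, not LLVM code.

```cpp
#include <cstdint>

// Toy stand-in for the three conventions a target can report for setcc results.
// Hypothetical model only; it does not use or reproduce any LLVM API.
enum class BooleanContent { Undefined, ZeroOrOne, ZeroOrNegativeOne };

// Widen a one-bit compare result to an 8-bit lane under the given convention.
inline std::uint8_t widenBool(bool Bit, BooleanContent BC) {
  switch (BC) {
  case BooleanContent::ZeroOrOne:
    return Bit ? 0x01 : 0x00; // true is 1: upper bits are zero
  case BooleanContent::ZeroOrNegativeOne:
    return Bit ? 0xFF : 0x00; // true is -1: every bit is set
  case BooleanContent::Undefined:
    // Only bit 0 is meaningful. 0xAA is an arbitrary filler (its bit 0 is
    // clear) standing in for "whatever happens to be in the upper bits".
    return static_cast<std::uint8_t>(0xAA | (Bit ? 1 : 0));
  }
  return 0; // unreachable; keeps compilers quiet about missing returns
}
```

A combine that builds a wider setcc and then treats its lanes as a mask is only safe if it accounts for which of these conventions the target uses; under the Undefined convention, for instance, comparing the whole lane against zero would give the wrong answer.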

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D152826