[llvm] [AArch64] Generate rev16 for certain uses of __builtin_bswap16 (PR #105375)
David Green via llvm-commits
llvm-commits at lists.llvm.org
Thu Sep 5 12:44:54 PDT 2024
================
@@ -22212,6 +22212,26 @@ static SDValue performExtendCombine(SDNode *N,
N->getOperand(0)->getOpcode() == ISD::SETCC)
return performSignExtendSetCCCombine(N, DCI, DAG);
+ // If we see (any_extend (bswap ...)) with bswap returning an i16, we know
+ // that the top half of the result register must be unused, due to the
+ // any_extend. This means that we can replace this pattern with (rev16
+ // (any_extend ...)). This saves a machine instruction compared to (lsr (rev
+ // ...)), which is what this pattern would otherwise be lowered to.
+ // Only apply this optimisation if any_extend in original pattern to i32 or
+ // i64, because this type will become the input type to REV16 in the new
+ // pattern, so must be a legitimate REV16 input type.
+ if (N->getOpcode() == ISD::ANY_EXTEND &&
+ N->getOperand(0).getOpcode() == ISD::BSWAP &&
+ N->getOperand(0).getValueType() == MVT::i16 &&
+ (N->getValueType(0) == MVT::i32 || N->getValueType(0) == MVT::i64)) {
+ SDNode *BswapNode = N->getOperand(0).getNode();
----------------
davemgreen wrote:
Consider `SDValue BSwap = N->getOperand(0);` here; the code below can then use `BSwap.getOperand(0)` instead of going through the `SDNode *`.
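
For readers following the thread, here is a minimal standalone sketch of why the combine in the hunk above is sound. It is not part of the patch and uses no LLVM APIs; `rev16_w`, `rev_w`, `bswap16`, and `lowHalfAgrees` are hypothetical helper names that model the instructions in software, on the assumption that REV16 swaps the bytes within each 16-bit halfword and REV reverses all four bytes of a 32-bit register.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical software model of AArch64 REV16 on a 32-bit register:
// swap the two bytes inside each 16-bit halfword independently.
static uint32_t rev16_w(uint32_t x) {
  return ((x & 0x00FF00FFu) << 8) | ((x & 0xFF00FF00u) >> 8);
}

// Hypothetical software model of REV on a 32-bit register:
// reverse all four bytes.
static uint32_t rev_w(uint32_t x) {
  return (x << 24) | ((x & 0x0000FF00u) << 8) |
         ((x & 0x00FF0000u) >> 8) | (x >> 24);
}

// Reference 16-bit byte swap, standing in for __builtin_bswap16.
static uint16_t bswap16(uint16_t x) {
  return static_cast<uint16_t>((x << 8) | (x >> 8));
}

// Returns true when both lowerings produce bswap16(value) in the low 16 bits,
// regardless of what `upper` places in the top half of the register. The top
// half models the undefined bits introduced by any_extend.
static bool lowHalfAgrees(uint16_t value, uint16_t upper) {
  uint32_t reg = (static_cast<uint32_t>(upper) << 16) | value;
  uint16_t viaRev16 = static_cast<uint16_t>(rev16_w(reg));        // one instruction
  uint16_t viaRevLsr = static_cast<uint16_t>(rev_w(reg) >> 16);   // rev, then lsr #16
  return viaRev16 == bswap16(value) && viaRevLsr == bswap16(value);
}
```

Calling `lowHalfAgrees` for every 16-bit `value` with a few different `upper` patterns should return true each time, which matches the reasoning in the code comment: only the low half of the result is observable after the any_extend, so the single REV16 can stand in for the REV plus shift.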
https://github.com/llvm/llvm-project/pull/105375