[PATCH] D138904: [AArch64] Transform shift+and to shift+shift to select more shifted register
Mingming Liu via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Nov 29 17:37:23 PST 2022
mingmingl added inline comments.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp:632
+
+ unsigned ShiftAmtC = ShiftAmtNode->getZExtValue();
+ SDValue RHS = N.getOperand(1);
----------------
nit: use `uint64_t` (the same type as the return value of `getZExtValue()`) to avoid implicit-truncation warnings.
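Something like the following one-line sketch (`ShiftAmtNode` comes from the patch context above):

```
// ConstantSDNode::getZExtValue() returns uint64_t; matching the type
// avoids an implicit-truncation warning.
uint64_t ShiftAmtC = ShiftAmtNode->getZExtValue();
```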
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp:647-671
+ if (LHSOpcode == ISD::SHL) {
+ if (LowZBits <= ShiftAmtC || (BitWidth != LowZBits + MaskLen))
+ return false;
+
+ NewShiftC = LowZBits - ShiftAmtC;
+ NewShiftOp = VT == MVT::i64 ? AArch64::UBFMXri : AArch64::UBFMWri;
+ } else {
----------------
nitpick:
To ensure that UBFM/SBFM is used as a right shift (essentially [[ https://developer.arm.com/documentation/ddi0596/2021-12/Base-Instructions/UBFX--Unsigned-Bitfield-Extract--an-alias-of-UBFM-?lang=en | UBFX ]]) here, add `assert(BitWidth - 1 >= NewShiftC && "error message")`.
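A minimal sketch of the suggested assertion (reusing `BitWidth` and `NewShiftC` from the patch; the message text is only a placeholder):

```
// UBFM/SBFM shift immediates must stay within [0, BitWidth - 1] for the
// alias to remain a right shift (UBFX/SBFX).
assert(BitWidth - 1 >= NewShiftC && "expected a valid right-shift amount");
```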
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp:654-655
+ } else {
+ if (LowZBits == 0)
+ return false;
+
----------------
Relatedly, with 1) N as `and (srl x, c), mask` and 2) LowZBits == 0, `N` should be selected as UBFX by [[ https://github.com/llvm/llvm-project/blob/6a91a5e82647f206a31706586d201ecc638e9365/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp#L2174 | isBitfieldExtractOp ]] //before// the tablegen patterns (where `AArch64DAGToDAGISel::SelectShiftedRegister` gets called) apply. Similarly, with `and (shl x, c), mask` and LowZBits == 0, `N` is a UBFIZ.
So this case shouldn't be seen in practice, but an early exit (rather than an assertion) looks okay.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp:657-658
+
+ if (LHSOpcode == ISD::SRA && (BitWidth != LowZBits + MaskLen))
+ return false;
+
----------------
For correctness, a similar check is needed when LHSOpcode is ISD::SRL; for `srl`, though, the condition `BitWidth == LowZBits + MaskLen + ShiftAmtC` should be sufficient.
For example, https://gcc.godbolt.org/z/YvGG3Pov3 shows the IR at trunk; with the current patch, the optimized code below changes the result (i.e., it is incorrect):
```
lsr x8, x0, #3
add x0, x1, x8, lsl #1
ret
```
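A sketch of the suggested guard, reusing the names from the patch context (its placement next to the existing ISD::SRA check is an assumption):

```
// For srl, require LowZBits + MaskLen + ShiftAmtC to span the full width;
// otherwise set bits beyond the mask survive the rewritten shift pair.
if (LHSOpcode == ISD::SRL && BitWidth != LowZBits + MaskLen + ShiftAmtC)
  return false;
```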
================
Comment at: llvm/test/CodeGen/AArch64/shiftregister-from-and.ll:137-139
+; negative test: type is not i32 or i64
+
+define i16 @shiftedreg_from_and_negative_type(i16 %a, i16 %b) {
----------------
nitpick:
i16 will be legalized to i32 [1], and the patch does optimize `shiftedreg_from_and_negative_type` (trunk: https://gcc.godbolt.org/z/cnac5989e), so this doesn't seem to be a negative test. Consider using a vector type instead, as in https://gcc.godbolt.org/z/T4szTsEav.
[1] `llc -O3 -mtriple=aarch64 -debug -print-after-all test.ll` prints debug logs; https://gist.github.com/minglotus/763173f099bec76c720ec9ce7c181471 shows the legalization step.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D138904/new/
https://reviews.llvm.org/D138904