[PATCH] D62100: [DAGCombine][X86][AMDGPU][AArch64] (srl (shl x, c1), c2) with c1 != c2 handling
Roman Lebedev via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sat May 18 15:18:17 PDT 2019
lebedev.ri added subscribers: arsenm, compnerd.
lebedev.ri marked 2 inline comments as done.
lebedev.ri added a comment.
Looked at changes:
- I'll leave x86 vector stuff for later. since i actually wanted to look into reverse trasnform, and looked into this only for consistency.
- I don't know what to do with AArch64 regression. I can hide it with `shouldFoldConstantShiftPairToMask()`, but it is there regardless (tests added). Thoughts?
- That leaves AMDGPU?
================
Comment at: test/CodeGen/AArch64/arm64-bitfield-extract.ll:396
; LLC-NEXT: ldr w8, [x0]
+; LLC-NEXT: and w8, w8, #0x3ffffff8
; LLC-NEXT: bfxil w8, w1, #16, #3
----------------
After actually looking at `-debug` output, this regression happens because of `SimplifyDemandedBits()`,
which ignores `AArch64TargetLowering::isDesirableToCommuteWithShift()` override.
So when we get to `AArch64TargetLowering::isBitfieldExtractOp()`, we have
```
t18: i32 = srl t15, Constant:i64<2>
t15: i32 = or t26, t24
t26: i32 = and t7, Constant:i32<1073741816>
t7: i32,ch = load<(load 4 from %ir.y, align 8)> t0, t2, undef:i64
t2: i64,ch = CopyFromReg t0, Register:i64 %0
t1: i64 = Register %0
t6: i64 = undef
t25: i32 = Constant<1073741816>
t24: i32 = and t12, Constant:i32<4>
t12: i32 = srl t4, Constant:i64<16>
t4: i32,ch = CopyFromReg t0, Register:i32 %1
t3: i32 = Register %1
t11: i64 = Constant<16>
t23: i32 = Constant<4>
t17: i64 = Constant<2>
```
I'm not sure how to turn this pattern on it's head to produce `ubfx` again.
================
Comment at: test/CodeGen/AMDGPU/llvm.amdgcn.ubfe.ll:686-687
; SI-NEXT: s_waitcnt vmcnt(0)
-; SI-NEXT: v_lshlrev_b32_e32 v0, 31, v0
-; SI-NEXT: v_lshrrev_b32_e32 v0, 1, v0
+; SI-NEXT: v_lshlrev_b32_e32 v0, 30, v0
+; SI-NEXT: v_and_b32_e32 v0, 2.0, v0
; SI-NEXT: buffer_store_dword v0, off, s[4:7], 0
----------------
@arsenm will AMDGPU prefer 2 shifts or shift+mask here?
Repository:
rL LLVM
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D62100/new/
https://reviews.llvm.org/D62100
More information about the llvm-commits
mailing list