[PATCH] D88783: [InstCombine] matchFunnelShift - fold or(shl(a,x),lshr(b,sub(bw,x))) -> fshl(a,b,x) iff x < bw
Simon Pilgrim via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Oct 8 06:39:58 PDT 2020
RKSimon added inline comments.
================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp:2090-2091
+ if (match(R, m_OneUse(m_Sub(m_SpecificInt(Width), m_Specific(L))))) {
+ KnownBits KnownL = IC.computeKnownBits(L, /*Depth*/ 0, &Or);
+ return KnownL.getMaxValue().ult(Width) ? L : nullptr;
+ }
----------------
lebedev.ri wrote:
> spatel wrote:
> > lebedev.ri wrote:
> > > Either this needs a motivational comment, or this can be dropped.
> > > ```
> > > ----------------------------------------
> > > define i64 @src(i64 %x, i64 %y, i64 %a) {
> > > %0:
> > > %mask = and i64 %a, -1
> > > %shl = shl i64 %x, %mask
> > > %sub = sub nsw nuw i64 64, %mask
> > > %shr = lshr i64 %y, %sub
> > > %r = or i64 %shl, %shr
> > > ret i64 %r
> > > }
> > > =>
> > > define i64 @tgt(i64 %x, i64 %y, i64 %a) {
> > > %0:
> > > %r = fshl i64 %x, i64 %y, i64 %a
> > > ret i64 %r
> > > }
> > > Transformation seems to be correct!
> > >
> > > ```
> > We can correctly convert to the intrinsic because the intrinsic makes guarantees about the shift amount (modulo bitwidth) that may not be present in the original IR.
> >
> > But without proper masking/select for UB shift amounts, the expanded code will be worse in codegen. So this is an irreversible transform, and we can't do it universally.
> >
> > So even with the computeKnownBits restriction in this patch, this transform may not be allowed. For example, can we fix this in codegen:
> > https://alive2.llvm.org/ce/z/IsuRoN
> >
> > ```
> > $ cat fsh.ll
> > define i32 @fake_fshl(i32 %x, i32 %y, i32 %a) {
> > %mask = and i32 %a, 31
> > %shl = shl i32 %x, %mask
> > %sub = sub nuw nsw i32 32, %mask
> > %shr = lshr i32 %y, %sub
> > %r = or i32 %shl, %shr
> > ret i32 %r
> > }
> >
> > define i32 @fshl(i32 %x, i32 %y, i32 %a) {
> > %r = call i32 @llvm.fshl.i32(i32 %x, i32 %y, i32 %a)
> > ret i32 %r
> > }
> >
> > declare i32 @llvm.fshl.i32(i32, i32, i32)
> >
> > $ llc -o - fsh.ll -mtriple=riscv64
> > fake_fshl: # @fake_fshl
> > sllw a0, a0, a2
> > addi a3, zero, 32
> > sub a2, a3, a2
> > srlw a1, a1, a2
> > or a0, a0, a1
> > ret
> > fshl: # @fshl
> > andi a2, a2, 31
> > sll a0, a0, a2
> > not a2, a2
> > slli a1, a1, 32
> > srli a1, a1, 1
> > srl a1, a1, a2
> > or a0, a0, a1
> > ret
> >
> > ```
> > We can correctly convert to the intrinsic because the intrinsic makes guarantees about the shift amount (modulo bitwidth) that may not be present in the original IR.
>
> Yep, the intrinsic has implicit masking, as we can see from the explicit mask being dropped after the transform.
>
> > But without proper masking/select for UB shift amounts, the expanded code will be worse in codegen. So this is an irreversible transform, and we can't do it universally.
>
> Yep, that is why I'm asking for more blurb.
> The problematic case I have in mind is: what if we were able to prove that the value is limited because we had range info on a load?
At least part of the riscv64 issue is due to it deciding to promote the fshl to i64 but not the or+shifts pattern (IIRC the sllw/srlw variants are i32 shifts with mod-32 amounts that sign-extend their results to i64).
```
SelectionDAG has 15 nodes:
t0: ch = EntryToken
t2: i64,ch = CopyFromReg t0, Register:i64 %0
t4: i64,ch = CopyFromReg t0, Register:i64 %1
t16: i64 = shl t4, Constant:i64<32>
t6: i64,ch = CopyFromReg t0, Register:i64 %2
t21: i64 = and t6, Constant:i64<31>
t18: i64 = fshl t2, t16, t21
t13: ch,glue = CopyToReg t0, Register:i64 $x10, t18
t14: ch = RISCVISD::RET_FLAG t13, Register:i64 $x10, t13:1
```
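To make the masking discussion above concrete: the intrinsic's implicit mod-bitwidth shift amount can be sketched with a small model (a hypothetical Python model of llvm.fshl.i32, written for illustration only; not LLVM code):

```python
def fshl32(x, y, a):
    """Model of llvm.fshl.i32: concatenate x (high) and y (low) into a
    64-bit value, shift left by (a mod 32), and return the high 32 bits.
    The shift amount is implicitly masked, so a == 0 (or 32) returns x."""
    n = a % 32
    hi = (x << n) & 0xFFFFFFFF
    lo = (y >> (32 - n)) if n else 0  # guard the shift-by-32 case
    return hi | lo
```

With a == 0 this returns x, matching the intrinsic; the naive lshr(y, sub(32, a)) expansion would shift by 32 there, which is poison in IR — hence the `x < bw` requirement in this patch.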
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D88783/new/
https://reviews.llvm.org/D88783