[PATCH] D56289: Added single use check to ShrinkDemandedConstant
Stanislav Mekhanoshin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jan 7 13:14:36 PST 2019
rampitec added a comment.
In D56289#1348541 <https://reviews.llvm.org/D56289#1348541>, @rampitec wrote:
> In D56289#1348529 <https://reviews.llvm.org/D56289#1348529>, @rampitec wrote:
>
> > In D56289#1348172 <https://reviews.llvm.org/D56289#1348172>, @fhahn wrote:
> >
> > > I am seeing a regression with this change on AArch64, where we do not use shrinked constants, e.g. for and/or if they have multiple users. I am not entirely sure what the problem is this patch is solving, as I am not familiar with AMDGPU. Is it a case where shrinking the constant leads to shrinking the related `Op` and in the end this introduces unnecessary conversion code, as multiple users require different bitwidths?
> > >
> > > If that is the case, I am not sure if this is the best place to fix the issue. IIUC, shrinkDemandedBits just shrinks the constant of the passed in `Op`, but not the width of the result type of `Op`. So it seems to me that shrinking the constant should be independent of the users of `Op`.
> >
> >
> > Here is DAG exposing the problem:
> >
> > t33: i32 = or t42, Constant:i32<-2147483647>
> > t37: f32 = bitcast t33
> > t66: f32 = CVT_F32_UBYTE0 t33
> > t38: f32 = fadd t37, t66
> >
> >
> >
> > The node t66 only needs a low byte from the t33 "or" node. The other use of "or" is bitcast and subsequently t38 fadd which needs full dword.
> > The "Op" argument of ShrinkDemandedConstant() is t33 "or" and Demanded = 0xff as needed for t66.
> >
> > If optimization succeeds it will replace "or t42, 0x80000001" with "or t42, 1". However, this "or" has another use which needs the full range of the value.
> > That is a bug not to take other uses into account.
> >
> > It is unfortunate this lead to regression on AArch64 side. Are you sure a similar issue can never happen with AArch64?
> > Theoretically that is possible to move a single use check past targetShrinkDemandedConstant() call which actually do the job for AArch I presume.
> > I think it should correct lack of optimization in that specific case, but I am not sure there will be no correctness issues elsewhere. Here is what I am speaking about:
> >
> > --- a/lib/CodeGen/SelectionDAG/TargetLowering.cpp
> > +++ b/lib/CodeGen/SelectionDAG/TargetLowering.cpp
> > @@ -350,13 +350,13 @@ bool TargetLowering::ShrinkDemandedConstant(SDValue Op, const APInt &Demanded,
> > SDLoc DL(Op);
> > unsigned Opcode = Op.getOpcode();
> >
> > - if (!Op.hasOneUse())
> > - return false;
> > -
> > // Do target-specific constant optimization.
> > if (targetShrinkDemandedConstant(Op, Demanded, TLO))
> > return TLO.New.getNode();
> >
> > + if (!Op.hasOneUse())
> > + return false;
> > +
> > // FIXME: ISD::SELECT, ISD::SELECT_CC
> > switch (Opcode) {
> > default:
> >
> >
>
>
> In fact the bit width of all operations is the same. It just t66 only known to read low byte from its operand. I believe it worth nothing t66 is custom AMDGPU node.
> Consider t66 is a shift left by 24:
>
> t33: i32 = or t42, Constant:i32<-2147483647>
> t66: i32 = shl t33, 24
> t38: i32 = add t33, t66
>
>
> There is nothing target specific here. t66 also only known to read low 8 bits of t33 and that is not unreasonable to call ShrinkDemandedConstant on t33 with Demanded = 0xff. It will expose the same bug regardless of target.
After several hours trying to come up with a testcase which would fail with any other node or target I found none.
I have moved the check into the target: D56406 <https://reviews.llvm.org/D56406>. Maybe later a check will be needed somewhere in the common code.
Repository:
rL LLVM
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D56289/new/
https://reviews.llvm.org/D56289
More information about the llvm-commits
mailing list