[llvm-dev] llvm combines "ADD frameindex, constant" to OR

Wed Mar 13 03:09:14 PDT 2019

Hi,

On Wed, 13 Mar 2019 at 00:25, Kazushi Marukawa via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
>
> Hi all,
>
> I've been working on a backend of our architecture and noticed llvm performs
> following combining although one of operands is FrameIndex.
>
>     Combining: t114: i64 = add FrameIndex:i64<0>, Constant:i64<56>
>     Creating new node: t121: i64 = or FrameIndex:i64<0>, Constant:i64<56>
>      ... into: t121: i64 = or FrameIndex:i64<0>, Constant:i64<56>
>
> This caused problem if frame pointer points 0x60000038 at run time.
>
>
> I checked DAGCombiner::visitADD.  It folds ADD to OR by following code
> without considering about FrameIndex.  This haveNoCommonBitsSet says
> it's safe since FrameIndex(0) is 0.

This sounds like your fundamental problem to me. The generic code path
is computeKnownBits -> computeKnownBitsForFrameIndex ->
InferPtrAlignment. If your stack slot can legitimately end up at an
offset only aligned to 0x8 then that's going wrong somewhere and you
should probably work out why.

I don't really have the details to say for sure, but unless you've
overridden computeKnownBitsForFrameIndex a reasonably likely scenario
is that the alloca for that local variable requested 16-bit alignment
but your sp itself is only routinely aligned to 8 bytes (or less) and
you actually need to realign sp manually in XYZFrameLowering to
support that variable.

The ISD::OR thing can be annoying, but shouldn't affect correctness.

>         // Undo the add -> or combine to merge constant offsets from a frame index.
>         if (N0.getOpcode() == ISD::OR &&
>             isa<FrameIndexSDNode>(N0.getOperand(0)) &&
>             isa<ConstantSDNode>(N0.getOperand(1)) &&
>             DAG.haveNoCommonBitsSet(N0.getOperand(0), N0.getOperand(1))) {
>           SDValue Add0 = DAG.getNode(ISD::ADD, DL, VT, N1, N0.getOperand(1));
>           return DAG.getNode(ISD::ADD, DL, VT, N0.getOperand(0), Add0);
>         }
>
> I think this code may work fine on an architecture using FI + A + B.

It's target independent, and again shouldn't affect correctness. The
entire transformation is (FI | A) + B -> (FI + (A+B)). Useful when it
triggers, but not much else.

>   1. Is there any way to control this to avoid OR folding,
>      like TargetLowering::preferNotCombineToOrFrameIndex?
>      (or should we add new?)

There isn't as far as I know. I've heard it justified as
canonicalization before, and not been entirely convinced myself. But
it's generally pretty easy to cope with in the backend because
SelectionDAG::isBaseWithConstantOffset  deals with it correctly.

>   2. How to optimize FrameIndex?  I desire DAGCombiner use alignment
>      information on data, stack frame, or something.

As I said above, I think it should already use that and something has
gone wrong there in your case.

Cheers.

Tim.