[llvm-commits] [PATCH 2/2] Convert VLDR/VSTR on demand for base update optimization

David Peixotto dpeixott at codeaurora.org
Wed Nov 7 17:36:26 PST 2012


Jim and Andy,

Thanks for all your comments. Sorry for such a delayed response, but it took
a while until I could find an example to help move the discussion forward.
I've put my feedback at the end so we can retain the context and all
remember what we were talking about :)

I'd still like to get these patches merged. I'm going to improve the current
patches based on Jim's earlier feedback and our discussion. I will resubmit
a new patch when we have figured out some of the details.

See below for more details.

> -----Original Message-----
> From: Jim Grosbach [mailto:grosbach at apple.com]
> Sent: Tuesday, September 18, 2012 2:15 PM
> To: Andrew Trick
> Cc: David Peixotto; llvm-commits at cs.uiuc.edu
> Subject: Re: [llvm-commits] [PATCH 2/2] Convert VLDR/VSTR on demand for
> base update optimization
> 
> 
> On Sep 18, 2012, at 1:39 PM, Andrew Trick wrote:
> 
> >
> > On Sep 18, 2012, at 11:33 AM, Jim Grosbach <grosbach at apple.com> wrote:
> >
> >>
> >> On Sep 18, 2012, at 11:09 AM, Andrew Trick wrote:
> >>
> >>> On Sep 18, 2012, at 10:30 AM, David Peixotto
> <dpeixott at codeaurora.org> wrote:
> >>>>>> +// Check to see if we can convert the instruction to a version
> >>>>>> +that // supports writeback to the base register.
> >>>>>> +// We can convert a VLDR/VSTR to VLDM/VSTM as long as the
> >>>>>> +VLDR/VSTR do // not have any pre-increment in the instruction
> >>>>>> +(i.e. the imm in the // instruction should be zero).
> >>>>>> +static bool canConvertToWriteBackInstr(const MachineInstr& MI) {
> >>>>>> +  unsigned Opcode = MI.getOpcode();
> >>>>>> +  switch (Opcode) {
> >>>>>> +  default: return false;
> >>>>>> +  case ARM::VLDRS:
> >>>>>> +  case ARM::VSTRS:
> >>>>>> +  case ARM::VLDRD:
> >>>>>> +  case ARM::VSTRD:{
> >>>>>> +    // Check to make sure we have an instruction of the form
> >>>>>> +    // VLDR <Dd>, [<Rn> {, #+-<imm>]
> >>>>>> +    if (MI.getNumOperands() < 3)
> >>>>>> +      return false;
> >>>>>> +    if (!MI.getOperand(1).isReg()) // Base address
> >>>>>> +      return false;
> >>>>>
> >>>>> What conditions are these statements trying to check for? If they
> >>>>> ever
> >>>> fire,
> >>>>> something has gone seriously wrong, as that would indicate a
> >>>>> malformed instruction.
> >>>>
> >>>> This was just an attempt at defensive programming. I agree that
> >>>> they would indicate a malformed instruction. I'll remove the checks.
> >>>
> >>> I would very much appreciate seeing those checks in the code. They can
> be asserts if Jim insists that they will hold, which is not obvious to
anyone
> reading the ARM backend.
> >>
> >> If those checks don't hold, we have much, much bigger problems. If they
> fail, we don't just have buggy generated code, we have a completely
> malformed instructions. We already have asserts about them in various
> places. For example, the getReg() accessor method first has an
> assert(isReg()) in it. Likewise, the machine verifier will check for this
sort of
> thing.
> >
> > We can all agree that you don't need to assert isReg just before
> > calling getReg :)
> >
> > - David isn't calling getReg
> >
> That's just an example of how and where these sorts of assumptions are
> checked. If you really want to be pedantic, then that exact accessor will
be
> used for the register operands by the instruction printer, and we'll get
an
> assert there, if not earlier, if it's not a register.
> 
> > - The peephole is creating a potentially well-formed but incorrect
> > VLDM based on stale assumptions about VLDR
> >
> 
> If the VLDR is ill-formed, making a bad VLDM is the least of our concerns.
> 
> > - All transformations should either bailout or assert if they find an
"extra"
> implicit operand. It would be nice to get rid of implicit operands, but
they still
> exist. When we drop them, we are unlikely to catch the bug, ever. Is that
a
> good argument for pretending the bug doesn't exist?
> >
> 
> Implicit operands are a completely different matter. These operands are
not
> implicit. They are the ordinary garden variety operands. If they're not
> present or are not of the correct type something went *seriously* wrong
> elsewhere and there's absolutely nothing we can do about it here. The
> machine verifier is where those checks should (and as far as I'm aware,
do)
> happen. We can, and do, assume that instructions are well formed all the
> time.
> 
> VLDR and VSTR are totally generic instructions. There's nothing weird
going
> on with them like variable length register lists, implicit defs or
anything of
> that nature. They have a fixed set of operands which must be of the
correct
> type as specified by the instruction description.
> 
> >>> In general, we should be able to change instruction layouts or add
> implicit operands in the future without having arbitrary peephole opts
> silently generate incorrect code.
> >>>
> >>>>>> +
> >>>>>> +  // Transfer any extra operands (e.g. implicit defs/kills)  for
> >>>>>> + (unsigned int i = 5; i < MI->getNumOperands(); ++i)
> >>>>>> +    MIB.addOperand(MI->getOperand(i));
> >>>>>> +
> >>>>>
> >>>>> What cases have you encountered where a VLDR/VSTR instruction
> has
> >>>>> any additional operands that need to be copied?
> >>>>
> >>>> I had taken this code from elsewhere in the file. Perhaps it is not
> >>>> necessary. The one time I can think of it having additional
> >>>> operations would be an implicit def of a D register when loading an
> >>>> S register. I'm not sure if those are modeled like that, so I'm
> >>>> happy to remove it if it looks unnecessary to you.
> >>>
> >>> You definitely need to check for additional operands before doing this
> transformation. The question is do you copy them, bailout, or assert. To
> answer that question we need to know where else we explicitly verify the
> lack of additional operands. I'm not yet aware of any place.
> >>
> >> When will a VLDR or a VSTR instruction have additional operands other
> than the MemOperands, which are copied over afterwards?
> >
> >
> > Where and how do we verify that they don't?
> 
> I'm obviously not being clear. What sorts of additional operands can ever
be
> valid on a VLDR instruction? If they can't ever exist, we shouldn't bother
> checking for them.
> 
> I'll ask again. Please, anyone, give me an example of what we're checking
for
> here. All I'm hearing is "well, maybe it could happen somehow." Maybe it
> can. I'm not convinced either way. If we don't have a concrete example of
> how they should be there, and be correct, then it's a bug for them to ever
be
> there and we should assert if we see them. I want clarity one direction or
the
> other.

Ok, I have found a case where VSTR has an extra implicit operand. The extra
operand is an implicit use of a D register when the VSTR is storing an S
register. The bitcode for the example looks like :

define float* @foo(float* noalias nocapture %out, float %a, float %b)
nounwind {
entry:
  %c = fadd float %a, %b
  store float %c, float* %out, align 4
  %ptr = getelementptr float* %out, i32 2
  ret float* %ptr
}

Compiling with llc -mcpu=cortex-a8 produces this assembly:

foo:                                    @ @foo
@ BB#0:                                 @ %entry
        vmov    d16, r2, r2
        vmov    d17, r1, r1
        vadd.f32        d0, d17, d16
        vstr    s0, [r0]
        add     r0, r0, #8
        bx      lr

And if I dump the machine instructions, I see the following (note this is
llvm tip without my patches):

# After ARM load / store optimizer:
# Machine code for function foo: Post SSA
Function Live Ins: %R0 in %vreg0, %R1 in %vreg1, %R2 in %vreg2
Function Live Outs: %R0

BB#0: derived from LLVM BB %entry
    Live Ins: %R0 %R1 %R2
        %D16<def> = VMOVDRR %R2<kill>, %R2, pred:14, pred:%noreg
        %D17<def> = VMOVDRR %R1<kill>, %R1, pred:14, pred:%noreg
        %D0<def> = VADDfd %D17<kill>, %D16<kill>, pred:14, pred:%noreg
        VSTRS %S0, %R0, 0, pred:14, pred:%noreg, %D0<imp-use,kill>;
mem:ST4[%out]
        %R0<def> = ADDri %R0<kill>, 8, pred:14, pred:%noreg, opt:%noreg
        BX_RET pred:14, pred:%noreg, %R0<imp-use>

The VSTRS has %D0 as an implicit use.

With my current patches, I was copying the extra operands when converting
the VSTR to VSTM. Doing so actually causes a problem when computing the size
of the VSTM transfer, which is computed in the getLSMultipleTransferSize()
function as

    return (MI->getNumOperands() - MI->getDesc().getNumOperands() + 1) * 4;

The extra implicit operand causes the transfer size to be computed as 8
instead of 4. I have fixed this locally by computing transfer size based on
MI->getNumExplicitOperands().

So now that we have an example where the VSTR can have implicit operands,
how should I proceed? It seems to me the proper thing to do is to copy the
implicit operands, but I guess the concern is that VSTM are not expected to
have implicit operands. One thing that is unclear to me is how the implicit
operands are actually used? Is there some documentation or an explanation of
the implicit operands I can refer to?

Thanks for your thoughts and help on this matter.

-Dave

-- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted
by The Linux Foundation




More information about the llvm-commits mailing list