[llvm] r174904 - Extend Hexagon hardware loop generation to handle various additional cases:

Mon May 6 06:19:14 PDT 2013

----- Original Message -----
> From: "Brendon Cahoon" <bcahoon at codeaurora.org>
> To: "Hal Finkel" <hfinkel at anl.gov>, "Krzysztof Parzyszek" <kparzysz at codeaurora.org>
> Cc: llvm-commits at cs.uiuc.edu
> Sent: Wednesday, May 1, 2013 1:41:01 PM
> Subject: RE: [llvm] r174904 - Extend Hexagon hardware loop generation to handle various additional cases:
> 
> Hi Hal,
> 
> We recently had a bug opened internally on a very similar loop.  For us,
> the problem occurs when we set the hardware loop counter to 0, and the
> loop decrements the value in the first iteration.  If it's an unsigned
> value, then the loop should execute MAX_INT iterations, but our hardware
> loop instruction only executes the loop once.  If it's signed, I don't
> think it matters since the behavior is undefined.
> 
> I think you're on the right track about getting help from an IR-level
> pass.  I think it may be enough to have the signed/unsigned wrap flag
> propagated to the MI level for the induction variable decrement.  If the
> nuw/nsw flag is not there, then don't generate a hardware loop.  I'm not
> sure it's that desirable a solution though.
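
For concreteness, here is a minimal sketch of the zero-count case described above. Everything in it is an assumption made for illustration: the counter is narrowed to 8 bits so the wrap is quick to observe, the function names are hypothetical, and modeled_hw_loop_iterations() only mimics the "count of 0 runs the body once" behavior as described in this thread, not the actual Hexagon loop instruction semantics.

```c
#include <stdint.h>

/* Source-level semantics: a do-while loop that decrements an unsigned
 * counter before testing it.  Starting from 0, the first decrement wraps
 * to 255 (8-bit counter assumed here), so the body runs 256 times. */
unsigned source_loop_iterations(uint8_t n) {
    unsigned iters = 0;
    do {
        ++iters;   /* loop body */
        --n;       /* unsigned decrement, wraps modulo 256 */
    } while (n != 0);
    return iters;
}

/* Hypothetical model of a hardware loop as described in the thread:
 * the body always runs once, and a count of 0 is NOT treated as a
 * wrapped (maximal) trip count. */
unsigned modeled_hw_loop_iterations(uint8_t count) {
    unsigned iters = 0;
    for (;;) {
        ++iters;          /* loop body */
        if (count <= 1)   /* no further iterations remain */
            break;
        --count;
    }
    return iters;
}
```

With these definitions the two agree for a count of 3 (three iterations each), but for a count of 0 the source loop runs 256 times while the modeled hardware loop runs once, which is the mismatch a missing nuw/nsw flag would leave unguarded.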

It took some time, but I now have this working as an IR-level pass (plus some intrinsics and a cleanup pass that sets live-in flags and
removes some dead use flags). I have some code cleanup to do, and I plan on committing this, probably tomorrow (after 3.3 branches).

 -Hal

> 
> Thanks,
> Brendon
> 
> -----Original Message-----
> From: llvm-commits-bounces at cs.uiuc.edu
> [mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Hal Finkel
> ----- Original Message -----
> > From: "Krzysztof Parzyszek" <kparzysz at codeaurora.org>
> > On 5/1/2013 8:16 AM, Hal Finkel wrote:
> > >
> > > I realized my comment could be a little unclear. There is no loop
> > > guard for 2008-04-20-LoopBug2.c because there shouldn't be. It is a
> > > do-while loop, and so the first iteration is always executed.
> > > The problem is that the lack of the guard means nothing prevents us
> > > from generating negative count values.
> > >
> > > I can't think of any great solutions to this problem. We can check
> > > the predecessors for simple guards, but they might be difficult to
> > > find if they've been hoisted out of loops and/or combined with other
> > > conditions. Alternatively, maybe the counts should be generated by
> > > some IR-level pass (maybe using SCEV, and put into some intrinsic,
> > > selected to a pseudo instruction, that can either be converted by
> > > the hardware loops pass, or DCE'd).
> > 
> > I didn't have time to look into this problem yet, but given this
> > explanation I agree that there is a problem with unguarded loops.
> > I'm not sure what to do with it either.  We actually have thought that
> > MI-level SCEV would be nice, but that's where it ended.
> > 
> > What I'm thinking about at the moment is some sort of loop
> > transformation that would prepare loops for converting into HW form,
> > but I have nothing concrete yet.
> > 
> > Thanks for pointing this out.  This definitely needs to be addressed.
> 
> Alright, I'll work on this today. If I come up with something worthwhile,
> then you can be the copier this time ;)
> 
>  -Hal
> 
> > 
> > -Krzysztof
> 