[llvm] r182023 - PPC32 cannot form counter loops around i64 FP conversions

Thu May 16 14:33:00 PDT 2013

Hal Finkel <hfinkel at anl.gov> wrote:

> > There are essentially two problems:
> >
> >  1. Generating an expression for the count.
> >  2. Using the counter-based loop instructions.
> >
> > As we've discovered, (1) should really be done at the IR level. Doing
> > this at the MI level essentially means reimplementing large parts of
> > SE at the MI level, and the resulting expressions often contain
> > min/maxs (which turn into selects), etc. and an entire custom code
> > generator for these expressions is also difficult to maintain and
> > produces suboptimal code.
> >
> > As for (2), this should really be done at the MI level. That's where
> > we can really detect interfering uses of the counter register (and
> > avoid problems such as this).
> >
> > So to solve problems with generating good (and correct) code for the
> > counts, I tried moving everything from the MI level into the IR
> > level. Maybe it is possible to do the count-expression generation at
> > the IR level and the actual loop conversion at the MI level. I'll
> > try to make it work this way.
>
> I should add: The IR-level pass can use SE to identify a countable
> backedge, and then insert a count. The problem now becomes, at the
> MI level, to make sure that we identify that same backedge for
> branch conversion (and to make sure that nothing in the meantime
> invalidates the count).

It seems to me that attempting to introduce this sort of "tight
coupling" between an IR pass and a later MI pass will probably
lead to problems as well.

I'd instead suggest to have two self-contained passes that are
only loosely coupled.  First, an IR pass recognizes likely CTR
loops and rewrites them on the IR level into counting-down loops,
that is, loops that use regular IR to describe a counter being
set to an initial value, counting it down, and testing it against
zero as condition of the loop back-edge branch.  (This
transformation as such can never lead to wrong code generation
no matter what happens later.  In fact, I'd assume that there
are already loop optimizers that perform exactly this type of
transformation ...)
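
As a minimal sketch of the kind of rewrite meant here, shown in C
rather than IR: the first function counts up and compares against
n, the second computes the trip count once, counts it down, and
tests it against zero on the back-edge.  The names sum_up and
sum_down are made up for this illustration and do not come from
any existing pass.

```c
#include <stddef.h>
#include <stdint.h>

/* Original form: the induction variable counts up and is compared
   against the bound on every iteration. */
uint64_t sum_up(const uint32_t *a, size_t n) {
    uint64_t s = 0;
    for (size_t i = 0; i < n; ++i)
        s += a[i];
    return s;
}

/* Counting-down form: the trip count is computed before the loop,
   decremented once per iteration, and tested against zero on the
   back-edge.  Everything here is ordinary arithmetic and control
   flow, so the result is the same however it is lowered later. */
uint64_t sum_down(const uint32_t *a, size_t n) {
    uint64_t s = 0;
    size_t count = n;      /* initial counter value */
    if (count == 0)        /* guard: the do/while body runs at least once */
        return s;
    do {
        s += *a++;
    } while (--count != 0);
    return s;
}
```

Both functions return the same value for every input; only the
shape of the loop differs.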

Later on, an MI pass detects loops that look on the MI level like
counting-down loops and, if possible, allocates the counter into
the CTR register and then uses bdnz.  At this point there is no
need to attempt to rewrite the loop if it doesn't already look
like a counting-down loop.  (Again, this transformation is easy
to verify and can never lead to wrong code generation.  The worst
that could happen is that the MI pass no longer recognizes the loop
or isn't able to handle it; but then it will just get emitted
as a counting-down loop using "normal" instructions instead.)
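
The decision such a late pass makes can be modelled roughly as
below.  This is a toy sketch in C, not LLVM code: struct loop_info,
its fields, and choose_lowering are hypothetical names invented for
this illustration, and a real pass would derive these facts from
the machine instructions themselves.

```c
#include <stdbool.h>

/* Hypothetical summary of a loop as seen by a late machine-level
   pass.  All fields are assumptions made for this sketch. */
struct loop_info {
    bool counts_down_to_zero; /* back-edge is "decrement, branch if nonzero" */
    bool counter_fits_ctr;    /* counter width matches the CTR register */
    bool body_clobbers_ctr;   /* e.g. a call inside the loop body */
};

enum loop_lowering { LOWER_GENERIC, LOWER_CTR };

/* Use the CTR form only when every condition holds; otherwise keep
   the loop as an ordinary counting-down loop.  Declining is always
   safe because the generic form is already correct. */
enum loop_lowering choose_lowering(const struct loop_info *li) {
    if (!li->counts_down_to_zero)
        return LOWER_GENERIC; /* no attempt to rewrite the loop here */
    if (!li->counter_fits_ctr || li->body_clobbers_ctr)
        return LOWER_GENERIC; /* cannot keep the counter in CTR */
    return LOWER_CTR;         /* candidate for mtctr + bdnz */
}
```

The point of the sketch is that every failure path falls back to
the generic lowering, so a missed opportunity costs performance
but never correctness.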

Does this sound reasonable?

Bye,
Ulrich