[PATCH] [GVN] Eliminate redundant loads whose addresses are dependent on the result of a select instruction.

Wed Mar 25 16:18:59 PDT 2015

I generally have no idea of the tradeoffs in terms of non-PRE/non-GVN/etc
optimizations.
To whit:
GVN will not do anything with selects on it's own.
It does not, on it's own, simplify them
PRE the same.
GVN tends to end up simplifying them simply by virtue of equality
propagation/etc replacing the use in the select. But it's not guaranteed.

I can state definitively that these kinds of optimizations would generally
benefit from a non-select representation.

One could teach opts like GVN/PRE to view selects as fake little CFG's, and
produce a fake CFG that expands the select and have these opts use that
instead.
No idea what kind of overhead this would incur, and they'd have to know it
was fake, because they have to know how to make it back into a select :)

On the other hand, I expect vectorization/et al would do better with
selects.

You could teach these to basically treat simple SESE regions as
instructions. The downside is they also have to transform them back into
selects, which means they have to have the logic to kick all the non-select
instructions back above/below the region :)

On Wed, Mar 25, 2015 at 3:45 PM, Geoff Berry <gberry at codeaurora.org> wrote:

> Not to hijack this discussion, but this is related to something I’ve been
> looking at (and have discussed with Hal and James previously), which is
> select vs. phi canonicalization.
>
>
>
> Currently SimplifyCFG does most of the converting of phis to selects in
> IR.  There is also early and late if-conversion in some backends, but let’s
> ignore that for now.
>
> CodeGenPrepare does the reverse (i.e. select -> phi/branch) for some
> targets and some selects.
>
>
>
> SimplifyCFG uses a target instruction cost model of the speculated
> instructions to decide whether to do phi -> select conversion.  It
> currently uses a cutoff of 2*Basic if I recall correctly.
>
> I’d like to start a discussion on what the “right” canonical
> representation of these operations is over the phases of optimization.  It
> seems to me that keeping branches around longer would be better since it
> would allow global optimizations to only have to worry about phis.  It
> would also create blocks where instructions could be sunk.  For example:
>
>
>
> int selects(int a, int b, int c, int d) {
>
>      int x1, x2, x3;
>
>      x1 = a / b;
>
>      x2 = b / c;
>
>      x3 = c / d;
>
>      if (a < b) {
>
>          x1 = 0;
>
>          x2 = 0;
>
>          x3 = 0;
>
>      }
>
>      return x1 + x2 + x3;
>
>
>
> If we don’t convert the above to selects, then the CodeSinking pass is
> able to sink the divides into an else block that it creates.  (As an aside
> this brings up the question of canonical location of instructions w.r.t.
> sinking/forwarding, which I haven’t seen discussed either).
>
>
>
> I believe the downside to keeping phis/branches in the IR (someone please
> correct me if I’m wrong here), is that we break up the scope of later
> non-global passes.  I don’t have a feel for how big of a problem this is or
> how hard it would be to fix by e.g. extending these passes to work on
> single-entry single-exit regions or even superblocks.
>
>
>
> Any thoughts on this?
>
>
>
> --
>
> Geoff Berry
>
> Employee of Qualcomm Innovation Center, Inc.
>
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux
> Foundation Collaborative Project
>
>
>
> *From:* Daniel Berlin [mailto:dberlin at dberlin.org]
> *Sent:* Wednesday, March 25, 2015 10:25 AM
> *To:* reviews+D8120+public+d2c9388bc8837c4c at reviews.llvm.org
> *Cc:* tilmann.scheller at googlemail.com; Nick Lewycky; David Majnemer; Hal
> Finkel; Philip Reames; James Molloy; mssimpso at codeaurora.org;
> gberry at codeaurora.org; Commit Messages and Patches for LLVM
> *Subject:* Re: [PATCH] [GVN] Eliminate redundant loads whose addresses
> are dependent on the result of a select instruction.
>
>
>
> Out of curiosity, at what phase do you see these redundancies?
>
> GCC used to canonicalize on the CFG version and do select formation late
> specifically to address some of these issues.
>
>
>
> On Wed, Mar 25, 2015 at 7:20 AM, Chad Rosier <mcrosier at codeaurora.org>
> wrote:
>
> I've abandoned this patch (originally abandoned on March 16th).  Overall,
> I saw no significant performance improvements across SPEC2K/SPEC2K6 and
> after further testing there was one significant regression (i.e., 6% on
> spec2000/vpr), which was the workload I was targeting.  While the patch
> does remove the redundant load, that redundancy was replaced by another set
> of redundant instructions, fcmp/csel.  IIRC, csel instructions are bad for
> A57 devices in general.
>
> I agree that a more general solution should be considered and if the new
> PRE pass can enable that solution, I'm in no rush to push this patch
> forward.
>
> Regardless, I do appreciate everyones feedback!
>
>
>
> http://reviews.llvm.org/D8120
>
> EMAIL PREFERENCES
>   http://reviews.llvm.org/settings/panel/emailpreferences/
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150325/7778cf9b/attachment.html>