submit [PATCH] SimplifyCFG for code review

Fri Jun 21 00:40:43 PDT 2013

On 21 June 2013 06:38, Ye, Mei <Mei.Ye at amd.com> wrote:

>  This transformation reduces branches.  It can benefit architectures with
> deep pipelines, less powerful branch predictors, or branch divergence on
> GPUs. I did have a GPU benchmark that shows roughly 1.5X performance
> improvement.
>

That is a big improvement on a specific benchmark on a very narrow category
of targets. I share Evan's concerns, as quite often what's good for GPUs is
not good for CPUs and vice versa.

> But on the other hand, there is probably very few optimizations that can
> benefit all architectures.  And it is also unrealistic to have performance
> measurement on all architectures to justify an optimization item.    What I
> am seeing is that compiler vendors have a tendency to push codes into their
> target space as much as possible, often at the expense of code quality that
> minimizes code-sharing and increases compilation time.
>

Indeed, and it's the job of the maintainers to make sure such changes land
properly. As it stands, I think this patch could do more harm than good, and
you haven't provided enough data to show otherwise.

> There is definitely a need to enable target-specific tuning in
> machine-independent optimizations.  Is there a guide line on a good
> approach to do this?  I have seen some cases that rely on threshold tuning,
> which can be non-deterministic and therefore unreliable.
>

Normally, people add it as a separate pass behind a flag that is disabled by
default, and use it on their specific problems. If more and more people
find it useful, some targets or front-ends can turn it on by default, or
enable it when a particular configuration is detected (e.g. when NEON/SSE
is present and the pipeline has particular characteristics).
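
As a rough illustration of that opt-in scheme, here is a minimal,
self-contained sketch. It does not use LLVM's actual APIs; the names Toggle,
Target, EnablesByDefault and shouldRunPass are hypothetical and exist only
for this example. The assumption is that an explicit user flag always wins,
and that otherwise the pass runs only if the target has opted in.

```cpp
#include <string>

// Hypothetical tri-state for a command-line flag: the user may force the
// pass on or off, or leave the decision to the target.
enum class Toggle { Unset, On, Off };

// Hypothetical target description. EnablesByDefault starts out false for
// every target and is flipped only once a target has shown the pass helps.
struct Target {
  std::string Name;
  bool EnablesByDefault;
};

// Decide whether the optional pass should run.
// An explicit user flag takes priority; otherwise the target's opt-in
// decides, so targets that never opted in see no behaviour change.
bool shouldRunPass(Toggle UserFlag, const Target &T) {
  switch (UserFlag) {
  case Toggle::On:
    return true;
  case Toggle::Off:
    return false;
  case Toggle::Unset:
    break;
  }
  return T.EnablesByDefault;
}
```

With this shape, a generic CPU target that never opted in is unaffected
unless the user passes the flag, while a GPU target can opt in and users can
still switch the pass off explicitly.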

Whatever you add by default has to be proven beneficial on *most*
configurations of *most* targets, not on a single GPU implementation. When
that happens, you normally see an improvement of 1% or less, not 50%, but
what matters more is that there are no regressions in performance. If there
are any, it can't be enabled by default.

cheers,
--renato