submit [PATCH] SimplifyCFG for code review

Ye, Mei Mei.Ye at amd.com
Thu Jun 20 22:38:29 PDT 2013


Hi Evan

This transformation reduces branches.  It can benefit architectures with deep pipelines, less powerful branch predictors, or branch divergence on GPUs.
I did have a GPU benchmark that shows roughly 1.5X performance improvement.

But on the other hand, there is probably very few optimizations that can benefit all architectures.  And it is also unrealistic to have performance measurement on all architectures to justify an optimization item.    What I am seeing is that compiler vendors have a tendency to push codes into their target space as much as possible, often at the expense of code quality that minimizes code-sharing and increases compilation time.

There is definitely a need to enable target-specific tuning in machine-independent optimizations.  Is there a guide line on a good approach to do this?  I have seen some cases that rely on threshold tuning, which can be non-deterministic and therefore unreliable.

-Mei




From: Evan Cheng [mailto:evan.cheng at apple.com]
Sent: Thursday, June 20, 2013 4:51 PM
To: Ye, Mei
Cc: Sean Silva; llvm-commits at cs.uiuc.edu
Subject: Re: submit [PATCH] SimplifyCFG for code review

Hi Mei,

I have concerns about the general approach. It's not clear to me this kind of transformation is generally a win on all modern architectures. As an example from your code:

+/// Before:
+///   ......
+///   %cmp10 = fcmp une float %tmp1, %tmp2
+///   br i1 %cmp1, label %if.then, label %lor.rhs
+///
+/// lor.rhs:
+///   ......
+///   %cmp11 = fcmp une float %tmp3, %tmp4
+///   br i1 %cmp11, label %if.then, label %ifend
+///
+/// if.end:  // the merge block
+///   ......
+///
+/// if.then: // has two predecessors, both of them contains conditional branch.
+///   ......
+///   br label %if.end;
+///
+/// After:
+///  ......
+///  %cmp10 = fcmp une float %tmp1, %tmp2
+///  ......
+///  %cmp11 = fcmp une float %tmp3, %tmp4
+///  %cmp12 = or i1 %cmp10, %cmp11    // parallel-or mode.
+///  br i1 %cmp12, label %if.then, label %ifend
+///
+///  if.end:
+///    ......
+///
+///  if.then:
+///    ......
+///    br label %if.end;

The pre-transformed code is probably going to run faster than the transformed code on a modern Intel x86 processor. On ARM processors, saving the value of conditional registers and or'ing them is probably also quite expensive. I also suspect the transformation will prohibit the target from some if-conversion opportunities.

Have you thought about having targets providing cost information to the pass? Alternatively this may be done as a codegen pass?

Evan


On Jun 20, 2013, at 4:26 PM, "Ye, Mei" <Mei.Ye at amd.com<mailto:Mei.Ye at amd.com>> wrote:


Hi Sean

See the attached new patch, which fixes all issues brought up by you.
I ran "clang-format" on whole files which makes more diffs.  Hopefully this is OK with you.
It will be nice if the check-in process can automatically run "clang-format".

-Mei

From: Sean Silva [mailto:silvas at purdue.edu<http://purdue.edu>]
Sent: Tuesday, June 18, 2013 4:48 PM
To: Ye, Mei
Cc: llvm-commits at cs.uiuc.edu<mailto:llvm-commits at cs.uiuc.edu>
Subject: Re: submit [PATCH] SimplifyCFG for code review



On Tue, Jun 18, 2013 at 4:27 PM, Ye, Mei <Mei.Ye at amd.com<mailto:Mei.Ye at amd.com>> wrote:
Hi Sean

There are some considerations that go into this test sample to create an opportunity for the transformation.
The test case should run *only* the simplifycfg pass (I see now that you have `-O3` in your test case; please remove `-O3`). You can probably just write a simple .ll function with IR that looks like the "Before" case in your documentation comment.

Also,

+; RUN: opt < %s -O3 -simplifycfg -S | grep br | count 2
+%struct.float2_tag = type { float, float }

Please use FileCheck for all new tests. Using grep is deprecated.

-- Sean Silva
<odc_simplifycfg>_______________________________________________
llvm-commits mailing list
llvm-commits at cs.uiuc.edu<mailto:llvm-commits at cs.uiuc.edu>
http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130621/9a43c035/attachment.html>


More information about the llvm-commits mailing list