<div dir="ltr"><br><br><div class="gmail_quote">On Tue, May 5, 2015 at 2:46 PM Hal Finkel <<a href="mailto:hfinkel@anl.gov">hfinkel@anl.gov</a>> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">----- Original Message -----<br>

> From: "Eric Christopher" <<a href="mailto:echristo@gmail.com" target="_blank">echristo@gmail.com</a>><br>

> To: "escha" <<a href="mailto:escha@apple.com" target="_blank">escha@apple.com</a>>, "llvm-commits" <<a href="mailto:llvm-commits@cs.uiuc.edu" target="_blank">llvm-commits@cs.uiuc.edu</a>><br>

> Sent: Monday, May 4, 2015 5:40:50 PM<br>

> Subject: Re: [PATCH/RFC] New TLI option for fast selects<br>

><br>

><br>

><br>

> This seems fairly reasonable; a couple of nits:<br>

><br>

><br>

> a) Routine name: theoretically it should begin with a lower case<br>

> letter. I know it probably doesn't match anything around it then. I<br>

> don't know what we want to do about this, but I wouldn't complain<br>

> much.<br>

><br>

><br>

> b) Argument names: Can you make them a little more descriptive and<br>

> document them?<br>

><br>

><br>

> c) Got an in-tree user where this would be useful?<br>

><br>

<br>

Yes; on the PPC A2, integer selects are fast.<br>

<br></blockquote><div><br></div><div>Sweet. Thanks!</div><div><br></div><div>-eric</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

 -Hal<br>

<br>

><br>

> -eric<br>

><br>

> On Thu, Apr 30, 2015 at 1:47 PM escha < <a href="mailto:escha@apple.com" target="_blank">escha@apple.com</a> > wrote:<br>

><br>

><br>

> There are a number of DAG transforms that are suboptimal on<br>

> architectures with extremely fast selects, e.g. those with actual<br>

> select instructions mapping relatively cleanly to select_cc. An<br>

> example would be integer abs(): on an ideal architecture without a<br>

> select it might be three instructions (cmp, sub, cmov), so the<br>

> three-operation canonicalization likely won’t hurt even in the worst<br>

> case. But on an architecture with a select, we’re going from 2<br>

> instructions to 3, which significantly increases instruction count,<br>

> and it’s difficult to “go back” from the new instruction sequence to<br>

> the select.<br>
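
A minimal sketch of the trade-off described above, assuming the three-operation form is the usual shift/add/xor expansion of abs() (the thread does not spell it out). TargetHooksSketch and its expandSelects flag are hypothetical names standing in for the proposed TLI option; they are not LLVM APIs.<br>

```cpp
#include <cstdint>

// Hypothetical stand-in for the proposed TLI option (not a real LLVM type).
// expandSelects == true  : target prefers select-free arithmetic.
// expandSelects == false : target has fast selects, keep them.
struct TargetHooksSketch {
  bool expandSelects;
};

// Select form: one negate feeding one select (two operations).
int32_t absSelect(int32_t x) {
  uint32_t ux = static_cast<uint32_t>(x);
  uint32_t negated = 0u - ux;  // unsigned negate avoids signed overflow
  return static_cast<int32_t>(x < 0 ? negated : ux);
}

// Select-free form: shift, add, xor (three operations).
// Assumes >> on a negative int32_t is an arithmetic shift, which is
// guaranteed from C++20 and is what mainstream compilers do before that.
int32_t absBitwise(int32_t x) {
  uint32_t ux = static_cast<uint32_t>(x);
  uint32_t mask = static_cast<uint32_t>(x >> 31);  // all ones if x < 0, else 0
  return static_cast<int32_t>((ux + mask) ^ mask);
}

// Gate the expansion on the hook, mirroring the check the patch adds to
// transforms that create more nodes than they consume.
int32_t lowerAbs(int32_t x, const TargetHooksSketch &target) {
  return target.expandSelects ? absBitwise(x) : absSelect(x);
}
```

On a target with a real select instruction the flag would be false, so the two-operation absSelect form is kept rather than expanded.<br>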

><br>

> I’m not sure this patch actually caught all of them; there might be<br>

> others, since I didn’t check them all. My logic here was to put a<br>

> check on every transform which creates more nodes than it consumes<br>

> in order to eliminate a select. On an out-of-tree target this saves<br>

> a number of instructions (with no regressions on any test) by making<br>

> the included TLI hook return “false” for that target.<br>

><br>

> This could also open up more optimizations in the future that assume<br>

> selects are fast, i.e. that one select DAG node is roughly equivalent to<br>

> one real instruction. I wonder if any of the in-tree GPU backends<br>

> would find something like this useful?<br>

><br>

> Any thoughts on the implementation?<br>

><br>

> — escha<br>

><br>

><br>

> _______________________________________________<br>

> llvm-commits mailing list<br>

> <a href="mailto:llvm-commits@cs.uiuc.edu" target="_blank">llvm-commits@cs.uiuc.edu</a><br>

> <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits</a><br>

><br>

><br>

<br>

--<br>

Hal Finkel<br>

Assistant Computational Scientist<br>

Leadership Computing Facility<br>

Argonne National Laboratory<br>

</blockquote></div></div>