<p dir="ltr">Hi Jim,<br>

Thanks for the feedback!</p>

<p dir="ltr">I'll try to move these changes to the backend.</p>

<p dir="ltr">Cheers,<br>

Andrea<br>

</p>

<div class="gmail_quote">On 8 May 2014 03:18, "Jim Grosbach" &lt;<a href="mailto:grosbach@apple.com">grosbach@apple.com</a>&gt; wrote:<br type="attribution"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

Hi Andrea,<br>

<br>

I’m really excited to see these patches continuing. Our vector codegen has been needing exactly this sort of detail-oriented tuning for a long time now.<br>

<br>

These are both good improvements, but would be better as DAGCombines in the X86 backend. The main argument for doing these intrinsic combines at the IR level is when the input expression is likely to be split across multiple basic blocks by the time the backend sees it and would thus not be recognized by a DAG combiner. Both of these transforms should avoid that problem, though, and so can be dealt with there.<br>


<br>

-Jim<br>

<br>

On May 7, 2014, at 8:42 AM, Andrea Di Biagio &lt;<a href="mailto:andrea.dibiagio@gmail.com">andrea.dibiagio@gmail.com</a>&gt; wrote:<br>

<br>

> Hi,<br>

><br>

> This patch teaches InstCombine how to fold a packed SSE2/AVX2 shift<br>

> intrinsic into its first operand if the shift count is a zero vector<br>

> (i.e. a 'ConstantAggregateZero').<br>

> Also, this patch teaches InstCombine how to lower a packed arithmetic<br>

> shift intrinsic into an 'ashr' instruction if the shift count is<br>

> known to be smaller than the vector element size.<br>

><br>

> Please let me know if ok to submit.<br>

><br>

> Thanks,<br>

> Andrea Di Biagio<br>

> SN Systems - Sony Computer Entertainment Group<br>

> &lt;patch-instcombine-vshifts.diff&gt;<br>

<br>

</blockquote></div>
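
<p dir="ltr">A minimal sketch of why the two folds in the quoted patch description hold, written as a scalar model rather than LLVM code. It assumes 16-bit lanes and a single scalar shift count, which simplifies how the real intrinsics take their count operand. The function names <code>packed_sra</code> and <code>generic_ashr</code> are hypothetical and exist only for this illustration. The model relies on the documented SSE2 behaviour that an arithmetic shift count above 15 (for words) fills each lane with its sign bit, whereas an IR-level 'ashr' is only defined for counts below the element width.</p>

```python
ELEMENT_BITS = 16  # assumed lane width for this illustration (word lanes)


def to_signed(value, bits=ELEMENT_BITS):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value %= 2 ** bits
    return value - 2 ** bits if value >= 2 ** (bits - 1) else value


def packed_sra(lanes, count, bits=ELEMENT_BITS):
    """Hypothetical model of a packed arithmetic right shift by one count.

    A count above bits - 1 fills each lane with its sign bit, which gives
    the same result as shifting by bits - 1.
    """
    effective = min(count, bits - 1)
    return [to_signed(lane, bits) >> effective for lane in lanes]


def generic_ashr(lanes, count, bits=ELEMENT_BITS):
    """Hypothetical model of a generic 'ashr': defined only below `bits`."""
    if count >= bits:
        raise ValueError("shift count out of range: result is not defined")
    return [to_signed(lane, bits) >> count for lane in lanes]


if __name__ == "__main__":
    lanes = [-32768, -1, 0, 1, 12345]
    # Fold 1: a zero shift count leaves the first operand unchanged.
    print(packed_sra(lanes, 0) == lanes)
    # Fold 2: for every in-range count, both models agree lane by lane.
    print(all(packed_sra(lanes, c) == generic_ashr(lanes, c)
              for c in range(ELEMENT_BITS)))
```

<p dir="ltr">The two models diverge only once the count reaches the element width, which is why the second fold is restricted to shift counts known to be smaller than the vector element size.</p>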