[PATCH] - Improve widening of 3 element binary vector operations that don't trap

Mon Aug 19 14:23:28 PDT 2013

On 2013-08-19 3:04 PM, "Nadav Rotem" <nrotem at apple.com> wrote:

>
>On Aug 19, 2013, at 11:30 AM, Redmond, Paul <paul.redmond at intel.com>
>wrote:
>
>> Hi Nadav,
>> 
>> On 2013-08-19 1:00 PM, "Nadav Rotem" <nrotem at apple.com> wrote:
>> 
>>> Hi Paul, 
>>> 
>>> This patch looks good, but I am a little worried about denormals.  With
>>> this patch we will execute vector operations on garbage at the fourth
>>> vector element.  One possible solution would be to mask out the last
>>> element. Does that sound right to you?
>> 
>> That's a good question. I don't have a good answer right now. What about
>> handling the floating point operations in WidenVecRes_BinaryCanTrap for
>> now? (only the integral types are widened)
>> 
>
>I am okay with this solution.

I've committed this in r188699.

However, I just looked at this code again, and it seems the denormal
problem existed before this patch (or perhaps it isn't a problem?). For
example, consider a v3f32 add on x86: in WidenVecRes_BinaryCanTrap, WidenVT
will be v4f32 and the operation will be widened (since canOpTrap() is
false). There seems to be an assumption that the unused elements are
already masked out.
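
To make the concern concrete, here is a minimal standalone C++ sketch. It
is not LLVM code and does not reflect how the legalizer is implemented:
V4, addV4, maskUnusedLanes and unusedLaneIsSubnormal are hypothetical
names, and a std::array of four floats merely stands in for a v4f32
register holding a widened v3f32 value. It shows that the three live lanes
of the result do not depend on what is in the fourth lane, but that the
fourth lane may still feed a denormal into the add unless it is zeroed
first.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>

// Stand-in for a v4f32 register that holds a widened v3f32 value.
using V4 = std::array<float, 4>;

// Hypothetical helper: zero every lane at index >= used, modelling the
// "mask out the last element" idea from the thread.
V4 maskUnusedLanes(V4 v, std::size_t used) {
  assert(used <= v.size() && "cannot use more lanes than the vector has");
  for (std::size_t i = used; i < v.size(); ++i)
    v[i] = 0.0f;
  return v;
}

// Lane-wise add over all four lanes, standing in for one v4f32 add.
V4 addV4(const V4 &a, const V4 &b) {
  V4 r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = a[i] + b[i];
  return r;
}

// True if any lane at index >= used currently holds a subnormal value.
bool unusedLaneIsSubnormal(const V4 &v, std::size_t used) {
  for (std::size_t i = used; i < v.size(); ++i)
    if (std::fpclassify(v[i]) == FP_SUBNORMAL)
      return true;
  return false;
}
```

With a denormal left in lane 3, addV4 still produces the right values in
lanes 0 to 2, so the question is only whether operating on that leftover
lane has a cost (on some hardware denormal operands are slower than normal
ones). Calling maskUnusedLanes(v, 3) on both operands first removes the
denormal input at the price of an extra operation per operand.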

Thoughts?

paul

>
>Thanks,
>Nadav