[llvm] r227983 - Fix program crashes due to alignment exceptions generated for SSE memop instructions (PR22371).

Thu Feb 5 09:12:45 PST 2015

I'm still trying to figure out whether this is safe to merge for 3.6.

On Thu, Feb 5, 2015 at 8:03 AM, Sanjay Patel <spatel at rotateright.com> wrote:
> The intent of r227983 is simply to revert the wrong behavior introduced by
> r224330 for *SSE* codegen.
> If r224330 was supposed to change behavior of AVX512, it should have
> included test cases to validate that, right? These should be separate
> patches.
>
>> You removed FeatureVectorUAMem form HSW, but you did not add
>> FeatureSSEUnalignedMem
>
> Correct - AFAIK, Haswell doesn't have this feature. Perhaps we need to
> change 'memop' to 'sse_memop' to make it clearer that memops are only for
> use with SSE...or just modify the 'load' defs to include this predicate
> somehow:
>
> // Like 'load', but uses special alignment checks suitable for use in
> // memory operands in most SSE instructions, which are required to
> // be naturally aligned on some targets but not on others.  If the subtarget
> // allows unaligned accesses, match any load, though this may require
> // setting a feature bit in the processor (on startup, for example).
> // Opteron 10h and later implement such a feature.
> def memop : PatFrag<(ops node:$ptr), (load node:$ptr), [{
>   return    Subtarget->hasSSEUnalignedMem()
>          || cast<LoadSDNode>(N)->getAlignment() >= 16;
> }]>;
>
>
>
>
> On Thu, Feb 5, 2015 at 1:11 AM, Demikhovsky, Elena
> <elena.demikhovsky at intel.com> wrote:
>>
>> It can use load, I agree.
>>
>> But AVX512 code should be changed before committing the patch.
>>
>>
>>
>> -           Elena
>>
>>
>>
>> From: Craig Topper [mailto:craig.topper at gmail.com]
>> Sent: Thursday, February 05, 2015 10:04
>>
>>
>> To: Demikhovsky, Elena
>> Cc: Sanjay Patel; llvm-commits
>> Subject: Re: [llvm] r227983 - Fix program crashes due to alignment
>> exceptions generated for SSE memop instructions (PR22371).
>>
>>
>>
>> Why is AVX512 not using load?
>>
>>
>>
>> On Wed, Feb 4, 2015 at 11:59 PM, Demikhovsky, Elena
>> <elena.demikhovsky at intel.com> wrote:
>>
>> We use it in AVX512.
>>
>>
>>
>> -           Elena
>>
>>
>>
>> From: Craig Topper [mailto:craig.topper at gmail.com]
>> Sent: Thursday, February 05, 2015 09:52
>> To: Demikhovsky, Elena
>> Cc: Sanjay Patel; llvm-commits
>>
>>
>> Subject: Re: [llvm] r227983 - Fix program crashes due to alignment
>> exceptions generated for SSE memop instructions (PR22371).
>>
>>
>>
>> I thought we had converted all of AVX/AVX2 to use 'loadXXX' instead of
>> 'memopXXX'?
>>
>>
>>
>> On Wed, Feb 4, 2015 at 11:44 PM, Demikhovsky, Elena
>> <elena.demikhovsky at intel.com> wrote:
>>
>> Hi,
>> You removed FeatureVectorUAMem form HSW, but you did not add
>> FeatureSSEUnalignedMem
>>                                       FeatureAVX2,
>>                                       FeatureCMPXCHG16B,
>>                                       FeatureFastUAMem,
>> -                                     FeatureVectorUAMem,
>>                                       FeaturePOPCNT,
>>                                       FeatureAES,
>>                                       FeaturePCLMUL,
>>
>> How memop should work now on AVX/AVX2? Will it require alignment now?
>>
>> def memop : PatFrag<(ops node:$ptr), (load node:$ptr), [{
>> -  return    Subtarget->hasVectorUAMem()
>> +  return    Subtarget->hasSSEUnalignedMem()
>>           || cast<LoadSDNode>(N)->getAlignment() >= 16;
>>  }]>;
>>
>> -  Elena
>>
>>
>> -----Original Message-----
>> From: Kuperstein, Michael M
>> Sent: Wednesday, February 04, 2015 21:41
>> To: Hans Wennborg; Nadav Rotem; Demikhovsky, Elena
>> Cc: Sanjay Patel; llvm-commits
>> Subject: RE: [llvm] r227983 - Fix program crashes due to alignment
>> exceptions generated for SSE memop instructions (PR22371).
>>
>> I understand this on roughly the same level as Nadav does (looks correct,
>> and should be safer than the current situation), but Elena should be able to
>> give an authoritative answer.
>> Elena, could you comment?
>>
>> Michael
>>
>> -----Original Message-----
>> From: hwennborg at google.com [mailto:hwennborg at google.com] On Behalf Of Hans
>> Wennborg
>> Sent: Wednesday, February 04, 2015 20:53
>> To: Nadav Rotem
>> Cc: Sanjay Patel; llvm-commits; Demikhovsky, Elena; Kuperstein, Michael M
>> Subject: Re: [llvm] r227983 - Fix program crashes due to alignment
>> exceptions generated for SSE memop instructions (PR22371).
>>
>> Elena, Michael: is this safe for merging to 3.6?
>>
>> As far as I understand it's basically a revert or r224330.
>>
>> On Tue, Feb 3, 2015 at 3:44 PM, Nadav Rotem <nrotem at apple.com> wrote:
>> > The change looks good to me, but I would wait a day and let Elena or
>> > Michael review it.
>> >
>> > On Feb 3, 2015, at 3:19 PM, Hans Wennborg <hans at chromium.org> wrote:
>> >
>> > On Tue, Feb 3, 2015 at 9:13 AM, Sanjay Patel <spatel at rotateright.com>
>> > wrote:
>> >
>> > Author: spatel
>> > Date: Tue Feb  3 11:13:04 2015
>> > New Revision: 227983
>> >
>> > URL: http://llvm.org/viewvc/llvm-project?rev=227983&view=rev
>> > Log:
>> > Fix program crashes due to alignment exceptions generated for SSE
>> > memop instructions (PR22371).
>> >
>> > r224330 introduced a bug by misinterpreting the "FeatureVectorUAMem"
>> > bit.
>> > The commit log says that change did not affect anything, but that's
>> > not correct.
>> > That change allowed SSE instructions to have unaligned mem operands
>> > folded into math ops, and that's not allowed in the default
>> > specification for any SSE variant.
>> >
>> > The bug is exposed when compiling for an AVX-capable CPU that had this
>> > feature flag but without enabling AVX codegen. Another mistake in
>> > r224330 was not adding the feature flag to all AVX CPUs; the AMD chips
>> > were excluded.
>> >
>> > This is part of the fix for PR22371 (
>> > http://llvm.org/bugs/show_bug.cgi?id=22371 ).
>> >
>> > This feature bit is SSE-specific, so I've renamed it to
>> > "FeatureSSEUnalignedMem".
>> > Changed the existing test case for the feature bit to reflect the new
>> > name and renamed the test file itself to better reflect the feature.
>> > Added runs to fold-vex.ll to check for the failing codegen.
>> >
>> > Note that the feature bit is not set by default on any CPU because it
>> > may require a configuration register setting to enable the enhanced
>> > unaligned behavior.
>> >
>> >
>> > Sanjay requested in [1] that we should merge this to the 3.6 branch.
>> > I'm not qualified to determine if this is safe or not. Is there somone
>> > else with X86 expertise that would like to chime in? Nadav, you're
>> > listed as the code owner here.
>> >
>> > Thanks,
>> > Hans
>> >
>> > 1. http://llvm.org/bugs/show_bug.cgi?id=22374#c3
>> >
>> >
>> ---------------------------------------------------------------------
>> Intel Israel (74) Limited
>>
>> This e-mail and any attachments may contain confidential material for
>> the sole use of the intended recipient(s). Any review or distribution
>> by others is strictly prohibited. If you are not the intended
>> recipient, please contact the sender and delete all copies.
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>
>>
>>
>>
>>
>> --
>>
>> ~Craig
>>
>> ---------------------------------------------------------------------
>> Intel Israel (74) Limited
>>
>> This e-mail and any attachments may contain confidential material for
>> the sole use of the intended recipient(s). Any review or distribution
>> by others is strictly prohibited. If you are not the intended
>> recipient, please contact the sender and delete all copies.
>>
>>
>>
>>
>>
>> --
>>
>> ~Craig
>>
>> ---------------------------------------------------------------------
>> Intel Israel (74) Limited
>>
>> This e-mail and any attachments may contain confidential material for
>> the sole use of the intended recipient(s). Any review or distribution
>> by others is strictly prohibited. If you are not the intended
>> recipient, please contact the sender and delete all copies.
>
>