[PATCH] D17414: Fix incorrect selection of AVX512 sqrt when OptForSize is on

Mehdi Amini via llvm-commits llvm-commits at lists.llvm.org
Fri Feb 19 11:25:04 PST 2016


> On Feb 19, 2016, at 11:19 AM, Dimitry Andric <dimitry at andric.com> wrote:
> 
> For this specific case, I would like to get a minimal fix in now, so it can hopefully be merged to the 3.8 branch before it freezes.

The fix makes sense separately.

> 
> But indeed, a more structural fix should probably be considered for trunk.  Is there actually any formal (or semi-formal) definition of the .td 'language' somewhere? :)

Pretty rough but still:

http://llvm.org/docs/TableGen/index.html
http://llvm.org/docs/TableGen/LangIntro.html
http://llvm.org/docs/TableGen/LangRef.html


-- 
Mehdi


> 
> -Dimitry
> 
>> On 19 Feb 2016, at 19:21, Jim Grosbach <grosbach at apple.com <mailto:grosbach at apple.com>> wrote:
>> 
>> That makes sense, but is really, really fragile in that it’s depending on the order of evaluation of the Requires<> vs. the enclosing “let Predicates = “. I don’t believe that’s required to be one order or another.
>> 
>> Also, yes, I believe the Predicates should be additive, not a replacement, by default. There should be a way to override and replace, but that’s the uncommon case.
>> 
>>> On Feb 19, 2016, at 10:19 AM, Mehdi Amini <mehdi.amini at apple.com <mailto:mehdi.amini at apple.com>> wrote:
>>> 
>>> So from what I observed (I may be wrong), the other pattern are inlined like that:
>>> 
>>> let Predicates = [HasAVX512] in {
>>>   def : Pat<(f32 (X86frsqrt (load addr:$src))),
>>>             (COPY_TO_REGCLASS (VRSQRT14SSrm (v4f32 (IMPLICIT_DEF)), addr:$src), VR128X)>,
>>>             Requires<[OptForSize]>;
>>> 
>>> The inner Requires<[OptForSize] is *ignored*. This is why I consider it as a misoptimization and not a correctness bug. They need to be fixed, but it should be a separate patch (and test case).
>>> 
>>> 
>>> The bug that this patch is fixing is different, there was *no* enclosing scope with a predicate when declaring:
>>> 
>>> defm VSQRT   : avx512_sqrt_scalar_all<0x51, "vsqrt">, VEX_LIG;
>>> 
>>> It refers to a multiclass "avx512_sqrt_scalar_all" which itself refers to "avx512_sqrt_scalar", which contains this pattern:
>>> 
>>>   def : Pat<(_.EltVT (OpNode (load addr:$src))),
>>>             (!cast<Instruction>(NAME#SUFF#Zm)
>>>                 (_.EltVT (IMPLICIT_DEF)), addr:$src)>, Requires<[OptForSize]>;
>>> 
>>> 
>>> Here the OptForSize predicate is taken into account, but there is no other restriction.
>>> It is only an issue for this pattern, the instruction themselves in this multiclass are ultimately defined by the "class AVX512" which has "Requires<[HasAVX512]>".
>>> 
>>> Now it seems to me that a pattern that tries to generate an instruction should "inherit" the restriction of this instruction, that would prevent such bugs.
>>> 
>>> Hope it makes sense.
>>> 
>>> -- 
>>> Mehdi
>>> 
>>> 
>>> 
>>> 
>>> 
>>>> On Feb 19, 2016, at 10:06 AM, Jim Grosbach <grosbach at apple.com <mailto:grosbach at apple.com>> wrote:
>>>> 
>>>> If they’re overwriting the Predicates like this one was, they could have the same problem.
>>>> 
>>>>> On Feb 19, 2016, at 9:09 AM, Roman Divacky <rdivacky at vlakno.cz <mailto:rdivacky at vlakno.cz>> wrote:
>>>>> 
>>>>> What about the other cases of Requires[OptForSize] in the same file?
>>>>> 
>>>>> On Thu, Feb 18, 2016 at 10:11:59PM +0000, Dimitry Andric via llvm-commits wrote:
>>>>>> dim created this revision.
>>>>>> dim added reviewers: joker.eph, grosbach, resistor.
>>>>>> dim added a subscriber: llvm-commits.
>>>>>> 
>>>>>> When optimizing for size, sqrt calls can be incorrectly selected as
>>>>>> AVX512 VSQRT instructions.  This is because X86InstrAVX512.td has a
>>>>>> `Requires<[OptForSize]>` in its `avx512_sqrt_scalar` multiclass
>>>>>> definition.  Even if the target does not support AVX512, the class can
>>>>>> apparently still be chosen, leading to an incorrect selection of
>>>>>> `vsqrtss`.
>>>>>> 
>>>>>> In PR26625, this lead to an assertion: Reg >= X86::FP0 && Reg <=
>>>>>> X86::FP6 && "Expected FP register!", because the `vsqrtss` instruction
>>>>>> requires an XMM register, which is not available on i686 CPUs.
>>>>>> 
>>>>>> http://reviews.llvm.org/D17414 <http://reviews.llvm.org/D17414>
>>>>>> 
>>>>>> Files:
>>>>>>  lib/Target/X86/X86InstrAVX512.td
>>>>>>  test/CodeGen/X86/pr26625.ll
>>>>>> 
>>>>>> Index: test/CodeGen/X86/pr26625.ll
>>>>>> ===================================================================
>>>>>> --- /dev/null
>>>>>> +++ test/CodeGen/X86/pr26625.ll
>>>>>> @@ -0,0 +1,20 @@
>>>>>> +; RUN: llc < %s -mcpu=i686 2>&1 | FileCheck %s
>>>>>> +; PR26625
>>>>>> +
>>>>>> +target datalayout = "e-m:e-p:32:32-f64:32:64-f80:32-n8:16:32-S128"
>>>>>> +target triple = "i386"
>>>>>> +
>>>>>> +define float @x0(float %f) #0 {
>>>>>> +entry:
>>>>>> +  %call = tail call float @sqrtf(float %f) #1
>>>>>> +  ret float %call
>>>>>> +; CHECK-LABEL: x0:
>>>>>> +; CHECK: flds
>>>>>> +; CHECK-NEXT: fsqrt
>>>>>> +; CHECK-NOT: vsqrtss
>>>>>> +}
>>>>>> +
>>>>>> +declare float @sqrtf(float) #0
>>>>>> +
>>>>>> +attributes #0 = { nounwind optsize readnone }
>>>>>> +attributes #1 = { nounwind optsize readnone }
>>>>>> Index: lib/Target/X86/X86InstrAVX512.td
>>>>>> ===================================================================
>>>>>> --- lib/Target/X86/X86InstrAVX512.td
>>>>>> +++ lib/Target/X86/X86InstrAVX512.td
>>>>>> @@ -6005,7 +6005,7 @@
>>>>>> 
>>>>>>   def : Pat<(_.EltVT (OpNode (load addr:$src))),
>>>>>>             (!cast<Instruction>(NAME#SUFF#Zm)
>>>>>> -                (_.EltVT (IMPLICIT_DEF)), addr:$src)>, Requires<[OptForSize]>;
>>>>>> +                (_.EltVT (IMPLICIT_DEF)), addr:$src)>, Requires<[HasAVX512, OptForSize]>;
>>>>>> }
>>>>>> 
>>>>>> multiclass avx512_sqrt_scalar_all<bits<8> opc, string OpcodeStr> {
>>>>>> 
>>>>>> 
>>>>> 
>>>>>> Index: test/CodeGen/X86/pr26625.ll
>>>>>> ===================================================================
>>>>>> --- /dev/null
>>>>>> +++ test/CodeGen/X86/pr26625.ll
>>>>>> @@ -0,0 +1,20 @@
>>>>>> +; RUN: llc < %s -mcpu=i686 2>&1 | FileCheck %s
>>>>>> +; PR26625
>>>>>> +
>>>>>> +target datalayout = "e-m:e-p:32:32-f64:32:64-f80:32-n8:16:32-S128"
>>>>>> +target triple = "i386"
>>>>>> +
>>>>>> +define float @x0(float %f) #0 {
>>>>>> +entry:
>>>>>> +  %call = tail call float @sqrtf(float %f) #1
>>>>>> +  ret float %call
>>>>>> +; CHECK-LABEL: x0:
>>>>>> +; CHECK: flds
>>>>>> +; CHECK-NEXT: fsqrt
>>>>>> +; CHECK-NOT: vsqrtss
>>>>>> +}
>>>>>> +
>>>>>> +declare float @sqrtf(float) #0
>>>>>> +
>>>>>> +attributes #0 = { nounwind optsize readnone }
>>>>>> +attributes #1 = { nounwind optsize readnone }
>>>>>> Index: lib/Target/X86/X86InstrAVX512.td
>>>>>> ===================================================================
>>>>>> --- lib/Target/X86/X86InstrAVX512.td
>>>>>> +++ lib/Target/X86/X86InstrAVX512.td
>>>>>> @@ -6005,7 +6005,7 @@
>>>>>> 
>>>>>>   def : Pat<(_.EltVT (OpNode (load addr:$src))),
>>>>>>             (!cast<Instruction>(NAME#SUFF#Zm)
>>>>>> -                (_.EltVT (IMPLICIT_DEF)), addr:$src)>, Requires<[OptForSize]>;
>>>>>> +                (_.EltVT (IMPLICIT_DEF)), addr:$src)>, Requires<[HasAVX512, OptForSize]>;
>>>>>> }
>>>>>> 
>>>>>> multiclass avx512_sqrt_scalar_all<bits<8> opc, string OpcodeStr> {
>>>>> 
>>>>>> _______________________________________________
>>>>>> llvm-commits mailing list
>>>>>> llvm-commits at lists.llvm.org <mailto:llvm-commits at lists.llvm.org>
>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits>
>>> 
>> 
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160219/bcd73d8b/attachment.html>


More information about the llvm-commits mailing list