[PATCH] D17414: Fix incorrect selection of AVX512 sqrt when OptForSize is on

Jim Grosbach via llvm-commits llvm-commits at lists.llvm.org
Fri Feb 19 10:21:52 PST 2016


That makes sense, but is really, really fragile in that it’s depending on the order of evaluation of the Requires<> vs. the enclosing “let Predicates = “. I don’t believe that’s required to be one order or another.

Also, yes, I believe the Predicates should be additive, not a replacement, by default. There should be a way to override and replace, but that’s the uncommon case.

> On Feb 19, 2016, at 10:19 AM, Mehdi Amini <mehdi.amini at apple.com> wrote:
> 
> So from what I observed (I may be wrong), the other pattern are inlined like that:
> 
> let Predicates = [HasAVX512] in {
>   def : Pat<(f32 (X86frsqrt (load addr:$src))),
>             (COPY_TO_REGCLASS (VRSQRT14SSrm (v4f32 (IMPLICIT_DEF)), addr:$src), VR128X)>,
>             Requires<[OptForSize]>;
> 
> The inner Requires<[OptForSize] is *ignored*. This is why I consider it as a misoptimization and not a correctness bug. They need to be fixed, but it should be a separate patch (and test case).
> 
> 
> The bug that this patch is fixing is different, there was *no* enclosing scope with a predicate when declaring:
> 
> defm VSQRT   : avx512_sqrt_scalar_all<0x51, "vsqrt">, VEX_LIG;
> 
> It refers to a multiclass "avx512_sqrt_scalar_all" which itself refers to "avx512_sqrt_scalar", which contains this pattern:
> 
>   def : Pat<(_.EltVT (OpNode (load addr:$src))),
>             (!cast<Instruction>(NAME#SUFF#Zm)
>                 (_.EltVT (IMPLICIT_DEF)), addr:$src)>, Requires<[OptForSize]>;
> 
> 
> Here the OptForSize predicate is taken into account, but there is no other restriction.
> It is only an issue for this pattern, the instruction themselves in this multiclass are ultimately defined by the "class AVX512" which has "Requires<[HasAVX512]>".
> 
> Now it seems to me that a pattern that tries to generate an instruction should "inherit" the restriction of this instruction, that would prevent such bugs.
> 
> Hope it makes sense.
> 
> -- 
> Mehdi
> 
> 
> 
> 
> 
>> On Feb 19, 2016, at 10:06 AM, Jim Grosbach <grosbach at apple.com <mailto:grosbach at apple.com>> wrote:
>> 
>> If they’re overwriting the Predicates like this one was, they could have the same problem.
>> 
>>> On Feb 19, 2016, at 9:09 AM, Roman Divacky <rdivacky at vlakno.cz <mailto:rdivacky at vlakno.cz>> wrote:
>>> 
>>> What about the other cases of Requires[OptForSize] in the same file?
>>> 
>>> On Thu, Feb 18, 2016 at 10:11:59PM +0000, Dimitry Andric via llvm-commits wrote:
>>>> dim created this revision.
>>>> dim added reviewers: joker.eph, grosbach, resistor.
>>>> dim added a subscriber: llvm-commits.
>>>> 
>>>> When optimizing for size, sqrt calls can be incorrectly selected as
>>>> AVX512 VSQRT instructions.  This is because X86InstrAVX512.td has a
>>>> `Requires<[OptForSize]>` in its `avx512_sqrt_scalar` multiclass
>>>> definition.  Even if the target does not support AVX512, the class can
>>>> apparently still be chosen, leading to an incorrect selection of
>>>> `vsqrtss`.
>>>> 
>>>> In PR26625, this lead to an assertion: Reg >= X86::FP0 && Reg <=
>>>> X86::FP6 && "Expected FP register!", because the `vsqrtss` instruction
>>>> requires an XMM register, which is not available on i686 CPUs.
>>>> 
>>>> http://reviews.llvm.org/D17414 <http://reviews.llvm.org/D17414>
>>>> 
>>>> Files:
>>>>  lib/Target/X86/X86InstrAVX512.td
>>>>  test/CodeGen/X86/pr26625.ll
>>>> 
>>>> Index: test/CodeGen/X86/pr26625.ll
>>>> ===================================================================
>>>> --- /dev/null
>>>> +++ test/CodeGen/X86/pr26625.ll
>>>> @@ -0,0 +1,20 @@
>>>> +; RUN: llc < %s -mcpu=i686 2>&1 | FileCheck %s
>>>> +; PR26625
>>>> +
>>>> +target datalayout = "e-m:e-p:32:32-f64:32:64-f80:32-n8:16:32-S128"
>>>> +target triple = "i386"
>>>> +
>>>> +define float @x0(float %f) #0 {
>>>> +entry:
>>>> +  %call = tail call float @sqrtf(float %f) #1
>>>> +  ret float %call
>>>> +; CHECK-LABEL: x0:
>>>> +; CHECK: flds
>>>> +; CHECK-NEXT: fsqrt
>>>> +; CHECK-NOT: vsqrtss
>>>> +}
>>>> +
>>>> +declare float @sqrtf(float) #0
>>>> +
>>>> +attributes #0 = { nounwind optsize readnone }
>>>> +attributes #1 = { nounwind optsize readnone }
>>>> Index: lib/Target/X86/X86InstrAVX512.td
>>>> ===================================================================
>>>> --- lib/Target/X86/X86InstrAVX512.td
>>>> +++ lib/Target/X86/X86InstrAVX512.td
>>>> @@ -6005,7 +6005,7 @@
>>>> 
>>>>   def : Pat<(_.EltVT (OpNode (load addr:$src))),
>>>>             (!cast<Instruction>(NAME#SUFF#Zm)
>>>> -                (_.EltVT (IMPLICIT_DEF)), addr:$src)>, Requires<[OptForSize]>;
>>>> +                (_.EltVT (IMPLICIT_DEF)), addr:$src)>, Requires<[HasAVX512, OptForSize]>;
>>>> }
>>>> 
>>>> multiclass avx512_sqrt_scalar_all<bits<8> opc, string OpcodeStr> {
>>>> 
>>>> 
>>> 
>>>> Index: test/CodeGen/X86/pr26625.ll
>>>> ===================================================================
>>>> --- /dev/null
>>>> +++ test/CodeGen/X86/pr26625.ll
>>>> @@ -0,0 +1,20 @@
>>>> +; RUN: llc < %s -mcpu=i686 2>&1 | FileCheck %s
>>>> +; PR26625
>>>> +
>>>> +target datalayout = "e-m:e-p:32:32-f64:32:64-f80:32-n8:16:32-S128"
>>>> +target triple = "i386"
>>>> +
>>>> +define float @x0(float %f) #0 {
>>>> +entry:
>>>> +  %call = tail call float @sqrtf(float %f) #1
>>>> +  ret float %call
>>>> +; CHECK-LABEL: x0:
>>>> +; CHECK: flds
>>>> +; CHECK-NEXT: fsqrt
>>>> +; CHECK-NOT: vsqrtss
>>>> +}
>>>> +
>>>> +declare float @sqrtf(float) #0
>>>> +
>>>> +attributes #0 = { nounwind optsize readnone }
>>>> +attributes #1 = { nounwind optsize readnone }
>>>> Index: lib/Target/X86/X86InstrAVX512.td
>>>> ===================================================================
>>>> --- lib/Target/X86/X86InstrAVX512.td
>>>> +++ lib/Target/X86/X86InstrAVX512.td
>>>> @@ -6005,7 +6005,7 @@
>>>> 
>>>>   def : Pat<(_.EltVT (OpNode (load addr:$src))),
>>>>             (!cast<Instruction>(NAME#SUFF#Zm)
>>>> -                (_.EltVT (IMPLICIT_DEF)), addr:$src)>, Requires<[OptForSize]>;
>>>> +                (_.EltVT (IMPLICIT_DEF)), addr:$src)>, Requires<[HasAVX512, OptForSize]>;
>>>> }
>>>> 
>>>> multiclass avx512_sqrt_scalar_all<bits<8> opc, string OpcodeStr> {
>>> 
>>>> _______________________________________________
>>>> llvm-commits mailing list
>>>> llvm-commits at lists.llvm.org <mailto:llvm-commits at lists.llvm.org>
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits>
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160219/4a1723da/attachment.html>


More information about the llvm-commits mailing list