[PATCH] D17414: Fix incorrect selection of AVX512 sqrt when OptForSize is on

Mehdi Amini via llvm-commits llvm-commits at lists.llvm.org
Fri Feb 19 10:19:08 PST 2016


So from what I observed (I may be wrong), the other pattern are inlined like that:

let Predicates = [HasAVX512] in {
  def : Pat<(f32 (X86frsqrt (load addr:$src))),
            (COPY_TO_REGCLASS (VRSQRT14SSrm (v4f32 (IMPLICIT_DEF)), addr:$src), VR128X)>,
            Requires<[OptForSize]>;

The inner Requires<[OptForSize] is *ignored*. This is why I consider it as a misoptimization and not a correctness bug. They need to be fixed, but it should be a separate patch (and test case).


The bug that this patch is fixing is different, there was *no* enclosing scope with a predicate when declaring:

defm VSQRT   : avx512_sqrt_scalar_all<0x51, "vsqrt">, VEX_LIG;

It refers to a multiclass "avx512_sqrt_scalar_all" which itself refers to "avx512_sqrt_scalar", which contains this pattern:

  def : Pat<(_.EltVT (OpNode (load addr:$src))),
            (!cast<Instruction>(NAME#SUFF#Zm)
                (_.EltVT (IMPLICIT_DEF)), addr:$src)>, Requires<[OptForSize]>;


Here the OptForSize predicate is taken into account, but there is no other restriction.
It is only an issue for this pattern, the instruction themselves in this multiclass are ultimately defined by the "class AVX512" which has "Requires<[HasAVX512]>".

Now it seems to me that a pattern that tries to generate an instruction should "inherit" the restriction of this instruction, that would prevent such bugs.

Hope it makes sense.

-- 
Mehdi





> On Feb 19, 2016, at 10:06 AM, Jim Grosbach <grosbach at apple.com> wrote:
> 
> If they’re overwriting the Predicates like this one was, they could have the same problem.
> 
>> On Feb 19, 2016, at 9:09 AM, Roman Divacky <rdivacky at vlakno.cz <mailto:rdivacky at vlakno.cz>> wrote:
>> 
>> What about the other cases of Requires[OptForSize] in the same file?
>> 
>> On Thu, Feb 18, 2016 at 10:11:59PM +0000, Dimitry Andric via llvm-commits wrote:
>>> dim created this revision.
>>> dim added reviewers: joker.eph, grosbach, resistor.
>>> dim added a subscriber: llvm-commits.
>>> 
>>> When optimizing for size, sqrt calls can be incorrectly selected as
>>> AVX512 VSQRT instructions.  This is because X86InstrAVX512.td has a
>>> `Requires<[OptForSize]>` in its `avx512_sqrt_scalar` multiclass
>>> definition.  Even if the target does not support AVX512, the class can
>>> apparently still be chosen, leading to an incorrect selection of
>>> `vsqrtss`.
>>> 
>>> In PR26625, this lead to an assertion: Reg >= X86::FP0 && Reg <=
>>> X86::FP6 && "Expected FP register!", because the `vsqrtss` instruction
>>> requires an XMM register, which is not available on i686 CPUs.
>>> 
>>> http://reviews.llvm.org/D17414 <http://reviews.llvm.org/D17414>
>>> 
>>> Files:
>>>  lib/Target/X86/X86InstrAVX512.td
>>>  test/CodeGen/X86/pr26625.ll
>>> 
>>> Index: test/CodeGen/X86/pr26625.ll
>>> ===================================================================
>>> --- /dev/null
>>> +++ test/CodeGen/X86/pr26625.ll
>>> @@ -0,0 +1,20 @@
>>> +; RUN: llc < %s -mcpu=i686 2>&1 | FileCheck %s
>>> +; PR26625
>>> +
>>> +target datalayout = "e-m:e-p:32:32-f64:32:64-f80:32-n8:16:32-S128"
>>> +target triple = "i386"
>>> +
>>> +define float @x0(float %f) #0 {
>>> +entry:
>>> +  %call = tail call float @sqrtf(float %f) #1
>>> +  ret float %call
>>> +; CHECK-LABEL: x0:
>>> +; CHECK: flds
>>> +; CHECK-NEXT: fsqrt
>>> +; CHECK-NOT: vsqrtss
>>> +}
>>> +
>>> +declare float @sqrtf(float) #0
>>> +
>>> +attributes #0 = { nounwind optsize readnone }
>>> +attributes #1 = { nounwind optsize readnone }
>>> Index: lib/Target/X86/X86InstrAVX512.td
>>> ===================================================================
>>> --- lib/Target/X86/X86InstrAVX512.td
>>> +++ lib/Target/X86/X86InstrAVX512.td
>>> @@ -6005,7 +6005,7 @@
>>> 
>>>   def : Pat<(_.EltVT (OpNode (load addr:$src))),
>>>             (!cast<Instruction>(NAME#SUFF#Zm)
>>> -                (_.EltVT (IMPLICIT_DEF)), addr:$src)>, Requires<[OptForSize]>;
>>> +                (_.EltVT (IMPLICIT_DEF)), addr:$src)>, Requires<[HasAVX512, OptForSize]>;
>>> }
>>> 
>>> multiclass avx512_sqrt_scalar_all<bits<8> opc, string OpcodeStr> {
>>> 
>>> 
>> 
>>> Index: test/CodeGen/X86/pr26625.ll
>>> ===================================================================
>>> --- /dev/null
>>> +++ test/CodeGen/X86/pr26625.ll
>>> @@ -0,0 +1,20 @@
>>> +; RUN: llc < %s -mcpu=i686 2>&1 | FileCheck %s
>>> +; PR26625
>>> +
>>> +target datalayout = "e-m:e-p:32:32-f64:32:64-f80:32-n8:16:32-S128"
>>> +target triple = "i386"
>>> +
>>> +define float @x0(float %f) #0 {
>>> +entry:
>>> +  %call = tail call float @sqrtf(float %f) #1
>>> +  ret float %call
>>> +; CHECK-LABEL: x0:
>>> +; CHECK: flds
>>> +; CHECK-NEXT: fsqrt
>>> +; CHECK-NOT: vsqrtss
>>> +}
>>> +
>>> +declare float @sqrtf(float) #0
>>> +
>>> +attributes #0 = { nounwind optsize readnone }
>>> +attributes #1 = { nounwind optsize readnone }
>>> Index: lib/Target/X86/X86InstrAVX512.td
>>> ===================================================================
>>> --- lib/Target/X86/X86InstrAVX512.td
>>> +++ lib/Target/X86/X86InstrAVX512.td
>>> @@ -6005,7 +6005,7 @@
>>> 
>>>   def : Pat<(_.EltVT (OpNode (load addr:$src))),
>>>             (!cast<Instruction>(NAME#SUFF#Zm)
>>> -                (_.EltVT (IMPLICIT_DEF)), addr:$src)>, Requires<[OptForSize]>;
>>> +                (_.EltVT (IMPLICIT_DEF)), addr:$src)>, Requires<[HasAVX512, OptForSize]>;
>>> }
>>> 
>>> multiclass avx512_sqrt_scalar_all<bits<8> opc, string OpcodeStr> {
>> 
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at lists.llvm.org <mailto:llvm-commits at lists.llvm.org>
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160219/7f164fb6/attachment.html>


More information about the llvm-commits mailing list