<div dir="ltr">Hi,<br><div><div class="gmail_extra"><br><div class="gmail_quote">On 4 July 2018 at 07:42, Nema, Ashutosh via llvm-dev <span dir="ltr"><<a target="_blank" href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>></span> wrote:<br><blockquote style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex" class="gmail_quote">+ llvm-dev<br>

<br>

-----Original Message-----<br>

From: Nema, Ashutosh <br>

Sent: Wednesday, July 4, 2018 12:12 PM<br>

To: Hal Finkel <<a href="mailto:hfinkel@anl.gov">hfinkel@anl.gov</a>>; Saito, Hideki <<a href="mailto:hideki.saito@intel.com">hideki.saito@intel.com</a>>; Sanjay Patel <<a href="mailto:spatel@rotateright.com">spatel@rotateright.com</a>>; <a href="mailto:mzolotukhin@apple.com">mzolotukhin@apple.com</a><br>

Cc: <a href="mailto:dccitaliano@gmail.com">dccitaliano@gmail.com</a>; Masten, Matt <<a href="mailto:matt.masten@intel.com">matt.masten@intel.com</a>><br>

Subject: RE: [llvm-dev] [RFC][VECLIB] how should we legalize VECLIB calls?<br>

<br>

Hi Hal,<br>

<span class="gmail-"><br>

> __svml_sin8 (plus whatever shuffles are necessary). <br>

> The vectorizer should do this.<br>

> It should not generate calls to functions that don't exist.<br>

<br>

</span>I'm not sure how vectorizer will do this, consider the case where "-vectorizer-maximize-<wbr>bandwidth" option is enabled and vectorizer is forced to generate the wider VF, and hence it may generate a call to __svml_sin_* which may not exist. <br>

<br>

Are you expecting the vectorizer to lower the calls i.e. __svml_sin_8 to two __svml_sin_4 calls ?<br>

<br>

Regards,<br>

Ashutosh<br></blockquote><br></div><div class="gmail_quote">If an accurate cost model were in place (there isn't one), an "unsupported" vectorization factor would only be selected if it was forced.  However, in this case __svml_sin_8 costs the same as __svml_sin_4, so the loop vectorizer will select a VF of 8 and generate a call to a function which effectively doesn't exist.<br><br></div><div class="gmail_quote">The simplest fix is to populate the SVML vector library table with __svml_sin_8 only when the target is AVX-512.  Alternatively, TLI.isFunctionVectorizable() should check that the entry is available on the target (this is more difficult because the type is not encoded).<br><br></div><div class="gmail_quote">I'm guessing that the cost model would then make VF=4 cheaper, so the vectorizer would generate calls to __svml_sin_4 (I'm not at work so can't check).   If the vectorization factor were forced to 8, we would either get a call to <span class="gmail-s">the intrinsic llvm.sin.v8f64 (if no-math-errno) or the vectorizer would scalarize the call.  The vectorizer would not generate two calls to __svml_sin_4, although this would be cheaper.<br></span></div><div class="gmail_quote"><br></div><div class="gmail_quote">While this problem probably doesn't require the loop vectorizer to have knowledge of the target ABI, other problems may.  I'm thinking specifically of D48193:<br><br><a href="https://reviews.llvm.org/D48193">https://reviews.llvm.org/D48193</a><br><br></div><div class="gmail_quote">In this case we get poor code generation because of the interleave count selected by the loop vectorizer.  I can't see how this can be fixed later, so we will need to expose details of the ABI to the loop vectorizer (see my latest comment, D48193#1149705).<br><br></div><div class="gmail_quote">Thanks,<br></div><div class="gmail_quote">Rob.<br><br></div><div class="gmail_quote"><br></div></div></div></div>