<div dir="ltr">Hi Arnold,<div><br></div><div>> <span style="font-family:arial,sans-serif;font-size:13px">First, we are going to have the situation where there exists an intrinsic ID for a library function (many math library functions have an intrinsic version: expf -> llvm.exp.f32 for example). As a consequence “getIntrinsicIDForCall” will return it. In this case we can have both: a vectorized library function version and an intrinsic function that may be slower or faster. In such a case the cost model has to decide which one to pick. This means we have to ask the cost model which one is cheaper in two places: when we get the instruction cost and when we vectorize the call.</span></div>

<div><span style="font-family:arial,sans-serif;font-size:13px"><br></span></div><div><span style="font-family:arial,sans-serif;font-size:13px">Sure, I will address this.</span></div><div><span style="font-family:arial,sans-serif;font-size:13px"><br>

</span></div><div><font face="arial, sans-serif">> Second, the way we test this. [snip]</font><br></div><div><span style="font-family:arial,sans-serif;font-size:13px"><br></span></div><div class="gmail_extra">This is very sensible. The only reason I didn't go down this route to start with was that I didn't know of an available library (like Accelerate) and didn't want to add testing/dummy code in tree. Thanks for pointing me at Accelerate - that'll give me a real library to (semi) implement and test.</div>

<div class="gmail_extra"><br></div><div class="gmail_extra">> <span style="font-family:arial,sans-serif;font-size:13px">This brings me to issue three. You are currently using TTI->getCallCost() which is not meant to be used with the vectorizer. We should create a getCallInstrCost() function similar to the “getIntrinsicInstrCost” function we already have.</span><br style="font-family:arial,sans-serif;font-size:13px">

> <br style="font-family:arial,sans-serif;font-size:13px"><span style="font-family:arial,sans-serif;font-size:13px">> BasicTTI::getCallInstrCost should query TLI->isFunctionVectorizable() and return a sensible value in this case (one that is lower than a scalarized intrinsic lowered as lib call).</span></div>

<div class="gmail_extra"><font face="arial, sans-serif"><br></font></div><div class="gmail_extra"><font face="arial, sans-serif">I don't understand the difference between getIntrinsicCost and getIntrinsicInstrCost. They both take the same arguments (but return different values), and the doxygen docstring does not describe the action in enough detail to discern what the required behaviour is.</font></div>

<div class="gmail_extra"><font face="arial, sans-serif"><br></font></div><div class="gmail_extra"><font face="arial, sans-serif">Could you please tell me? (and I'll update the docstrings while I'm at it).</font></div>

<div class="gmail_extra"><font face="arial, sans-serif"><br></font></div><div class="gmail_extra"><font face="arial, sans-serif">Cheers,</font></div><div class="gmail_extra"><font face="arial, sans-serif"><br></font></div>

<div class="gmail_extra"><font face="arial, sans-serif">James<br></font><br><div class="gmail_quote">On 16 January 2014 22:51, Arnold Schwaighofer <span dir="ltr"><<a href="mailto:aschwaighofer@apple.com" target="_blank">aschwaighofer@apple.com</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">Hi James,<br>

<br>

Overall, I like this patch. Thanks for working on this! There are three issues I would like to address:<br>

<br>

First, we are going to have the situation where there exists an intrinsic ID for a library function (many math library functions have an intrinsic version: expf -> llvm.exp.f32 for example). As a consequence “getIntrinsicIDForCall” will return it. In this case we can have both: a vectorized library function version and an intrinsic function that may be slower or faster. In such a case the cost model has to decide which one to pick. This means we have to ask the cost model which one is cheaper in two places: when we get the instruction cost and when we vectorize the call.<br>


<br>

Second, the way we test this. I understand that we currently don’t have anyone adding vectorized function calls in tree. However, I really would like not to have to use unit tests to test this feature. How about we use the environment component (the fourth component of the target triple) to specify the available library? Say, on MacOSX you have the Accelerate library.<br>


<br>

     TLI.setUnavailable(LibFunc::statvfs64);<br>

     TLI.setUnavailable(LibFunc::tmpfile64);<br>

   }<br>

+<br>

+  // Make the vectorized versions available.<br>

+  if (T.getEnvironmentName() == "Accelerate") {<br>

+    const TargetLibraryInfo::VecDesc VecFuncs[] = {<br>

+      { "exp", "vexp", 2},<br>

+      { "expf", "vexpf", 4}<br>

+    };<br>

+    TLI.addVectorizableFunctions(VecFuncs);<br>

+  }<br>

 }<br>
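<br>
As a rough illustration of what such a table could look like on the query side, here is a minimal standalone sketch. It does not use LLVM; VectorFunctionTable and getVectorizedFunction are hypothetical names made up for this example, and only the { scalar name, vector name, vectorization factor } triples are taken from the diff above.<br>

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the VecDesc entries in the diff above:
// a scalar function, its vectorized counterpart, and the vector width
// that the vectorized counterpart operates on.
struct VecDesc {
  std::string ScalarName;
  std::string VectorName;
  unsigned VectorizationFactor;
};

// Toy table (not the real TargetLibraryInfo) that records which scalar
// library functions have a vectorized version at a given width.
class VectorFunctionTable {
public:
  // Append a batch of descriptors to the table.
  void addVectorizableFunctions(const std::vector<VecDesc> &Descs) {
    Table.insert(Table.end(), Descs.begin(), Descs.end());
  }

  // Return the vectorized function name for (Scalar, VF), or an empty
  // string if no entry matches both the name and the width.
  std::string getVectorizedFunction(const std::string &Scalar,
                                    unsigned VF) const {
    assert(VF > 1 && "a vectorization factor must be at least 2");
    for (const VecDesc &D : Table)
      if (D.ScalarName == Scalar && D.VectorizationFactor == VF)
        return D.VectorName;
    return "";
  }

  // True if a vectorized version exists for exactly this width.
  bool isFunctionVectorizable(const std::string &Scalar, unsigned VF) const {
    return !getVectorizedFunction(Scalar, VF).empty();
  }

private:
  std::vector<VecDesc> Table;
};
```

With the two entries from the diff, a lookup of "exp" at width 2 would yield "vexp", while "expf" at width 2 would find nothing because only a width-4 version is registered.<br>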

<br>

Then, we can test this feature with<br>

<br>

target triple = "x86_64-apple-macos-Accelerate"<br>

<br>

define void @test(double* %d, double %t) {<br>

  ...<br>

  %1 = tail call double @llvm.exp.f64(double %0)<br>

<br>

<br>

We will also have to assign a lower cost in the cost model to calls of “vexp” than to the intrinsic version (in this example). This brings me to issue three. You are currently using TTI->getCallCost() which is not meant to be used with the vectorizer. We should create a getCallInstrCost() function similar to the “getIntrinsicInstrCost” function we already have.<br>


<br>

BasicTTI::getCallInstrCost should query TLI->isFunctionVectorizable() and return a sensible value in this case (one that is lower than a scalarized intrinsic lowered as lib call).<br>
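<br>
To make the intended decision concrete, here is a minimal standalone sketch of the comparison. It is not LLVM code: the function names, the abstract cost units, and the per-lane overhead parameter are all assumptions for illustration. It only shows the idea that a vector library call should win when it is available and no more expensive than the intrinsic, which in the worst case is scalarized into one library call per lane.<br>

```cpp
#include <cassert>

// Which lowering the (hypothetical) cost model picks for a vectorized call.
enum class CallLowering { VectorLibraryCall, VectorIntrinsic };

struct CallCostDecision {
  CallLowering Lowering;
  unsigned Cost; // abstract cost units, not cycles
};

// Assumed model of a scalarized intrinsic lowered as library calls:
// one scalar call per lane plus a fixed per-lane cost for extracting
// the operand and inserting the result.
inline unsigned scalarizedLibCallCost(unsigned ScalarCallCost, unsigned VF,
                                      unsigned PerLaneOverhead) {
  assert(VF > 0 && "vectorization factor must be positive");
  return VF * (ScalarCallCost + PerLaneOverhead);
}

// Pick the cheaper lowering. The same comparison would have to be made
// both when costing the instruction and when actually widening the call,
// so that the two places agree.
inline CallCostDecision chooseCallLowering(bool HasVectorLibFn,
                                           unsigned VectorLibCallCost,
                                           unsigned IntrinsicCost) {
  if (HasVectorLibFn && VectorLibCallCost <= IntrinsicCost)
    return {CallLowering::VectorLibraryCall, VectorLibCallCost};
  return {CallLowering::VectorIntrinsic, IntrinsicCost};
}
```

Keeping the choice in a single helper is one way to ensure the costing step and the widening step cannot disagree about which version was selected.<br>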

<br>

<br>

Thanks,<br>

Arnold<br>

<br>

On Jan 15, 2014, at 11:22 AM, James Molloy <<a href="mailto:james@jamesmolloy.co.uk">james@jamesmolloy.co.uk</a>> wrote:<br>

<br>

> <vectorizer-tli.diff><br>

<br>

</blockquote></div><br></div></div>