<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <div class="moz-cite-prefix">On 07/02/2018 04:33 PM, Saito, Hideki
      wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:899F03F2C73A55449C51631866B88749619AA616@FMSMSX110.amr.corp.intel.com">
      <style><!--
/* Font Definitions */
@font-face
        {font-family:"MS Mincho";
        panose-1:2 2 6 9 4 2 5 8 3 4;}
@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
        {font-family:"\@MS Mincho";
        panose-1:2 2 6 9 4 2 5 8 3 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0in;
        margin-bottom:.0001pt;
        font-size:12.0pt;
        font-family:"Times New Roman",serif;}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
p.m-9080767634571049055gmail-m-3365566405396838075m8049699665263122126gmail-m-2140871165585084385m-5799122468671666334msolistparagraph, li.m-9080767634571049055gmail-m-3365566405396838075m8049699665263122126gmail-m-2140871165585084385m-5799122468671666334msolistparagraph, div.m-9080767634571049055gmail-m-3365566405396838075m8049699665263122126gmail-m-2140871165585084385m-5799122468671666334msolistparagraph
        {mso-style-name:m_-9080767634571049055gmail-m_-3365566405396838075m_8049699665263122126gmail-m-2140871165585084385m-5799122468671666334msolistparagraph;
        mso-margin-top-alt:auto;
        margin-right:0in;
        mso-margin-bottom-alt:auto;
        margin-left:0in;
        font-size:12.0pt;
        font-family:"Times New Roman",serif;}
span.EmailStyle18
        {mso-style-type:personal-reply;
        font-family:"Calibri",sans-serif;
        color:#1F497D;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-family:"Calibri",sans-serif;}
@page WordSection1
        {size:8.5in 11.0in;
        margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
        {page:WordSection1;}
--></style>
      <div class="WordSection1">
        <p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"><o:p> </o:p></span></p>
        <p class="MsoNormal">>It may not be a full solution for the
          problems you're trying to solve<span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"><o:p></o:p></span></p>
        <p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"><o:p> </o:p></span></p>
        <p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">If
            we are inventing a new solution, I’d like it to also solve
            the OpenMP declare simd legalization issue. If a small
            extension of the existing scheme works for mathlib only, I’m
            happy to take that and discuss the OpenMP declare simd issue
            separately.<o:p></o:p></span></p>
      </div>
    </blockquote>
    <br>
    I completely agree. We need a solution to handle 'declare simd'
    calls, or to put it another way, arbitrary user-defined functions.
    To me, this really looks like an ABI issue. If we have a function,
    __foo__computeit8(<8 x float> %x), then if our lowering of
    <8 x float> doesn't match the required register assignments,
    then we have the wrong ABI. Will <a class="moz-txt-link-freetext" href="https://reviews.llvm.org/D47188">https://reviews.llvm.org/D47188</a> fix
    this?<br>
    <br>
     -Hal<br>
    <br>
    <blockquote type="cite"
cite="mid:899F03F2C73A55449C51631866B88749619AA616@FMSMSX110.amr.corp.intel.com">
      <div class="WordSection1">
        <p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"><o:p></o:p></span></p>
        <p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"><o:p> </o:p></span></p>
        <p class="MsoNormal">>Or is there some reason that the
          vectorizer needs to be aware of those libcalls?<o:p></o:p></p>
        <p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"><o:p> </o:p></span></p>
        <p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">I’m
            a strong believer in CodeGen mapping (scalar and widened)
            mathlib calls to the actual library (or an inlined sequence).<o:p></o:p></span></p>
        <p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">So,
            that question needs to be answered by someone else.<o:p></o:p></span></p>
        <p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"><o:p> </o:p></span></p>
        <p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">Adding
            Michael and Hal.<o:p></o:p></span></p>
        <p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"><o:p> </o:p></span></p>
        <p class="MsoNormal"><a name="_MailEndCompose"
            moz-do-not-send="true"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"><o:p> </o:p></span></a></p>
        <p class="MsoNormal"><a name="_____replyseparator"
            moz-do-not-send="true"></a><b><span
              style="font-size:11.0pt;font-family:"Calibri",sans-serif">From:</span></b><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif">
            Sanjay Patel [<a class="moz-txt-link-freetext" href="mailto:spatel@rotateright.com">mailto:spatel@rotateright.com</a>]
            <br>
            <b>Sent:</b> Monday, July 02, 2018 11:49 AM<br>
            <b>To:</b> Saito, Hideki <a class="moz-txt-link-rfc2396E" href="mailto:hideki.saito@intel.com"><hideki.saito@intel.com></a><br>
            <b>Cc:</b> Venkataramanan Kumar
            <a class="moz-txt-link-rfc2396E" href="mailto:venkataramanan.kumar.llvm@gmail.com"><venkataramanan.kumar.llvm@gmail.com></a>;
            <a class="moz-txt-link-abbreviated" href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>; Masten, Matt
            <a class="moz-txt-link-rfc2396E" href="mailto:matt.masten@intel.com"><matt.masten@intel.com></a>; <a class="moz-txt-link-abbreviated" href="mailto:dccitaliano@gmail.com">dccitaliano@gmail.com</a><br>
            <b>Subject:</b> Re: [llvm-dev] [RFC][VECLIB] how should we
            legalize VECLIB calls?<o:p></o:p></span></p>
        <p class="MsoNormal"><o:p> </o:p></p>
        <div>
          <div>
            <p class="MsoNormal">It may not be a full solution for the
              problems you're trying to solve, but I don't know why
              adding to include/llvm/CodeGen/RuntimeLibcalls.def is a
              problem in itself. Certainly, it's a mess that could be
              organized, especially so we're not repeating everything
              for each data type as we do right now.<o:p></o:p></p>
          </div>
          <div>
            <p class="MsoNormal"><o:p> </o:p></p>
          </div>
          <div>
            <p class="MsoNormal">So yes, I think that would allow us to
              remove the VecLib mappings because we are always waiting
              until codegen to make the translation from generic IR to
              target-specific libcall. Or is there some reason that the
              vectorizer needs to be aware of those libcalls?<o:p></o:p></p>
          </div>
          <div>
            <div>
              <p class="MsoNormal"><o:p> </o:p></p>
              <div>
                <p class="MsoNormal">On Mon, Jul 2, 2018 at 11:52 AM,
                  Saito, Hideki <<a
                    href="mailto:hideki.saito@intel.com" target="_blank"
                    moz-do-not-send="true">hideki.saito@intel.com</a>>
                  wrote:<o:p></o:p></p>
                <blockquote style="border:none;border-left:solid #CCCCCC
                  1.0pt;padding:0in 0in 0in
                  6.0pt;margin-left:4.8pt;margin-right:0in">
                  <div>
                    <div>
                      <p class="MsoNormal"
                        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"> </span><o:p></o:p></p>
                      <p class="MsoNormal"
                        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">Venkat,
                          we did not invent LLVM’s VecLib functionality.
                          The original version of D19544 (<a
                            href="https://reviews.llvm.org/D19544?id=55036"
                            target="_blank" moz-do-not-send="true">https://reviews.llvm.org/D19544?id=55036</a>)
                          was indeed a separate pass to convert widened
                          math lib to SVML.</span><o:p></o:p></p>
                      <p class="MsoNormal"
                        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">Our
                          preference for “vectorized sin()” is just
                          widened sin(), that is to be lowered to a
                          specific library call at a later point (either
                          as IR to IR or in CodeGen). Matt tried to sell
                          that idea and it didn’t go through.</span><o:p></o:p></p>
                      <p class="MsoNormal"
                        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">Is
                          anyone else willing to work with us to try it
                          again? In my opinion, however, this is a
                          related but different topic from the
                          legalization issue.</span><o:p></o:p></p>
                      <p class="MsoNormal"
                        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"> </span><o:p></o:p></p>
                      <p class="MsoNormal"
                        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">Sanjay,
                          I think what you are suggesting would work
                          better if we don’t map math lib calls to
                          VecLib. Otherwise, we’ll have too many
                          RTLIB:VECLIB_ enums: one for each math
                          function, multiplied by each vectorization
                          factor, for each VecLib. That’s way too many.
                          If it’s one per math function, I’d guess it’s
                          100+. Still a lot, but manageable. This
                          requires those functions to be listed in the
                          intrinsics, right? That’s another reason some
                          people favor VecLib mapping in the vectorizer:
                          those math functions don’t have to be added to
                          the intrinsics.</span><o:p></o:p></p>
                      <p class="MsoNormal"
                        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"> </span><o:p></o:p></p>
                      <p class="MsoNormal"
                        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">I
                          don’t insist on IR to IR legalization.
                          However, I’m also interested in being able to
                          legalize OpenMP declare simd function calls
                          (**). These are user functions, and as such we
                          have no way to list them as intrinsics or to
                          have RTLIB: enums predefined. For each target,
                          the vector function ABI defines how the
                          parameters need to be passed, and the
                          Legalizer should be implemented based on the
                          ABI, without knowing the details of what the
                          user function does. A math-lib-only solution
                          doesn’t help legalization of OpenMP declare
                          simd.</span><o:p></o:p></p>
                      <p class="MsoNormal"
                        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"> </span><o:p></o:p></p>
                      <p class="MsoNormal"
                        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">Thanks,</span><o:p></o:p></p>
                      <p class="MsoNormal"
                        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">Hideki</span><o:p></o:p></p>
                      <p class="MsoNormal"
                        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"> </span><o:p></o:p></p>
                      <p class="MsoNormal"
                        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">--------------------------------</span><o:p></o:p></p>
                      <p class="MsoNormal"
                        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">(**)</span><o:p></o:p></p>
                      <p class="MsoNormal"
                        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">#pragma
                          omp declare simd uniform(a), linear(i)</span><o:p></o:p></p>
                      <p class="MsoNormal"
                        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">void
                          foo(float *a, int i);</span><o:p></o:p></p>
                      <p class="MsoNormal"
                        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"> </span><o:p></o:p></p>
                      <p class="MsoNormal"
                        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">…</span><o:p></o:p></p>
                      <p class="MsoNormal"
                        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"> </span><o:p></o:p></p>
                      <p class="MsoNormal"
                        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">#pragma
                          omp simd</span><o:p></o:p></p>
                      <p class="MsoNormal"
                        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">for(i)
                          {                   // this loop could be
                          vectorized with VF that’s wider than widest
                          available vector function for foo().<br>
                              …<br>
                              foo(a, i)<br>
                              …</span><o:p></o:p></p>
                      <p class="MsoNormal"
                        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">}</span><o:p></o:p></p>
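                      <p class="MsoNormal">Editor's sketch, not part of the original message: one way to picture the legalization described above is to split a call made at a wide VF into several calls to the widest available vector variant of foo(). The names <code>PartCall</code> and <code>splitCall</code> are hypothetical and are not LLVM APIs. The sketch assumes the loop VF is a multiple of the variant width, that the uniform argument <code>a</code> is passed unchanged, and that the linear argument <code>i</code> has a step of 1.</p>

```cpp
#include <cassert>
#include <vector>

// One call to a vector variant of foo(): the starting value of the
// linear argument and the number of lanes that call covers.
// (Hypothetical helper type, for illustration only.)
struct PartCall {
  int linearStart;
  unsigned width;
};

// Split a call requested at `vf` lanes into calls to a variant that
// handles `widestVariant` lanes. The uniform argument needs no change,
// so only the linear argument is tracked; it advances by the number of
// lanes already covered (linear step of 1 assumed).
std::vector<PartCall> splitCall(int linearStart, unsigned vf,
                                unsigned widestVariant) {
  assert(widestVariant > 0 && vf % widestVariant == 0 &&
         "sketch only handles VF that is a multiple of the variant width");
  std::vector<PartCall> parts;
  for (unsigned off = 0; off < vf; off += widestVariant)
    parts.push_back({linearStart + static_cast<int>(off), widestVariant});
  return parts;
}
```

                      <p class="MsoNormal">For a loop vectorized at VF 16 with an 8-wide variant of foo(), this yields two calls whose linear arguments start 8 apart. A real legalizer would also have to split vector arguments and handle masks according to the target's vector function ABI, which this sketch leaves out.</p>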
                      <p class="MsoNormal"
                        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"> </span><o:p></o:p></p>
                      <p class="MsoNormal"
                        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><b><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif">From:</span></b><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif">
                          Venkataramanan Kumar [mailto:<a
                            href="mailto:venkataramanan.kumar.llvm@gmail.com"
                            target="_blank" moz-do-not-send="true">venkataramanan.kumar.llvm@gmail.com</a>]
                          <br>
                          <b>Sent:</b> Sunday, July 01, 2018 11:38 PM<br>
                          <b>To:</b> Sanjay Patel <<a
                            href="mailto:spatel@rotateright.com"
                            target="_blank" moz-do-not-send="true">spatel@rotateright.com</a>><br>
                          <b>Cc:</b> Saito, Hideki <<a
                            href="mailto:hideki.saito@intel.com"
                            target="_blank" moz-do-not-send="true">hideki.saito@intel.com</a>>;
                          <a href="mailto:llvm-dev@lists.llvm.org"
                            target="_blank" moz-do-not-send="true">llvm-dev@lists.llvm.org</a>;
                          Masten, Matt <<a
                            href="mailto:matt.masten@intel.com"
                            target="_blank" moz-do-not-send="true">matt.masten@intel.com</a>>;
                          <a href="mailto:dccitaliano@gmail.com"
                            target="_blank" moz-do-not-send="true">dccitaliano@gmail.com</a><br>
                          <b>Subject:</b> Re: [llvm-dev] [RFC][VECLIB]
                          how should we legalize VECLIB calls?</span><o:p></o:p></p>
                      <div>
                        <div>
                          <p class="MsoNormal"
                            style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
                          <div>
                            <div>
                              <p class="MsoNormal"
                                style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Adding
                                to Ashutosh's comments, we are also
                                interested in making LLVM generate calls
                                to the vector math library that is
                                available with glibc (version >
                                2.22).<o:p></o:p></p>
                            </div>
                            <div>
                              <p class="MsoNormal"
                                style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
                            </div>
                            <div>
                              <p class="MsoNormal"
                                style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">reference:
                                <a
                                  href="https://sourceware.org/glibc/wiki/libmvec"
                                  target="_blank" moz-do-not-send="true">https://sourceware.org/glibc/wiki/libmvec</a><o:p></o:p></p>
                            </div>
                            <div>
                              <p class="MsoNormal"
                                style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
                            </div>
                            <div>
                              <p class="MsoNormal"
                                style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Using
                                the example case given in the reference,
                                we found there are two vector versions
                                of "sin" (4 x double) with the same VF,
                                namely the _ZGVcN4v_sin (avx) version and
                                the _ZGVdN4v_sin (avx2) version. Following
                                the SVML path and adding a new entry to
                                the VecDesc structure in
                                TargetLibraryInfo.cpp, we can generate
                                the vector version.<o:p></o:p></p>
                            </div>
                            <div>
                              <p class="MsoNormal"
                                style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
                            </div>
                            <div>
                              <p class="MsoNormal"
                                style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">But
                                we were unable to decide which version
                                to expand in the vectorizer; we needed
                                the TTI information (ISA). It looks
                                better to legalize or generate them
                                later.<o:p></o:p></p>
                            </div>
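                            <div>
                              <p class="MsoNormal">Editor's sketch, not part of the original message: the choice between the two sin variants above can be modeled as a lookup keyed on VF and the target's ISA level. The types <code>Isa</code> and <code>Variant</code> and the functions <code>mangle</code> and <code>pickVariant</code> are hypothetical and are not LLVM or glibc APIs. The ISA letters 'c' (avx) and 'd' (avx2) come from the names quoted above; 'b' (SSE) and 'e' (AVX-512) are taken from the x86 vector function ABI as I understand it and should be treated as an assumption here.</p>
                            </div>

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical ISA levels, ordered from least to most capable.
enum class Isa { SSE = 0, AVX = 1, AVX2 = 2, AVX512 = 3 };

// ISA letter used in vector-variant names such as _ZGVcN4v_sin.
char isaLetter(Isa isa) {
  switch (isa) {
  case Isa::SSE:    return 'b';
  case Isa::AVX:    return 'c';
  case Isa::AVX2:   return 'd';
  case Isa::AVX512: return 'e';
  }
  return '?';
}

// Build "_ZGV<isa>N<vf><params>_<scalar>" for an unmasked ('N') variant.
// `params` holds one letter per parameter, e.g. "v" for a vector argument.
std::string mangle(Isa isa, unsigned vf, const std::string &params,
                   const std::string &scalar) {
  assert(vf > 0 && "vectorization factor must be positive");
  return std::string("_ZGV") + isaLetter(isa) + "N" + std::to_string(vf) +
         params + "_" + scalar;
}

// One available vector variant of a scalar function.
struct Variant {
  Isa isa;
  unsigned vf;
  std::string name;
};

// Return the name of the most capable variant with the requested VF that
// the target ISA can run, or an empty string if there is none.
std::string pickVariant(const std::vector<Variant> &table, unsigned vf,
                        Isa target) {
  const Variant *best = nullptr;
  for (const Variant &v : table) {
    if (v.vf != vf || v.isa > target)
      continue; // wrong width, or needs an ISA the target lacks
    if (!best || v.isa > best->isa)
      best = &v;
  }
  return best ? best->name : std::string();
}
```

                            <div>
                              <p class="MsoNormal">With both 4-wide sin variants in the table, a target limited to AVX gets _ZGVcN4v_sin and an AVX2 target gets _ZGVdN4v_sin. The point of the sketch is only that the decision needs target information, which is why doing it late (or with TTI available) is attractive.</p>
                            </div>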
                            <div>
                              <p class="MsoNormal"
                                style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
                            </div>
                            <div>
                              <p class="MsoNormal"
                                style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">regards,<o:p></o:p></p>
                            </div>
                            <div>
                              <p class="MsoNormal"
                                style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Venkat.<o:p></o:p></p>
                            </div>
                            <div>
                              <p class="MsoNormal"
                                style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
                            </div>
                            <div>
                              <div>
                                <p class="MsoNormal"
                                  style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
                                <div>
                                  <p class="MsoNormal"
                                    style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">On
                                    30 June 2018 at 04:04, Sanjay Patel
                                    via llvm-dev <<a
                                      href="mailto:llvm-dev@lists.llvm.org"
                                      target="_blank"
                                      moz-do-not-send="true">llvm-dev@lists.llvm.org</a>>
                                    wrote:<o:p></o:p></p>
                                  <blockquote
                                    style="border:none;border-left:solid
                                    windowtext 1.0pt;padding:0in 0in 0in
6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0in;margin-bottom:5.0pt;border-color:currentcolor
                                    currentcolor currentcolor
                                    rgb(204,204,204)">
                                    <div>
                                      <div>
                                        <p class="MsoNormal"
                                          style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Hi
                                          Hideki -<o:p></o:p></p>
                                      </div>
                                      <div>
                                        <p class="MsoNormal"
                                          style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
                                      </div>
                                      <div>
                                        <p class="MsoNormal"
                                          style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">I
                                          hinted at this problem in the
                                          summary text of
                                          <a
                                            href="https://reviews.llvm.org/D47610"
                                            target="_blank"
                                            moz-do-not-send="true">https://reviews.llvm.org/D47610</a>:<o:p></o:p></p>
                                      </div>
                                      <div>
                                        <p class="MsoNormal"
                                          style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Why
                                          are we transforming from LLVM
                                          intrinsics to
                                          platform-specific intrinsics
                                          in IR? I don't see the
                                          benefit.<o:p></o:p></p>
                                      </div>
                                      <div>
                                        <p class="MsoNormal"
                                          style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
                                      </div>
                                      <div>
                                        <p class="MsoNormal"
                                          style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">I
                                          don't know if it solves all of
                                          the problems you're seeing,
                                          but it should be a small
                                          change to transform to the
                                          platform-specific SVML or
                                          other intrinsics in the DAG.
                                          We already do this for mathlib
                                          calls on Linux for example
                                          when we can use the finite
                                          versions of the calls. Have a
                                          look in
                                          SelectionDAGLegalize::ConvertNodeToLibcall():<o:p></o:p></p>
                                      </div>
                                      <div>
                                        <p class="MsoNormal"
                                          style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
                                      </div>
                                      <div>
                                        <p class="MsoNormal"
                                          style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">   
                                          if (CanUseFiniteLibCall
                                          &&
                                          DAG.getLibInfo().has(LibFunc_log_finite))<br>
                                               
                                          Results.push_back(ExpandFPLibCall(Node,
                                          RTLIB::LOG_FINITE_F32,<br>
                                        RTLIB::LOG_FINITE_F64,<br>
                                        RTLIB::LOG_FINITE_F80,<br>
                                        RTLIB::LOG_FINITE_F128,<br>
                                        RTLIB::LOG_FINITE_PPCF128));<br>
                                              else<br>
                                               
                                          Results.push_back(ExpandFPLibCall(Node,
                                          RTLIB::LOG_F32,
                                          RTLIB::LOG_F64,<br>
                                        RTLIB::LOG_F80, RTLIB::LOG_F128,<br>
                                        RTLIB::LOG_PPCF128));<o:p></o:p></p>
                                      </div>
                                      <div>
                                        <p class="MsoNormal"
                                          style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
                                      </div>
                                      <div>
                                        <p class="MsoNormal"
                                          style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
                                      </div>
                                      <div>
                                        <p class="MsoNormal"
                                          style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
                                      </div>
                                    </div>
                                    <div>
                                      <div>
                                        <div>
                                          <p class="MsoNormal"
                                            style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
                                          <div>
                                            <p class="MsoNormal"
                                              style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">On
                                              Fri, Jun 29, 2018 at 2:15
                                              PM, Saito, Hideki <<a
                                                href="mailto:hideki.saito@intel.com"
                                                target="_blank"
                                                moz-do-not-send="true">hideki.saito@intel.com</a>>
                                              wrote:<o:p></o:p></p>
                                            <blockquote
                                              style="border:none;border-left:solid
                                              windowtext
                                              1.0pt;padding:0in 0in 0in
6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0in;margin-bottom:5.0pt;border-color:currentcolor
                                              currentcolor currentcolor
                                              rgb(204,204,204)">
                                              <div>
                                                <div>
                                                  <p class="MsoNormal"
                                                    style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="color:#1F497D"> </span><o:p></o:p></p>
                                                  <p class="MsoNormal"
                                                    style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="color:#1F497D">Ashutosh,</span><o:p></o:p></p>
                                                  <p class="MsoNormal"
                                                    style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="color:#1F497D"> </span><o:p></o:p></p>
                                                  <p class="MsoNormal"
                                                    style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="color:#1F497D">Thanks for the reply.</span><o:p></o:p></p>
                                                  <p class="MsoNormal"
                                                    style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="color:#1F497D"> </span><o:p></o:p></p>
                                                  <p class="MsoNormal"
                                                    style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="color:#1F497D">A related earlier discussion of this topic appears in the
                                                      review of the SVML
                                                      patch (@mmasten).
                                                      Adding a few names
                                                      from there.</span><o:p></o:p></p>
                                                  <p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;text-indent:.5in"><span
style="color:#1F497D"><a href="https://reviews.llvm.org/D19544"
                                                        target="_blank"
moz-do-not-send="true">https://reviews.llvm.org/D19544</a></span><o:p></o:p></p>
                                                  <p class="MsoNormal"
                                                    style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="color:#1F497D">There, I see Hal’s review comment “let’s start
                                                      only with the
                                                      directly-legal
                                                      calls”.
                                                      Apparently, what
                                                      we have right now</span><o:p></o:p></p>
                                                  <p class="MsoNormal"
                                                    style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="color:#1F497D">in the trunk is “not legal enough”. I’ll work on
                                                      a patch to stop
                                                      the bleeding while we
                                                      continue to
                                                      discuss the
                                                      legalization
                                                      topic.</span><o:p></o:p></p>
                                                  <p class="MsoNormal"
                                                    style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="color:#1F497D"> </span><o:p></o:p></p>
                                                  <p class="MsoNormal"
                                                    style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="color:#1F497D">I suppose</span><o:p></o:p></p>
                                                  <p
class="m-9080767634571049055gmail-m-3365566405396838075m8049699665263122126gmail-m-2140871165585084385m-5799122468671666334msolistparagraph"><span
style="color:#1F497D">1)</span><span
                                                      style="font-size:7.0pt;color:#1F497D">     
                                                    </span><span
                                                      style="color:#1F497D">An
                                                      LV-only solution (letting
                                                      LV emit already-legalized
                                                      VECLIB calls) is
                                                      certainly not
                                                      scalable. It won’t
                                                      help if VECLIB
                                                      calls<br>
                                                      are generated
                                                      elsewhere. Also,
                                                      keeping VF low
                                                      enough to prevent
                                                      the legalization
                                                      problem is only a
                                                      workaround,<br>
                                                      not a solution.</span><o:p></o:p></p>
                                                  <p
class="m-9080767634571049055gmail-m-3365566405396838075m8049699665263122126gmail-m-2140871165585084385m-5799122468671666334msolistparagraph"><span
style="color:#1F497D">2)</span><span
                                                      style="font-size:7.0pt;color:#1F497D">     
                                                    </span><span
                                                      style="color:#1F497D">Assuming
                                                      that we have to go
                                                      the IR-to-IR pass
                                                      route, there are three
                                                      ways to approach it:</span><o:p></o:p></p>
                                                  <p
class="m-9080767634571049055gmail-m-3365566405396838075m8049699665263122126gmail-m-2140871165585084385m-5799122468671666334msolistparagraph"
style="margin-left:1.0in">
                                                    <span
                                                      style="color:#1F497D">a.</span><span
style="font-size:7.0pt;color:#1F497D">      
                                                    </span><span
                                                      style="color:#1F497D">Go
                                                      with a very generic
                                                      IR-to-IR
                                                      legalization pass
                                                      comparable to
                                                      ISD-level
                                                      legalization. This
                                                      is the most general,<br>
                                                      but I’d think it
                                                      has the highest
                                                      development
                                                      cost.</span><o:p></o:p></p>
                                                  <p
class="m-9080767634571049055gmail-m-3365566405396838075m8049699665263122126gmail-m-2140871165585084385m-5799122468671666334msolistparagraph"
style="margin-left:1.0in">
                                                    <span
                                                      style="color:#1F497D">b.</span><span
style="font-size:7.0pt;color:#1F497D">     
                                                    </span><span
                                                      style="color:#1F497D">Go
                                                      with
                                                      intrinsic-only
                                                      legalization and
                                                      then apply VECLIB
                                                      afterwards. This
                                                      requires all
                                                      scalar functions<br>
                                                      with a VECLIB
                                                      mapping to be
                                                      added as
                                                      intrinsics. </span><o:p></o:p></p>
                                                  <p
class="m-9080767634571049055gmail-m-3365566405396838075m8049699665263122126gmail-m-2140871165585084385m-5799122468671666334msolistparagraph"
style="margin-left:1.0in">
                                                    <span
                                                      style="color:#1F497D">c.</span><span
style="font-size:7.0pt;color:#1F497D">      
                                                    </span><span
                                                      style="color:#1F497D">Go
                                                      with a generic-enough
                                                      function call
                                                      legalization,
                                                      with the ability
                                                      to add custom
                                                      legalization for
                                                      each VECLIB<br>
                                                      (and, if needed,
                                                      for each VECLIB or
                                                      non-VECLIB entry).</span><o:p></o:p></p>
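                                                  <p class="MsoNormal"
                                                    style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="color:#1F497D">As a rough illustration of what 2.c) could look like, here is a minimal C sketch of a generic call-splitting rule with an optional per-library override. This is not LLVM code: CallSplit, CustomLegalizeFn, legalize_call and always_serialize are hypothetical names, the maximum legal VF is assumed to be supplied by the target, and VFs are assumed to be multiples of it.</span><o:p></o:p></p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result of legalizing one widened call:
   how many calls to emit and at which vector factor (VF). */
typedef struct {
    unsigned num_calls; /* number of calls to emit */
    unsigned call_vf;   /* VF of each emitted call; 1 means scalar */
} CallSplit;

/* Hypothetical per-library hook. Returns nonzero and fills *out when the
   library wants to override the generic rule for this case. */
typedef int (*CustomLegalizeFn)(unsigned vf, unsigned max_legal_vf,
                                CallSplit *out);

/* Generic rule: keep the call if its VF is legal, otherwise split it into
   pieces of the widest legal VF. A custom hook, if given, is asked first. */
static CallSplit legalize_call(unsigned vf, unsigned max_legal_vf,
                               CustomLegalizeFn custom)
{
    CallSplit split;
    assert(vf > 0 && max_legal_vf > 0);
    if (custom != NULL && custom(vf, max_legal_vf, &split))
        return split;
    if (vf <= max_legal_vf) {
        split.num_calls = 1;
        split.call_vf = vf;
    } else {
        /* Assumption of this sketch: vf is a multiple of max_legal_vf. */
        assert(vf % max_legal_vf == 0);
        split.num_calls = vf / max_legal_vf;
        split.call_vf = max_legal_vf;
    }
    return split;
}

/* Example custom hook: an entry with no usable vector variant,
   so every lane becomes a scalar call. */
static int always_serialize(unsigned vf, unsigned max_legal_vf,
                            CallSplit *out)
{
    (void)max_legal_vf;
    out->num_calls = vf;
    out->call_vf = 1;
    return 1;
}
```

                                                  <p class="MsoNormal"
                                                    style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="color:#1F497D">With a maximum legal VF of 4, legalize_call(8, 4, NULL) yields two VF4 calls, while passing always_serialize yields eight scalar calls.</span><o:p></o:p></p>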
                                                  <p class="MsoNormal"
                                                    style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="color:#1F497D"> </span><o:p></o:p></p>
                                                  <p class="MsoNormal"
                                                    style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="color:#1F497D">I think the costs of 2.b) and 2.c) are similar and
                                                      2.c) seems to be
                                                      more flexible. So,
                                                      I guess we don’t
                                                      really have to tie
                                                      this</span><o:p></o:p></p>
                                                  <p class="MsoNormal"
                                                    style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="color:#1F497D">discussion to “letting LV emit widened math calls
                                                      instead of
                                                      VECLIB”, even
                                                      though I strongly
                                                      favor that over LV
                                                      emitting</span><o:p></o:p></p>
                                                  <p class="MsoNormal"
                                                    style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="color:#1F497D">VECLIB calls.</span><o:p></o:p></p>
                                                  <p class="MsoNormal"
                                                    style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="color:#1F497D"> </span><o:p></o:p></p>
                                                  <p class="MsoNormal"
                                                    style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="color:#1F497D">@Davide, in D19544, @spatel thought
                                                      LibCallSimplifier
                                                      has relevance to
                                                      this legalization
                                                      topic. Do you know
                                                      enough about</span><o:p></o:p></p>
                                                  <p class="MsoNormal"
                                                    style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="color:#1F497D">LibCallSimplifier to tell whether it can be
                                                      extended to deal
                                                      with 2.b) or 2.c)?
                                                    </span><o:p></o:p></p>
                                                  <p class="MsoNormal"
                                                    style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="color:#1F497D"> </span><o:p></o:p></p>
                                                  <p class="MsoNormal"
                                                    style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="color:#1F497D">If we think 2.b)/2.c) are right enough directions,
                                                      I can clean up
                                                      what we have and
                                                      upload it to
                                                      Phabricator as a
                                                      starting point</span><o:p></o:p></p>
                                                  <p class="MsoNormal"
                                                    style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="color:#1F497D">to get to 2.b)/2.c).
                                                    </span><o:p></o:p></p>
                                                  <p class="MsoNormal"
                                                    style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="color:#1F497D"> </span><o:p></o:p></p>
                                                  <p class="MsoNormal"
                                                    style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="color:#1F497D">I’ll continue waiting for more feedback. I guess I
                                                      shouldn’t expect a
                                                      lot this week and
                                                      next due to the
                                                      big holiday in the
                                                      U.S.</span><o:p></o:p></p>
                                                  <p class="MsoNormal"
                                                    style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="color:#1F497D"> </span><o:p></o:p></p>
                                                  <p class="MsoNormal"
                                                    style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="color:#1F497D">Thanks,</span><o:p></o:p></p>
                                                  <p class="MsoNormal"
                                                    style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="color:#1F497D">Hideki</span><o:p></o:p></p>
                                                  <p class="MsoNormal"
                                                    style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="color:#1F497D"> </span><o:p></o:p></p>
                                                  <div>
                                                    <div
                                                      style="border:none;border-top:solid
                                                      windowtext
                                                      1.0pt;padding:3.0pt
                                                      0in 0in
                                                      0in;border-color:currentcolor
                                                      currentcolor">
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><a
                                                          name="m_-9080767634571049055_m_-33655664053968"
moz-do-not-send="true"></a><b>From:</b> Nema, Ashutosh [mailto:<a
                                                          href="mailto:Ashutosh.Nema@amd.com"
target="_blank" moz-do-not-send="true">Ashutosh.Nema@amd.com</a>]
                                                        <br>
                                                        <b>Sent:</b>
                                                        Thursday, June
                                                        28, 2018 11:37
                                                        PM<br>
                                                        <b>To:</b>
                                                        Saito, Hideki
                                                        <<a
                                                          href="mailto:hideki.saito@intel.com"
target="_blank" moz-do-not-send="true">hideki.saito@intel.com</a>><br>
                                                        <b>Cc:</b> <a
                                                          href="mailto:llvm-dev@lists.llvm.org"
target="_blank" moz-do-not-send="true">llvm-dev@lists.llvm.org</a><br>
                                                        <b>Subject:</b>
                                                        RE:
                                                        [RFC][VECLIB]
                                                        how should we
                                                        legalize VECLIB
                                                        calls?<o:p></o:p></p>
                                                    </div>
                                                  </div>
                                                  <div>
                                                    <div>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Hi Saito,<o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">At AMD we
                                                        have our own
                                                        vector library
                                                        and faced
                                                        similar
                                                        problems. We
                                                        followed the
                                                        SVML path and
                                                        generated the
                                                        respective
                                                        vector calls
                                                        from the
                                                        vectorizer.
                                                        When the vectorizer
                                                        generates
                                                        library-specific calls,
                                                        e.g. __svml_sin_4
                                                        or
                                                        __amdlibm_sin_4,
                                                        later passes can
                                                        identify the
                                                        vector library call
                                                        only by string
                                                        matching.
                                                        I’m not sure
                                                        that is the proper
                                                        way; maybe
                                                        instead of
                                                        generating
                                                        library-specific calls
                                                        it is better to
                                                        generate some
                                                        standard call
                                                        (maybe
                                                        intrinsics) and
                                                        lower it later.
                                                        A late IR pass
                                                        can be
                                                        introduced to
                                                        perform the
                                                        lowering: it
                                                        would lower the
                                                        intrinsic calls
                                                        to specific library
                                                        calls (__svml_sin_4 or __amdlibm_sin_4 or …). This can be table-driven,
                                                        deciding the
                                                        action based on
                                                        the vector
                                                        library,
                                                        function name,
                                                        VF, and target
                                                        information. The
                                                        action can be
                                                        full serialization,
                                                        partial serialization (VF8 to two VF4), or generating the library call with the same VF.<o:p></o:p></p>
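                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">The table-driven decision described above could be sketched in C roughly as follows. This is only an illustration, not an existing LLVM interface: the LoweringRule layout, the "SVML"/"AMDLIBM" keys, and the use of a single "maximum legal VF" number as the target information are assumptions; the callee names are the ones mentioned in this thread.<o:p></o:p></p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The three actions named in the text above. */
typedef enum {
    ACT_FULL_SERIALIZE,    /* one scalar call per lane */
    ACT_PARTIAL_SERIALIZE, /* e.g. VF8 -> two VF4 calls */
    ACT_CALL_SAME_VF       /* emit the library call at the requested VF */
} Action;

/* Hypothetical table entry keyed on library, function, VF and target. */
typedef struct {
    const char *veclib;    /* vector library key (assumed spelling) */
    const char *scalar_fn; /* scalar function name, e.g. "sin" */
    unsigned vf;           /* VF requested by the vectorizer */
    unsigned max_legal_vf; /* stand-in for the target information */
    Action action;         /* what the late IR pass should do */
    const char *vector_fn; /* callee to emit */
} LoweringRule;

static const LoweringRule kRules[] = {
    {"SVML",    "sin", 4, 4, ACT_CALL_SAME_VF,      "__svml_sin_4"},
    {"SVML",    "sin", 8, 4, ACT_PARTIAL_SERIALIZE, "__svml_sin_4"},
    {"AMDLIBM", "sin", 4, 4, ACT_CALL_SAME_VF,      "__amdlibm_sin_4"},
    {"AMDLIBM", "sin", 8, 4, ACT_PARTIAL_SERIALIZE, "__amdlibm_sin_4"},
};

/* Linear search for an exact match; returns NULL when no rule applies. */
static const LoweringRule *find_rule(const char *veclib, const char *fn,
                                     unsigned vf, unsigned max_legal_vf)
{
    size_t n = sizeof(kRules) / sizeof(kRules[0]);
    size_t i;
    assert(veclib != NULL && fn != NULL);
    for (i = 0; i < n; ++i) {
        const LoweringRule *r = &kRules[i];
        if (strcmp(r->veclib, veclib) == 0 &&
            strcmp(r->scalar_fn, fn) == 0 &&
            r->vf == vf && r->max_legal_vf == max_legal_vf)
            return r;
    }
    return NULL;
}

/* Full serialization is the conservative fallback for unknown cases. */
static Action choose_action(const char *veclib, const char *fn,
                            unsigned vf, unsigned max_legal_vf)
{
    const LoweringRule *r = find_rule(veclib, fn, vf, max_legal_vf);
    return r != NULL ? r->action : ACT_FULL_SERIALIZE;
}
```

                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Keeping the fallback at full serialization means a missing table entry costs performance rather than correctness.<o:p></o:p></p>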
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Thanks,<o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Ashutosh<o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
                                                      <div>
                                                        <div
                                                          style="border:none;border-top:solid
                                                          windowtext
                                                          1.0pt;padding:3.0pt
                                                          0in 0in
                                                          0in;border-color:currentcolor
                                                          currentcolor">
                                                          <p
                                                          class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                          <b>From:</b>
                                                          llvm-dev [<a
                                                          href="mailto:llvm-dev-bounces@lists.llvm.org"
target="_blank" moz-do-not-send="true">mailto:llvm-dev-bounces@lists.llvm.org</a>]
                                                          <b>On Behalf
                                                          Of </b>Saito,
                                                          Hideki via
                                                          llvm-dev<br>
                                                          <b>Sent:</b>
                                                          Friday, June
                                                          29, 2018 7:41
                                                          AM<br>
                                                          <b>To:</b>
                                                          'Saito, Hideki
                                                          via llvm-dev'
                                                          <<a
                                                          href="mailto:llvm-dev@lists.llvm.org"
target="_blank" moz-do-not-send="true">llvm-dev@lists.llvm.org</a>><br>
                                                          <b>Subject:</b>
                                                          [llvm-dev]
                                                          [RFC][VECLIB]
                                                          how should we
                                                          legalize
                                                          VECLIB calls?<o:p></o:p></p>
                                                        </div>
                                                      </div>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                         <o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                         <o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                        Illustrative
                                                        Example:<o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                         <o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                        clang
                                                        -fveclib=SVML
                                                        -O3 svml.c -mavx<o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                         <o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                        #include
                                                        <math.h><o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                        void foo(double
                                                        *a, int N){<o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                          int i;<o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                        #pragma clang
                                                        loop
                                                        vectorize_width(8)<o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                          for
                                                        (i=0;i<N;i++){<o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                            a[i] =
                                                        sin(i);<o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                          }<o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                        }<o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                         <o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                        Currently, this
                                                        results in a
                                                        call to <8 x
                                                        double>
                                                        __svml_sin8(<8
                                                        x double>)
                                                        after the
                                                        vectorizer.<o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                        This is the
                                                        8-element SVML
                                                        sin() called
                                                        with an 8-element
                                                        argument. On the
                                                        surface, this
                                                        looks very good.<o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                        Later on,
                                                        standard vector
                                                        type
                                                        legalization
                                                        kicks in, but
                                                        only the
                                                        argument and
                                                        return data are
                                                        legalized.<o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                                vmovaps
                                                        %ymm0, %ymm1<o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                               
                                                        vcvtdq2pd      
                                                        %xmm1, %ymm0<o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                               
                                                        vextractf128   
                                                        $1, %ymm1, %xmm1<o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                               
                                                        vcvtdq2pd      
                                                        %xmm1, %ymm1<o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                                callq  
                                                        __svml_sin8<o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                                vmovups
                                                        %ymm1,
                                                        32(%r15,%r12,8)<o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                                vmovups
                                                        %ymm0,
                                                        (%r15,%r12,8)<o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                        Unfortunately,
                                                        __svml_sin8()
                                                        doesn’t use this
                                                        form of
                                                        input/output. It
                                                        takes zmm0 and
                                                        returns zmm0.<o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                        That is, it is
                                                        not legal to use
                                                        on AVX, which has
                                                        no zmm registers.<o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                         <o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                        What we need to
                                                        see instead is
                                                        two calls to
                                                        __svml_sin4(),
                                                        like below.<o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                                vmovaps
                                                        %ymm0, %ymm1<o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                               
                                                        vcvtdq2pd      
                                                        %xmm1, %ymm0<o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                               
                                                        vextractf128   
                                                        $1, %ymm1, %xmm1<o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                               
                                                        vcvtdq2pd      
                                                        %xmm1, %ymm1<o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                                callq  
                                                        __svml_sin4<o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                                vmovups
                                                        %ymm0,
                                                        32(%r15,%r12,8)<o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                                vmovups
                                                        %ymm1, %ymm0<o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                                callq  
                                                        __svml_sin4<o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                                vmovups
                                                        %ymm0,
                                                        (%r15,%r12,8)<o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                         <o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                        What would be
                                                        the most
                                                        acceptable way
                                                        to make this
                                                        happen? Has
                                                        anybody had a
                                                        similar need
                                                        before?<o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                         <o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                        The easiest
                                                        workaround is to
                                                        serialize the
                                                        call whenever the
                                                        vectorization
                                                        factor exceeds
                                                        the “type legal”
                                                        one. This can
                                                        be done with a
                                                        few lines of
                                                        code,<o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                        plus the code to
                                                        recognize that
                                                        the call is
                                                        “SVML” (which is
                                                        currently a string
                                                        match against the
                                                        “__svml” prefix
                                                        in my local
                                                        workspace).<o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                        If a higher VF is
                                                        not forced, the
                                                        cost model will
                                                        likely favor a
                                                        lower VF. This is
                                                        functionally
                                                        correct, but
                                                        obviously not an
                                                        ideal solution.<o:p></o:p></p>
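                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                        A minimal sketch of this workaround, written as a Python model rather than actual LLVM code. The helper names (is_svml_name, max_legal_vf, plan_call) are hypothetical, and the rule that the legal VF equals the register width divided by the element width is a simplifying assumption.</p>
<pre>
```python
# Model of the workaround: keep a vector library call only when the
# vectorization factor (VF) fits in one register; otherwise serialize it.
# All names here are hypothetical; this is not LLVM's API.

SVML_PREFIX = "__svml"  # prefix match, as in the local workspace described above


def is_svml_name(callee: str) -> bool:
    """Recognize an SVML entry point purely by its name prefix."""
    return callee.startswith(SVML_PREFIX)


def max_legal_vf(register_bits: int, element_bits: int) -> int:
    """Largest VF whose vector fits in a single register (assumed rule)."""
    return register_bits // element_bits


def plan_call(callee: str, vf: int, register_bits: int, element_bits: int) -> str:
    """Return "vector" to keep the call, or "scalarize" to serialize it."""
    if is_svml_name(callee) and vf > max_legal_vf(register_bits, element_bits):
        return "scalarize"
    return "vector"
```
</pre>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                        With 256-bit registers and 64-bit doubles, plan_call("__svml_sin8", 8, 256, 64) falls back to scalarization, while a VF of 4 keeps the vector call.</p>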
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                         <o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                        Here are a few
                                                        ideas I thought
                                                        about:<o:p></o:p></p>
                                                      <p
class="m-9080767634571049055gmail-m-3365566405396838075m8049699665263122126gmail-m-2140871165585084385m-5799122468671666334msolistparagraph"
style="margin-left:1.0in">
                                                        1)<span
                                                          style="font-size:7.0pt">     
                                                        </span>Standard
LegalizeVectorType() in CodeGen/SelectionDAG doesn’t seem to work. We
                                                        could define a
                                                        generic
                                                        ISD::VECLIB<br>
                                                        and try to split
                                                        into two or more
                                                        VECLIB nodes,
                                                        but at that
                                                        point we have lost
                                                        the information
                                                        about which
                                                        function to
                                                        call.<br>
                                                        We can’t define
                                                        an ISD opcode per
                                                        function; there
                                                        would be too many
                                                        libm entries to
                                                        deal with. We
                                                        need a scalable
                                                        solution.<o:p></o:p></p>
                                                      <p
class="m-9080767634571049055gmail-m-3365566405396838075m8049699665263122126gmail-m-2140871165585084385m-5799122468671666334msolistparagraph"
style="margin-left:1.0in">
                                                        2)<span
                                                          style="font-size:7.0pt">     
                                                        </span>We could
                                                        write an IR-to-IR
                                                        pass to perform
                                                        IR-level
                                                        legalization.
                                                        This would
                                                        essentially
                                                        duplicate the
                                                        functionality of
LegalizeVectorType(),<br>
                                                        but we could make
                                                        it available
                                                        for other
                                                        similar things
                                                        that can’t use
                                                        ISD-level vector
                                                        type
                                                        legalization.
                                                        This looks
                                                        attractive
                                                        enough<br>
                                                        from that
                                                        perspective.<o:p></o:p></p>
                                                      <p
class="m-9080767634571049055gmail-m-3365566405396838075m8049699665263122126gmail-m-2140871165585084385m-5799122468671666334msolistparagraph"
style="margin-left:1.0in">
                                                        3)<span
                                                          style="font-size:7.0pt">     
                                                        </span>We have
                                                        implemented
                                                        something
                                                        similar to 2),
                                                        but the
                                                        legalization code
                                                        is specialized
                                                        for SVML.
                                                        This was much
                                                        quicker than<br>
                                                        trying to
                                                        generalize the
                                                        legalization
                                                        scheme, but I’d
                                                        imagine the
                                                        community won’t
                                                        like it.<o:p></o:p></p>
                                                      <p
class="m-9080767634571049055gmail-m-3365566405396838075m8049699665263122126gmail-m-2140871165585084385m-5799122468671666334msolistparagraph"
style="margin-left:1.0in">
                                                        4)<span
                                                          style="font-size:7.0pt">     
                                                        </span>The
                                                        vectorizer could
                                                        emit legalized
                                                        VECLIB calls.
                                                        Since it can
                                                        emit
                                                        instructions in
                                                        scalarized form,
                                                        adding legalized
                                                        call
                                                        functionality is
                                                        in some sense<br>
                                                        similar to that.
                                                        The vectorizer
                                                        can’t simply
                                                        choose a
                                                        type-legal
                                                        function name
                                                        with an illegal
                                                        vector type,
                                                        since
                                                        LegalizeVectorType()
                                                        will still<br>
                                                        end up using one
                                                        call instead of
                                                        two. <o:p></o:p></p>
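                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                        Ideas 2) through 4) all come down to the same transformation: replace one call on an illegal-width vector with several calls on legal-width pieces. The sketch below models that in Python, with lists standing in for vectors. Here sin4 is a hypothetical stand-in for a 4-element library routine (not the real __svml_sin4), and split_call is likewise an illustrative name.</p>
<pre>
```python
import math


def sin4(x):
    """Hypothetical stand-in for a 4-element vector sin routine."""
    if len(x) != 4:
        raise ValueError("sin4 expects exactly 4 elements")
    return [math.sin(v) for v in x]


def split_call(args, legal_vf, narrow_fn):
    """Apply narrow_fn to consecutive legal_vf-sized pieces of args.

    Models splitting one wide vector call into several legal-width calls
    and concatenating the results in element order.
    """
    if len(args) % legal_vf != 0:
        raise ValueError("vector length must be a multiple of the legal VF")
    result = []
    for start in range(0, len(args), legal_vf):
        piece = args[start:start + legal_vf]  # extract one legal-width part
        result.extend(narrow_fn(piece))       # one narrow call per part
    return result
```
</pre>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                        For the 8-element case in this thread, split_call(values, 4, sin4) makes two sin4 calls, the first on elements 0 to 3 and the second on elements 4 to 7.</p>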
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                         <o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                        Anything else?<o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                         <o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                        Also, doing any
                                                        of this requires
                                                        a reverse mapping
                                                        from the VECLIB
                                                        name to the scalar
                                                        function name.
                                                        What’s the
                                                        recommended way
                                                        to do so?<o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                        Can we use
                                                        TableGen to
                                                        create a reverse
                                                        map?<o:p></o:p></p>
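                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                        One possible shape for the reverse map, sketched in Python under the assumption that the vector library is described by a single table of (scalar name, vector name, VF) rows. The table lists only the two entry points named in this thread; narrower_variant and the two map builders are hypothetical names. However the table is produced (by hand or by a generator such as TableGen), both lookup directions can be derived from it.</p>
<pre>
```python
# (scalar name, vector name, VF) rows. Only the two SVML names mentioned
# in this thread are listed; a real table would be much larger.
VECLIB_TABLE = [
    ("sin", "__svml_sin4", 4),
    ("sin", "__svml_sin8", 8),
]


def build_reverse_map(table):
    """Map a vector function name back to (scalar name, VF)."""
    return {vector: (scalar, vf) for scalar, vector, vf in table}


def build_forward_map(table):
    """Map (scalar name, VF) to the vector function name."""
    return {(scalar, vf): vector for scalar, vector, vf in table}


def narrower_variant(vector_name, legal_vf, table=VECLIB_TABLE):
    """Find the legal_vf-wide sibling of vector_name, or None if absent."""
    entry = build_reverse_map(table).get(vector_name)
    if entry is None:
        return None  # not a known vector library function
    scalar, _wide_vf = entry
    return build_forward_map(table).get((scalar, legal_vf))
```
</pre>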
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                         <o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                        Your input is
                                                        greatly
                                                        appreciated. Is
                                                        there a real
                                                        need/desire for
                                                        2) outside of
                                                        VECLIB (or
                                                        outside of
                                                        SVML)?<o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                         <o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                        Thanks,<o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                        Hideki Saito<o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                        Intel
                                                        Corporation<o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                         <o:p></o:p></p>
                                                      <p
                                                        class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:.5in">
                                                         <o:p></o:p></p>
                                                    </div>
                                                  </div>
                                                </div>
                                              </div>
                                            </blockquote>
                                          </div>
                                          <p class="MsoNormal"
                                            style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
                                        </div>
                                      </div>
                                    </div>
                                    <p class="MsoNormal"
                                      style="mso-margin-top-alt:auto;margin-bottom:12.0pt"><br>
_______________________________________________<br>
                                      LLVM Developers mailing list<br>
                                      <a
                                        href="mailto:llvm-dev@lists.llvm.org"
                                        target="_blank"
                                        moz-do-not-send="true">llvm-dev@lists.llvm.org</a><br>
                                      <a
                                        href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev"
                                        target="_blank"
                                        moz-do-not-send="true">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><o:p></o:p></p>
                                  </blockquote>
                                </div>
                                <p class="MsoNormal"
                                  style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
                              </div>
                            </div>
                          </div>
                        </div>
                      </div>
                    </div>
                  </div>
                </blockquote>
              </div>
              <p class="MsoNormal"><o:p> </o:p></p>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    <pre class="moz-signature" cols="72">-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory</pre>
  </body>
</html>