<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html;

      charset=windows-1252">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    <div class="moz-cite-prefix">On 4/26/2018 6:51 AM, Alexandros

      Lamprineas via llvm-dev wrote:<br>

    </div>

    <blockquote type="cite"

cite="mid:DB6PR0802MB2424200BAFEBF790516A1620E28E0@DB6PR0802MB2424.eurprd08.prod.outlook.com">


      <div id="divtagdefaultwrapper" style="font-size: 12pt; color:

        rgb(0, 0, 0); font-family:

        Calibri,Helvetica,sans-serif,"EmojiFont","Apple

        Color Emoji","Segoe UI

        Emoji",NotoColorEmoji,"Segoe UI

        Symbol","Android Emoji",EmojiSymbols;" dir="ltr">

        <p style="margin-top:0;margin-bottom:0">Hello,</p>

        <p style="margin-top:0;margin-bottom:0"><br>

        </p>

        <p style="margin-top:0;margin-bottom:0">There is a particular

          code sequence I would like to optimize at the IR level.</p>

        <p style="margin-top:0;margin-bottom:0">I'd like to turn an

          Arm/AArch64 table lookup intrinsic that takes a constant

          vector mask into a shufflevector instruction:</p>

        <p style="margin-top:0;margin-bottom:0">vtbl1(V,mask) ~>

          shufflevector(V,undef,mask)</p>

        <p style="margin-top:0;margin-bottom:0"><br>

        </p>

        <p style="margin-top:0;margin-bottom:0">The reason is that if

          the mask is {7,6,5,4,3,2,1,0}, the backend will generate a

          rev64 instruction instead of a table lookup.</p>

        <p style="margin-top:0;margin-bottom:0">If the mask comes from a

          vld1 of a global constant, I could fold the load to enable the

          instruction combining above.</p>

        <p style="margin-top:0;margin-bottom:0">My question is: does

          constant folding of the vld1 seem like a good thing to do in the

          general case, as a standalone transformation, or only when

          the result is used as a mask for a table lookup?<br>

        </p>

      </div>

    </blockquote>

    <br>

    Yes, constant-folding vld1 seems like a good idea.<br>

    <br>

    Actually, we should probably just lower the NEON vld1 intrinsics to

    an LLVM "load" (which would give us constant-folding for free), but

    it would take more work to make sure that change doesn't have any

    unexpected effects.<br>
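
    <br>

    For illustration, here is a rough sketch of the vtbl1 to shufflevector
    rewrite from the quoted message, written as a small Python model rather
    than actual LLVM code. The helper names (vtbl1, shufflevector, can_fold)
    are hypothetical, and the model assumes the usual behaviour that vtbl1
    yields 0 for an out-of-range index, which a shuffle with an undef second
    operand cannot reproduce. So the sketch only treats the fold as safe when
    every mask index selects a lane of V:<br>

    <pre>
```python
# Hypothetical reference models; names are illustrative, not LLVM APIs.

UNDEF = None  # stands in for a lane taken from an undef vector


def vtbl1(table, indices):
    """Model an 8-byte table lookup: an out-of-range index yields 0."""
    return [table[i] if i in range(len(table)) else 0 for i in indices]


def shufflevector(a, b, mask):
    """Model LLVM shufflevector: the mask indexes the concatenation a + b."""
    lanes = list(a) + list(b)
    return [lanes[i] for i in mask]


def can_fold(mask, width=8):
    """Assumed legality check: every mask index must pick a lane of V."""
    return all(i in range(width) for i in mask)


V = [10, 11, 12, 13, 14, 15, 16, 17]
undef = [UNDEF] * 8

# The byte-reversing mask from the question: both models agree.
rev_mask = [7, 6, 5, 4, 3, 2, 1, 0]
assert can_fold(rev_mask)
assert vtbl1(V, rev_mask) == shufflevector(V, undef, rev_mask)

# A mask with an out-of-range index: vtbl1 gives 0, the shuffle gives undef.
bad_mask = [8, 0, 1, 2, 3, 4, 5, 6]
assert not can_fold(bad_mask)
assert vtbl1(V, bad_mask) != shufflevector(V, undef, bad_mask)
```
</pre>

    If masks with out-of-range indices need to be handled too, one option
    might be to shuffle against a zero vector instead of undef, but that is
    not covered by this sketch.<br>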

    <br>

    -Eli

    <p><br>

    </p>

    <pre class="moz-signature" cols="72">-- 

Employee of Qualcomm Innovation Center, Inc.

Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project</pre>

  </body>

</html>