<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <p>Hi,<br>
    </p>
    <div class="moz-cite-prefix">On 12/19/18 11:07 PM, Adam Nemet via
      llvm-dev wrote:<br>
    </div>
    <blockquote type="cite"
      cite="mid:C5ABF5BF-B27C-46FF-B6F5-4CC30922C859@apple.com">
      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
      <br class="">
      <div><br class="">
        <blockquote type="cite" class="">
          <div class="">On Dec 19, 2018, at 1:31 PM, Stephen Canon <<a
              href="mailto:scanon@apple.com" class=""
              moz-do-not-send="true">scanon@apple.com</a>> wrote:</div>
          <br class="Apple-interchange-newline">
          <div class="">
            <meta http-equiv="Content-Type" content="text/html;
              charset=UTF-8" class="">
            <div style="word-wrap: break-word; -webkit-nbsp-mode: space;
              line-break: after-white-space;" class="">
              <div class="">
                <blockquote type="cite" class="">
                  <div class="">On Dec 19, 2018, at 11:09 AM, Stephen
                    Canon via llvm-dev <<a
                      href="mailto:llvm-dev@lists.llvm.org" class=""
                      moz-do-not-send="true">llvm-dev@lists.llvm.org</a>>
                    wrote:</div>
                  <br class="Apple-interchange-newline">
                  <div class="">
                    <meta http-equiv="Content-Type" content="text/html;
                      charset=UTF-8" class="">
                    <div style="word-wrap: break-word;
                      -webkit-nbsp-mode: space; line-break:
                      after-white-space;" class="">
                      <blockquote type="cite" class="">On Dec 18, 2018,
                        at 10:18 PM, Adam Nemet <<a
                          href="mailto:anemet@apple.com" class=""
                          moz-do-not-send="true">anemet@apple.com</a>>
                        wrote:<br class="">
                      </blockquote>
                      <div class="">
                        <blockquote type="cite" class=""><br class="">
                          <div class="">
                            <div dir="auto" class="">
                              <blockquote type="cite"
                                style="font-family: Helvetica;
                                font-size: 12px; font-style: normal;
                                font-variant-caps: normal; font-weight:
                                normal; letter-spacing: normal; orphans:
                                auto; text-align: start; text-indent:
                                0px; text-transform: none; white-space:
                                normal; widows: auto; word-spacing: 0px;
                                -webkit-text-size-adjust: auto;
                                -webkit-text-stroke-width: 0px;
                                text-decoration: none;" class="">
                                <div dir="ltr" class="">
                                  <div class="">
                                    <div class="">I don’t understand
                                      this.  What is the benefit of
                                      providing layout info to element
                                      wise operations?  This defeats the
                                      goal of having simple lowering and
                                      representation: you are encoding
                                      an ND vector form into the IR in a
                                      really ugly way, and this will
                                      cause a proliferation of
                                      intrinsics that are redundant with
                                      the core ops.</div>
                                  </div>
                                </div>
                              </blockquote>
                              <div style="caret-color: rgb(0, 0, 0);
                                font-family: Helvetica; font-size: 12px;
                                font-style: normal; font-variant-caps:
                                normal; font-weight: normal;
                                letter-spacing: normal; text-align:
                                start; text-indent: 0px; text-transform:
                                none; white-space: normal; word-spacing:
                                0px; -webkit-text-stroke-width: 0px;
                                text-decoration: none;" class=""><br
                                  class="">
                              </div>
                              <div style="caret-color: rgb(0, 0, 0);
                                font-family: Helvetica; font-size: 12px;
                                font-style: normal; font-variant-caps:
                                normal; font-weight: normal;
                                letter-spacing: normal; text-align:
                                start; text-indent: 0px; text-transform:
                                none; white-space: normal; word-spacing:
                                0px; -webkit-text-stroke-width: 0px;
                                text-decoration: none;" class="">We
                                need that information so that, for
                                example, we can lower an operation on a
                                3-element column into a 2-element vector
                                op and a scalar op.  This should be
                                beneficial for power consumption: for a
                                3x3 matrix with one padding element per
                                column, you’d operate on only 9 elements
                                rather than 12 (vector ops consume more
                                power than their scalar counterparts).</div>
                              <div style="caret-color: rgb(0, 0, 0);
                                font-family: Helvetica; font-size: 12px;
                                font-style: normal; font-variant-caps:
                                normal; font-weight: normal;
                                letter-spacing: normal; text-align:
                                start; text-indent: 0px; text-transform:
                                none; white-space: normal; word-spacing:
                                0px; -webkit-text-stroke-width: 0px;
                                text-decoration: none;" class=""><br
                                  class="">
                              </div>
                              <div style="caret-color: rgb(0, 0, 0);
                                font-family: Helvetica; font-size: 12px;
                                font-style: normal; font-variant-caps:
                                normal; font-weight: normal;
                                letter-spacing: normal; text-align:
                                start; text-indent: 0px; text-transform:
                                none; white-space: normal; word-spacing:
                                0px; -webkit-text-stroke-width: 0px;
                                text-decoration: none;" class="">That
                                said, we should be able to remove these
                                intrinsics in the long term.  Once we
                                have masking on the core ops in the IR,
                                we should be able to express the same
                                semantics without dedicated intrinsics.</div>
                            </div>
                          </div>
                        </blockquote>
                        <br class="">
                      </div>
                      <div class="">There may be some cases where this
                        holds (maybe with 5x5 or something), but most of
                        the time I would expect to get better power from
                        doing a four-element vector op with one wasted
                        lane than doing two arithmetic ops (plus
                        possibly extracts and inserts, depending on
                        physical layout details).</div>
                      <div class=""><br class="">
                      </div>
                      <div class="">Explicit masking or arranging for
                        zero in padding lanes seems like a better way
                        forward to me.</div>
                      <div class="">– Steve</div>
                    </div>
                  </div>
                </blockquote>
                <br class="">
              </div>
              <div class="">I spent some time chatting with Adam about
                this and have a better understanding of his concerns
                here. It seems to me that if having masking intrinsics
                is the long-term solution we want, we should do that now
                (for add and sub) rather than building arbitrary matrix
                layout info into intrinsics, since a mask has all the
                information that we actually need.</div>
            </div>
          </div>
        </blockquote>
        <div><br class="">
        </div>
        <div>I think that sounds like a reasonable compromise.  We
          already have masked load/store intrinsics so adding add and
          sub just follows that precedent.  If the decision is made to
          move masking to the core operations, the new intrinsics would
          just move as well.</div>
        <div><br class="">
        </div>
        <div>So an add->multiply for option B + masking intrinsics
          would look like this:</div>
        <div><br class="">
        </div>
        <div>
          <div style="margin: 0px; font-stretch: normal; font-size:
            10px; line-height: normal; font-family: Menlo;" class="">
            <pre style="margin: 0px; font-family: inherit;" class="">
  %a = load &lt;12 x float&gt;, &lt;12 x float&gt;* %A, align 16
  %b = load &lt;12 x float&gt;, &lt;12 x float&gt;* %B, align 16
  %c = load &lt;8 x float&gt;, &lt;8 x float&gt;* %C, align 16

  %add = call &lt;12 x float&gt; @llvm.masked.fadd(&lt;12 x float&gt; %a, &lt;12 x float&gt; %b,
                                             ; mask: where false, the element is taken from passthrough
                                             &lt;12 x i1&gt; &lt;i1 true, i1 true, i1 true, i1 false,
                                                        i1 true, i1 true, i1 true, i1 false,
                                                        i1 true, i1 true, i1 true, i1 false&gt;,
                                             ; passthrough:
                                             &lt;12 x float&gt; &lt;float undef, float undef, float undef, float undef,
                                                           float undef, float undef, float undef, float undef,
                                                           float undef, float undef, float undef, float undef&gt;)

  %mul = call &lt;8 x float&gt; @llvm.matrix.multiply(&lt;12 x float&gt; %add, &lt;8 x float&gt; %c,
                                                ;    3 x 3         3 x 2      column-major:
                                                i32 3, i32 3,  i32 3, i32 2,  i1 true)
  store &lt;8 x float&gt; %mul, &lt;8 x float&gt;* %MUL, align 16</pre>
          </div>
        </div>
      </div>
    </blockquote>
    We've started an RFC that proposes exactly this:
    <a class="moz-txt-link-freetext" href="https://reviews.llvm.org/D53613">https://reviews.llvm.org/D53613</a>
    <p>The RFC proposes intrinsics that take a mask and an explicit
      vector length argument. The explicit vector length is aimed at
      RISC-V V and NEC SX-Aurora, and it can be legalized away for
      targets that do not support it (e.g. AVX512). We also propose a
      couple of new attributes that should help with function call
      vectorization.<br>
    </p>
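    <p>To make the mask-plus-vector-length semantics concrete, here is
      a minimal Python model. It is not LLVM IR and not the RFC's
      actual interface: the helper names padded_column_mask and
      masked_fadd are hypothetical, None stands in for undef, and
      treating lanes at or beyond the explicit vector length as
      disabled is an assumption of this sketch.</p>
    <pre>
```python
from typing import List, Optional, Sequence


def padded_column_mask(rows: int, cols: int, stride: int) -> List[bool]:
    """Lane mask for a column-major matrix whose columns are padded to `stride` lanes.

    A lane is enabled when it holds a real matrix element and disabled
    when it is padding. (Hypothetical helper, for illustration only.)
    """
    if rows not in range(1, stride + 1):
        raise ValueError("rows must be between 1 and stride")
    return [lane % stride in range(rows) for lane in range(cols * stride)]


def masked_fadd(
    a: Sequence[float],
    b: Sequence[float],
    mask: Sequence[bool],
    passthrough: Sequence[Optional[float]],
    evl: Optional[int] = None,
) -> List[Optional[float]]:
    """Lane-wise a + b on enabled lanes; other lanes come from `passthrough`.

    Assumption of this model: lanes at or beyond `evl` (the explicit
    vector length) are treated like masked-off lanes.
    """
    n = len(a)
    if not (len(b) == n and len(mask) == n and len(passthrough) == n):
        raise ValueError("all operands must have the same number of lanes")
    active = n if evl is None else min(evl, n)
    return [
        a[i] + b[i] if mask[i] and i in range(active) else passthrough[i]
        for i in range(n)
    ]


# The 3x3 column-major matrix from this thread, one padding lane per column.
mask = padded_column_mask(rows=3, cols=3, stride=4)
a = [float(i) for i in range(12)]
b = [1.0] * 12
undef = [None] * 12  # None models the undef passthrough

full = masked_fadd(a, b, mask, undef)
first_two_columns = masked_fadd(a, b, mask, undef, evl=8)
```
</pre>
    <p>For the 3x3 example from this thread, the mask enables 9 of the
      12 lanes, so the padding information travels with the operation
      rather than with a matrix-layout argument.</p>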
    <p>I'll present this in Zurich at the upcoming LLVM Social on
      January 10th for people who are interested. I also talked a bit
      about this at the last DevMtg (from ~15:00 in
      <a class="moz-txt-link-freetext" href="https://youtu.be/BAZClv6nMxY">https://youtu.be/BAZClv6nMxY</a>).<br>
    </p>
    <p>- Simon</p>
    <pre class="moz-signature" cols="72">-- 

Simon Moll
Researcher / PhD Student

Compiler Design Lab (Prof. Hack)
Saarland University, Computer Science
Building E1.3, Room 4.31

Tel. +49 (0)681 302-57521 : <a class="moz-txt-link-abbreviated" href="mailto:moll@cs.uni-saarland.de">moll@cs.uni-saarland.de</a>
Fax. +49 (0)681 302-3065  : <a class="moz-txt-link-freetext" href="http://compilers.cs.uni-saarland.de/people/moll">http://compilers.cs.uni-saarland.de/people/moll</a></pre>
  </body>
</html>