<html>
  <head>
    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <br>
    <br>
    <div class="moz-cite-prefix">On 06/26/2015 11:07 AM, David Majnemer
      wrote:<br>
    </div>
    <blockquote
cite="mid:CAL7bZ_c-6vmT9AUTT=5Vv3rr6oWm3Ggz-TjcgNBup=r=H8MqYA@mail.gmail.com"
      type="cite">
      <div dir="ltr"><br>
        <div class="gmail_extra"><br>
          <div class="gmail_quote">On Fri, Jun 26, 2015 at 9:38 AM,
            Philip Reames <span dir="ltr">&lt;<a moz-do-not-send="true"
                href="mailto:listmail@philipreames.com" target="_blank">listmail@philipreames.com</a>&gt;</span>
            wrote:<br>
            <blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
              <div bgcolor="#FFFFFF" text="#000000"><span class="">
                  <div>On 06/26/2015 08:42 AM, David Majnemer wrote:<br>
                  </div>
                  <blockquote type="cite">
                    <div dir="ltr"><br>
                      <div class="gmail_extra"><br>
                        <div class="gmail_quote">On Fri, Jun 26, 2015 at
                          7:00 AM, Paweł Bylica <span dir="ltr">&lt;<a
                              moz-do-not-send="true"
                              href="mailto:chfast@gmail.com"
                              target="_blank">chfast@gmail.com</a>&gt;</span>
                          wrote:<br>
                          <blockquote class="gmail_quote"
                            style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
                            <div dir="ltr">Hi,
                              <div><br>
                              </div>
                              <div>Let's have a simple program:</div>
                              <div>
                                <div>define i32 @main(i32 %n, i64 %idx)
                                  {</div>
                                <div>  %idxSafe = trunc i64 %idx to i5</div>
                                <div>  %r = extractelement &lt;4 x
                                  i32&gt; &lt;i32 -1, i32 -1, i32 -1,
                                  i32 -1&gt;, i64 %idx</div>
                                <div>  ret i32 %r</div>
                                <div>}</div>
                              </div>
                              <div><br>
                              </div>
                              <div>The assembly of that would be:</div>
                              <div>
                                <div><span style="white-space:pre-wrap">
                                  </span>pcmpeqd<span
                                    style="white-space:pre-wrap"> </span>%xmm0,
                                  %xmm0</div>
                                <div><span style="white-space:pre-wrap">
                                  </span>movdqa<span
                                    style="white-space:pre-wrap"> </span>%xmm0,

                                  -24(%rsp)</div>
                                <div><span style="white-space:pre-wrap">
                                  </span>movl<span
                                    style="white-space:pre-wrap"> </span>-24(%rsp,%rsi,4),

                                  %eax</div>
                                <div><span style="white-space:pre-wrap">
                                  </span>retq</div>
                              </div>
                              <div><br>
                              </div>
                              <div>The language reference states that
                                the extractelement instruction produces
                                an undefined value when the index
                                argument is invalid (our case). But the
                                implementation simply dumps the vector
                                to the stack memory, calculates the
                                memory offset out of the index value and
                                tries to access the memory. That causes
                                the crash.</div>
                              <div><br>
                              </div>
                              <div>The workaround is to trunc the index
                                value before extractelement (see
                                %idxSafe). But what should be the
                                ultimate solution?</div>
                            </div>
                          </blockquote>
                          <div><br>
                          </div>
                          <div>We could fix this by specifying that an
                            out-of-bounds access on an extractelement
                            leads to full-on undefined behavior; there is
                            no need to force everyone to eat the cost of a mask.<br>
                          </div>
                        </div>
                      </div>
                    </div>
                  </blockquote>
                </span> This seems like the appropriate decision to me. 
                It's closely in line with existing practice and
                assumptions.  <br>
              </div>
            </blockquote>
            <div><br>
            </div>
            <div>The only problem I can see with specifying it this
              way is that extractelement instructions could then no
              longer be speculatively executed, whereas
              isSafeToSpeculativelyExecute currently believes it is safe
              to do so.  I can see why speculating this instruction
              might be good. Perhaps we should emit a mask...<br>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    Hm, yuck.  Hadn't thought about that one.  <br>
    <br>
    One option would be to let extractelements with provably in-bounds
    indices be speculated, but not others.  <br>
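    <br>
    As a rough sketch of that rule, here is a small Python model (not
    LLVM's actual C++; <code>is_safe_to_speculate_extract</code> and its
    parameters are hypothetical names). It assumes the only indices we
    can prove in bounds are compile-time constants, with
    <code>None</code> standing for a non-constant index.<br>
    <pre>
```python
def is_safe_to_speculate_extract(num_elements, const_index):
    """Model of the proposed rule for one extractelement.

    num_elements: lane count of the vector operand.
    const_index:  the index if it is a compile-time constant, else None.
    """
    if const_index is None:
        # Unknown index: it might be out of bounds, so do not hoist it.
        return False
    # A known constant is safe only when it names a real lane.
    return const_index in range(num_elements)
```
    </pre>
    Under this model a constant index of 3 into a 4-element vector may
    be speculated, while an index of 4 or a variable index may not.<br>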
    <br>
    Another option might be to have a mask emitted by the code that is
    speculating it.<br>
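    <br>
    To make the masking idea concrete, a minimal Python model
    (<code>masked_extract</code> is a hypothetical name, and this
    assumes the element count is a power of two so that a single AND
    suffices):<br>
    <pre>
```python
def masked_extract(vector, idx):
    """Model of an extractelement whose index is masked before use."""
    n = len(vector)
    # Assumption: n is a power of two, so n - 1 is an all-ones bit mask.
    if n == 0 or (n & (n - 1)) != 0:
        raise ValueError("element count must be a power of two")
    # The mask forces the index into 0..n-1, so the access is in bounds.
    return vector[idx & (n - 1)]
```
    </pre>
    An out-of-range index then reads some arbitrary but valid lane,
    which should be acceptable on a speculated path where the result
    goes unused whenever the original instruction would not have run.<br>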
    <br>
    I'm not sure how bad either scheme would actually be in practice. 
    Almost all of the extractelements I see in optimized IR have
    constant indices.  <br>
    <br>
    Philip<br>
  </body>
</html>