<html>

  <head>

    <meta content="text/html; charset=UTF-8" http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    <div class="moz-cite-prefix">On 1/31/14 5:23 PM, Nick Lewycky wrote:<br>

    </div>

    <blockquote

cite="mid:CADbEz-jg7-UdWK77G5+u5n4ninz-H1wFpZGyeqv-GWr1hK-DUg@mail.gmail.com"

      type="cite">

      <div dir="ltr">On 30 January 2014 09:55, Philip Reames <span

          dir="ltr">&lt;<a moz-do-not-send="true"

            href="mailto:listmail@philipreames.com" target="_blank">listmail@philipreames.com</a>&gt;</span>

        wrote:<br>

        <div class="gmail_extra">

          <div class="gmail_quote">

            <blockquote class="gmail_quote" style="margin:0px 0px 0px

0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

              <div class="im">On 1/29/14 3:40 PM, Nick Lewycky wrote:<br>

                <blockquote class="gmail_quote" style="margin:0px 0px

                  0px

0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">The

                  LLVM Module has an optional target triple and target

                  datalayout. Without them, an llvm::DataLayout can't be

                  constructed with meaningful data. The benefit to

                  making them optional is to permit optimization that

                  would work across all possible DataLayouts, then allow

                  us to commit to a particular one at a later point in

                  time, thereby performing more optimization in advance.<br>

                  <br>

                  This feature is not being used. Instead, every user of

                  LLVM IR in a portability system defines one or more

                  standardized datalayouts for their platform, and shims

                  to place calls with the outside world. The primary

                  reason for this is that independence from DataLayout

                  is not sufficient to achieve portability because it

                  doesn't also represent ABI lowering constraints. If

                  you have a system that attempts to use LLVM IR in a

                  portable fashion and does it without standardizing on

                  a datalayout, please share your experience.<br>

                </blockquote>

              </div>

              Nick, I don't have a current system in place, but I do

              want to put forward an alternate perspective.<br>

              <br>

              We've been looking at doing late insertion of safepoints

              for garbage collection.  One of the properties that we end

              up needing to preserve through all the optimizations which

              precede our custom rewriting phase is that the optimizer

              has not chosen to "hide" pointers from us by using

              ptrtoint and integer math tricks. Currently, we're simply

              running a verification pass before our rewrite, but I'm

              very interested long term in constructing ways to ensure a

              "gc safe" set of optimization passes.<br>

            </blockquote>

            <div><br>

            </div>

            <div>

              <div>As a general rule, passes need to support the whole of

                what the IR can support. Trying to operate on a subset

                of IR seems like a losing battle, unless you can show a

                mapping from one to the other (i.e., using code

                duplication to remove all unnatural loops from IR, or

                collapsing a function to having a single exit node).</div>

            </div>

            <div><br>

            </div>

            <div>What language were you planning to do this for? Does

              the language permit the user to convert pointers to

              integers and vice versa? If so, what do you do if the user

              program writes a pointer out to a file, reads it back in

              later, and uses it?</div>

          </div>

        </div>

      </div>

    </blockquote>

    Java - which does not permit arbitrary pointer manipulation.  (Well,

    without resorting to mechanisms like JNI and sun.misc.Unsafe.  Doing

    so would be explicitly undefined behavior though.)  We also use raw

    pointer manipulations in our implementation (which is eventually

    inlined), but this happens after the safepoint insertion rewrite.<br>

    <br>

    We strictly control the input IR.  As a result, I can ensure that

    the initial IR meets our subset requirements.  In practice, all of

    the opto passes appear to preserve these invariants (i.e. not

    introducing inttoptr), but we'd like to justify that a bit more.  <br>
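    <br>
    A minimal sketch of the kind of check described above, assuming the
    IR is available in textual form.  The helper name
    find_pointer_conversions is hypothetical and this is not the actual
    verification pass; a real pass would walk instructions and constant
    expressions through the LLVM C++ API rather than scan text.<br>
    <pre>
```python
import re

# Opcodes that convert between pointers and integers in textual LLVM IR.
CONVERSION_OPCODES = ("ptrtoint", "inttoptr")


def find_pointer_conversions(ir_text):
    """Return (line_number, opcode) pairs for each conversion found.

    Assumption: a ';' starts a comment, so text after it is ignored.
    This is a textual approximation, not a real IR parser.
    """
    pattern = re.compile(r"\b(" + "|".join(CONVERSION_OPCODES) + r")\b")
    hits = []
    for number, line in enumerate(ir_text.splitlines(), start=1):
        code = line.split(";", 1)[0]  # drop trailing IR comments
        for match in pattern.finditer(code):
            hits.append((number, match.group(1)))
    return hits
```
    </pre>
    An empty result means the scanned text contains no ptrtoint or
    inttoptr, which is the invariant the rewrite relies on; any hit
    points at a line where a pointer may have been hidden.<br>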

    <blockquote

cite="mid:CADbEz-jg7-UdWK77G5+u5n4ninz-H1wFpZGyeqv-GWr1hK-DUg@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote">

            <div><br>

            </div>

            <blockquote class="gmail_quote" style="margin:0px 0px 0px

0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">One

              of the ways I've been thinking about - but haven't

              actually implemented yet - is to deny the optimization

              passes information about pointer sizing.</blockquote>

            <div><br>

            </div>

            <div>Right, pointer size (address space size) will become

              known to all parts of the compiler. It's not even going to

              be just the optimizations, ConstantExpr::get is going to

              grow smarter because of this, as

              lib/Analysis/ConstantFolding.cpp merges into

              lib/IR/ConstantFold.cpp. That is one of the major benefits

              that's driving this. (All parts of the compiler will also

              know endian-ness, which means we can constant fold loads,

              too.)</div>

          </div>

        </div>

      </div>

    </blockquote>

    I would argue that all of the pieces you mentioned are performing

    optimizations.  :)  However, the exact semantics are unimportant for

    the overall discussion.  <br>

    <blockquote

cite="mid:CADbEz-jg7-UdWK77G5+u5n4ninz-H1wFpZGyeqv-GWr1hK-DUg@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote">

            <div><br>

            </div>

            <blockquote class="gmail_quote" style="margin:0px 0px 0px

0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">Under

              the assumption that an opto pass can't insert a ptrtoint

              cast without knowing a safe integer size to use, this

              seems like it would outlaw a class of optimizations we'd

              be broken by.<br>

            </blockquote>

            <div><br>

            </div>

            <div>Optimization passes generally prefer converting

              ptrtoint and inttoptr to GEPs whenever possible. </div>

          </div>

        </div>

      </div>

    </blockquote>

    This is good to hear and helps us.<br>

    <blockquote

cite="mid:CADbEz-jg7-UdWK77G5+u5n4ninz-H1wFpZGyeqv-GWr1hK-DUg@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote">

            <div>I expect that we'll end up with *fewer* ptr&lt;-&gt;int

              conversions with this change, because we'll know enough

              about the target to convert them into GEPs.</div>

          </div>

        </div>

      </div>

    </blockquote>

    Er, I'm confused by this.  Why would not knowing the size of a

    pointer cause a GEP to be converted to a ptr &lt;-&gt; int

    conversion?  <br>

    <br>

    Or do you mean that after the change conversions in the original

    input IR are more likely to be recognized?<br>

    <blockquote

cite="mid:CADbEz-jg7-UdWK77G5+u5n4ninz-H1wFpZGyeqv-GWr1hK-DUg@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote">

            <div><br>

            </div>

            <blockquote class="gmail_quote" style="margin:0px 0px 0px

0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">My

              understanding is that the only current way to do this

              would be to not specify a DataLayout.  (And hack a few

              places with built in assumptions.  Let's ignore that for

              the moment.)  With your proposed change, would there be a

              clean way to express something like this?<br>

            </blockquote>

            <div><br>

            </div>

            <div>I think your GC placement algorithm needs to handle

              inttoptr and ptrtoint, whichever way this discussion goes.

              Sorry. I'd be happy to hear others chime in -- I know I'm

              not an expert in this area or about GCs -- but I don't

              find this rationale compelling.</div>

          </div>

        </div>

      </div>

    </blockquote>

    The key assumption I didn't initially explain is that the initial IR

    couldn't contain conversions.  With that added, do you still see

    concerns?  I'm fairly sure I don't need to handle general ptr

    &lt;-&gt; int conversions.  If I'm wrong, I'd really like to know

    it. 

    <blockquote

cite="mid:CADbEz-jg7-UdWK77G5+u5n4ninz-H1wFpZGyeqv-GWr1hK-DUg@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote">

            <div><br>

            </div>

            <blockquote class="gmail_quote" style="margin:0px 0px 0px

0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">p.s.

              From reading the mailing list a while back, I suspect that

              the SPIR folks might have similar needs.  (i.e. hiding

              pointer sizes, etc.)  Pure speculation on my part though.<br>

            </blockquote>

            <div><br>

            </div>

            <div>The SPIR spec specifies two target datalayouts, one for

              32 bits and one for 64 bits.</div>

          </div>

        </div>

      </div>

    </blockquote>

    Good to know.  Thanks.<br>

    <blockquote

cite="mid:CADbEz-jg7-UdWK77G5+u5n4ninz-H1wFpZGyeqv-GWr1hK-DUg@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote">

            <div><br>

            </div>

            <div>Nick</div>

            <div><br>

            </div>

          </div>

        </div>

      </div>

    </blockquote>

    Philip<br>

  </body>

</html>