[cfe-dev] [llvm-dev] [RFC] Introducing a byte type to LLVM

Nicolai Hähnle via cfe-dev cfe-dev at lists.llvm.org
Thu Jun 10 22:47:14 PDT 2021


I have written a longer article that came about as a byproduct of thinking
through the problem space of this proposal:
https://nhaehnle.blogspot.com/2021/06/can-memcpy-be-implemented-in-llvm-ir.html

What happened is that I ended up questioning some really fundamental
things, like: can we even implement memcpy? :) The answer is a qualified
Yes, and I found the question to be a good framework for thinking about
the fundamentals of what is discussed here, so I published the article in
the hope that others find it useful.

tl;dr: This discussion is ultimately all about pointer provenance. There is
a gap in the expressiveness of LLVM IR when it comes to that, with
surprising consequences for memcpy (and similar operations). From an
aesthetics point of view, filling this gap has a lot of appeal, and the
"byte" proposal points in that direction. However, I have some issues with
the details of the proposal, and it is so intrusive that it needs to be
justified by more than just aesthetics.
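
To make the memcpy question concrete, here is a minimal sketch in C (not
LLVM IR) of a byte-wise copy applied to a struct that contains a pointer.
The names bytewise_copy, holder, and copy_and_read are hypothetical and
only serve the illustration. In C, copying an object representation
through unsigned char preserves the pointer; the open question in this
thread is what provenance, if any, the individual bytes carry when the
same loop is written with i8 loads and stores in IR.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: a plain byte-wise copy, not the libc memcpy. */
void *bytewise_copy(void *dst, const void *src, size_t n) {
    unsigned char *d = dst;
    const unsigned char *s = src;
    for (size_t i = 0; i < n; ++i)
        d[i] = s[i]; /* each byte may be a fragment of a pointer */
    return dst;
}

/* Hypothetical struct mixing a pointer with plain integer data. */
struct holder {
    int *p;
    int tag;
};

/* Copies a pointer-carrying struct byte by byte, then reads through
 * the copied pointer. */
int copy_and_read(void) {
    int x = 42;
    struct holder a = { &x, 7 };
    struct holder b;
    bytewise_copy(&b, &a, sizeof a);
    assert(b.p == a.p);  /* the pointer value survived the byte copy */
    return *b.p + b.tag; /* reads x through the copied pointer */
}
```

Calling copy_and_read() dereferences a pointer that only ever existed in
the destination as a sequence of individually copied bytes, which is the
situation the provenance discussion has to account for.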

The correctness issues in the problem space can be solved by much less
intrusive means. The justification for the more intrusive means would be
better alias analysis, but I don't think that case has been made
convincingly enough so far. We should also consider alternatives (though
I don't think there are any that are truly simple).

Apart from that, we need to be much more precise in our documentation of
pointer provenance in LangRef (e.g.: what does llvm.memcpy do, exactly --
the mentioned bug 37469 could technically be a bug in the loop idiom
recognizer!), and I like the idea of an `unrestrict(p)` instruction as a
simpler and more evocative spelling of `inttoptr(ptrtoint(p))`.
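
As a rough C analogy for the `inttoptr(ptrtoint(p))` round trip, the
sketch below casts a pointer to uintptr_t and back. The helper names
round_trip and write_through_round_trip are hypothetical, and C's
uintptr_t conversion is only an approximation of the IR instructions; it
says nothing about how an `unrestrict(p)` instruction would actually be
specified.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper mirroring inttoptr(ptrtoint(p)) in C terms. */
int *round_trip(int *p) {
    uintptr_t bits = (uintptr_t)p; /* analogous to ptrtoint */
    return (int *)bits;            /* analogous to inttoptr */
}

/* Writes through the round-tripped pointer and returns the original
 * variable, so the effect of the write is observable. */
int write_through_round_trip(void) {
    int x = 1;
    int *q = round_trip(&x);
    assert(q == &x); /* same address after the round trip */
    *q = 5;
    return x;
}
```

The address is unchanged by the round trip; what the thread is debating
is which provenance the resulting pointer has, and how conservative the
optimizer must be when it sees such a pair.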

I would also like to better understand how this interacts with the C99
"restrict" work that Jeroen pointed out. Overall, this is an important
discussion to have but I feel we're only at the very beginning.

tl;dr of the tl;dr: It's complicated :)

Cheers,
Nicolai

On Thu, Jun 10, 2021 at 1:15 AM Hal Finkel via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

>
> On 6/9/21 12:03, Chris Lattner wrote:
> > On Jun 6, 2021, at 8:52 AM, Hal Finkel <hal.finkel.llvm at gmail.com>
> wrote:
> >> I'll take this opportunity to point out that, at least historically,
> >> the reason why a desire to optimize around ptrtoint keeps resurfacing
> >> is because:
> >>
> >>  1. Common optimizations introduce them into code that did not
> >> otherwise have them (SROA, for example, see convertValue in SROA.cpp).
> >>
> >>  2. They're generated by some of the ABI code for argument passing
> >> (see clang/lib/CodeGen/TargetInfo.cpp).
> >>
> >>  3. They're present in certain performance-sensitive code idioms
> >> (see, for example, ADT/PointerIntPair.h).
> >>
> >> It seems to me that, if there's design work to do in this area, one
> >> should consider addressing these now-long-standing issues where we
> >> introduce ptrtoint by replacing this mechanism with some other one.
> >>
> > I completely agree.  These all have different solutions, I’d prefer to
> > tackle them one by one.
> >
> > -Chris
> >
>
> I agree; these three problems have different solutions. Also,
> let me add that I see three quasi-separable discussions here (accounting
> for past discussions on the same topic):
>
>   1. Do we have a consistency problem with how we treat pointers and
> their provenance information? The answer here is yes (see, e.g., the GVN
> examples from this thread).
>
>   2. Do we need to do more than be as conservative as possible around
> ptrtoint/inttoptr usages? This is relevant because trying to be clever
> here is often where inconsistencies around our pointer semantics are
> exposed, although it's not always the case that problems involve
> inttoptr. Addressing the points I raised above will lessen the
> motivation to be more aggressive here (although, in itself, that will
> not fix the semantic inconsistencies around pointers).
>
>   3. Does introducing a byte type help resolve the semantic issues
> around pointers? I don't yet understand why this might help.
>
> Thanks again,
>
> Hal
>


-- 
Learn how the world really is,
but never forget how it should be.