[llvm-dev] [RFC] Introducing a byte type to LLVM

David Blaikie via llvm-dev llvm-dev at lists.llvm.org
Tue Jun 15 14:38:15 PDT 2021

On Tue, Jun 15, 2021 at 12:16 PM Ralf Jung via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> Hi,
>
> > The semantics you seem to want are that LLVM’s integer types cannot carry
> > information from pointers. But I can cast a pointer to an integer in C and
> > vice-versa, and compilers have de facto defined the behavior of subsequent
> > operations like breaking the integer up (and then putting it back
> > together), adding numbers to it, and so on. So no, as a C compiler writer,
> > I do not have a choice; I will have to use a type that can validly carry
> > pointer information for integers in C.
>
> Integers demonstrably do not carry provenance; see
> <https://www.ralfj.de/blog/2020/12/14/provenance.html> for a detailed
> explanation of why.
>
> As a consequence of this, ptr-int-ptr roundtrips are lossy: some of the
> original provenance information is lost. This means that optimizing away
> such roundtrips is incorrect, and indeed doing so leads to miscompilations
> (https://bugs.llvm.org/show_bug.cgi?id=34548).
>
> The key difference between int and byte is that ptr-byte-ptr roundtrips are
> *lossless*: all the provenance is preserved. This enables some extra
> optimizations (such as removing these roundtrips -- which implicitly happens
> when a redundant-store-after-load is removed), but also rules out others
> (most notably, "x == y" does not mean x and y are equal in all respects;
> their provenance might still differ, so it is incorrect for GVN to replace
> one by the other).
>
> It's a classic tradeoff: we can *either* have lossless roundtrips

I think an important part of explaining the motivation for "byte" would be
an explanation/demonstration of what the cost of losing "lossless
roundtrips" would be.

> *or* "x == y" implies full equality of the abstract values. Having both
> together leads to contradictions, which manifest as miscompilations. "byte"
> and "int" represent the two possible choices here; therefore, by adding
> "byte", LLVM would close a gap in the expressive power of its IR.
>
> Kind regards,
> Ralf
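
To make the two kinds of roundtrip concrete, here is a minimal C sketch. It
is an illustration only and does not come from the RFC; the function names
(roundtrip_via_integer, roundtrip_via_bytes, demo) are hypothetical. At the
source level both roundtrips produce a pointer that compares equal to the
original. The question in this thread is how the IR should model the
intermediate value: as an integer, which carries only the address, or as a
byte, which would also carry provenance.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* ptr-int-ptr: the intermediate value is a plain integer address. */
int *roundtrip_via_integer(int *p) {
    uintptr_t addr = (uintptr_t)(void *)p;
    return (int *)(void *)addr;
}

/* ptr-byte-ptr: the pointer's representation is copied as raw bytes,
   the way a user-written memcpy would do it. */
int *roundtrip_via_bytes(int *p) {
    unsigned char raw[sizeof p];
    int *out;
    memcpy(raw, &p, sizeof p);     /* pointer -> bytes */
    memcpy(&out, raw, sizeof out); /* bytes -> pointer */
    return out;
}

/* Hypothetical driver: both results alias x, so both stores land in x. */
int demo(void) {
    int x = 0;
    int *a = roundtrip_via_integer(&x);
    int *b = roundtrip_via_bytes(&x);
    assert(a == &x && b == &x);
    *a += 40;
    *b += 2;
    return x;
}
```

The sketch only shows the source-level shape of the two roundtrips. Whether
an optimizer may delete either roundtrip, or substitute one equal-comparing
value for another, is exactly the tradeoff described in the quoted message.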