[llvm-dev] [cfe-dev] Demystifying the byte type

George Mitenkov via llvm-dev llvm-dev at lists.llvm.org
Sat Oct 16 04:46:15 PDT 2021


>
> But, there are CPUs out there that have special instructions for doing
> pointer manipulation.
>
I am not sure I see why this is relevant to the byte type?

I believe the issue is not specific to any particular architecture, because
it stems from the frontend and the IR optimizations. As I say in the post, it
is tied to the C semantics, under which unsigned char is a universal holder
type: the bytes of any object can be accessed through it. This means that any
pointer can be copied byte by byte without LLVM's alias analysis realising
it, which leads to incorrect IR. LLVM does not have such a type; it uses
untyped memory, with integers carrying the data (which, as described in the
post, creates inconsistencies and invalidates certain LLVM optimizations).
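
To make the byte-by-byte copy concrete, here is a minimal C sketch. The
helper names (copy_pointer_bytewise, read_through_copy) are hypothetical and
are not taken from the post; this is only an illustration of the kind of
source code I have in mind, assuming a frontend that lowers unsigned char
accesses to i8 loads and stores, as Clang does today.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copies one pointer object into another through
   unsigned char, one byte at a time. C allows this for any object type. */
static void copy_pointer_bytewise(int **dst, int *const *src) {
    const unsigned char *from = (const unsigned char *)src;
    unsigned char *to = (unsigned char *)dst;
    for (size_t i = 0; i < sizeof(int *); ++i) {
        to[i] = from[i]; /* each byte carries a fragment of a pointer */
    }
}

/* Hypothetical helper: rebuilds the pointer from its bytes and reads
   through the copy. The copy must compare equal to the original. */
static int read_through_copy(int *p) {
    int *q = NULL;
    copy_pointer_bytewise(&q, &p);
    assert(q == p);
    return *q;
}
```

At the source level the loop only moves unsigned char values, yet the result
is a usable pointer. In the IR those moves are integer loads and stores, so
the integers end up carrying pointer data.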

Thanks,
George

On Sat, Oct 16, 2021 at 1:37 PM George Mitenkov <georgemitenk0v at gmail.com>
wrote:

> Hi James,
>
>
>> If what you describe is correct, doesn't that imply that the
>> opaque pointer work is a fool's errand?
>> I.e. If memory needs to be typed, surely pointers to that memory
>> also need to be typed?
>
> Not at all; the issue described in the post is orthogonal to opaque
> pointers. The main problem that the post talks about is that LLVM does not
> have a type that represents a raw sequence of bits (i.e. memory).
> Currently, integers are used for that, so they sometimes carry pointers
> (as described in the first parts of the post). This creates a problem for
> optimizations on integers, since we do not know whether the values that we
> load are "pure" integers or pointers cast to integers (and different
> LLVM optimizations make different assumptions about that).
>
> So it is not about making memory "typed", but rather creating a universal
> type that can be used to load from memory something that carries raw data
> (integers and pointers) and preserves provenance.
>
> Hope this helps!
>
> Thanks,
> George
>
> On Sat, Oct 16, 2021 at 1:25 PM James Dutton <james.dutton at gmail.com>
> wrote:
>
>> Hi,
>>
>> The gist post seems to imply that one needs memory to be typed.
>> If what you describe is correct, doesn't that imply that the opaque
>> pointer work is a fool's errand?
>> I.e. If memory needs to be typed, surely pointers to that memory also
>> need to be typed?
>>
>> Kind Regards
>>
>> James
>>
>>
>> On Fri, 15 Oct 2021 at 19:41, George Mitenkov via cfe-dev
>> <cfe-dev at lists.llvm.org> wrote:
>> >
>> >
>> > Hi all,
>> >
>> > In May 2021, together with Nuno Lopes and Juneyoung Lee, we proposed to
>> > add a byte type in LLVM to fix load type punning issues. The initial RFC
>> > touched some subtle aspects of LLVM IR and its semantics, and sparked a
>> > lot of questions, concerns, and discussions.
>> >
>> > We decided to write a post that would summarise the thread and the
>> > complicated topic:
>> >
>> > https://gist.github.com/georgemitenkov/3def898b8845c2cc161bd216cbbdb81f
>> >
>> > We hope that our post clarifies initial concerns raised on the mailing
>> > list. As always, any questions, suggestions and advice are welcome!
>> >
>> > Thanks,
>> > George
>> > _______________________________________________
>> > cfe-dev mailing list
>> > cfe-dev at lists.llvm.org
>> > https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>
>