[llvm-dev] [cfe-dev] [RFC] Introducing a byte type to LLVM

Wed Jun 9 15:57:47 PDT 2021

On 6/9/21 15:06, Philip Reames via llvm-dev wrote:
>
>
> On 6/5/21 9:26 PM, Chris Lattner via llvm-dev wrote:
>> On Jun 4, 2021, at 11:25 AM, John McCall via cfe-dev
>> <cfe-dev at lists.llvm.org> wrote:
>>> On 4 Jun 2021, at 11:24, George Mitenkov wrote:
>>>
>>>     Hi all,
>>>
>>>     Together with Nuno Lopes and Juneyoung Lee we propose to add a
>>>     new byte type to LLVM to fix miscompilations due to load type
>>>     punning. Please see the proposal below. It would be great to
>>>     hear the feedback/comments/suggestions!
>>>
>>>
>>>     Motivation
>>>     ==========
>>>
>>>     char and unsigned char are considered to be universal holders in
>>>     C. They can access raw memory and are used to implement memcpy.
>>>     i8 is LLVM's counterpart, but it does not have such semantics,
>>>     and giving it such semantics would not be desirable, as it would
>>>     disable many optimizations.
>>>
>>> I don’t believe this is correct. LLVM does not have an innate
>>> concept of typed memory. The type of a global or local allocation
>>> is just a roundabout way of giving it a size and default alignment,
>>> and similarly the type of a load or store just determines the width
>>> and default alignment of the access. There are no restrictions on
>>> what types can be used to load or store from certain objects.
>>>
>>> C-style type aliasing restrictions are imposed using tbaa
>>> metadata, which is unrelated to the IR type of the access.
>>>
>> I completely agree with John. “i8” in LLVM doesn’t carry any
>> implications about aliasing (in fact, LLVM pointers are moving towards
>> being typeless). Any such restrictions apply at the accesses and are
>> part of TBAA.
>>
>> I’m opposed to adding a byte type to LLVM, as such semantics-carrying
>> types are entirely unprecedented and would add tremendous complexity
>> to the entire system.
>
> I agree with both John and Chris here.
>
> I've read through the discussion in this thread, and have yet to be
> convinced there is a problem, much less that this is a good solution.
> I'm open to being convinced of those two things, but the writeup in
> this thread doesn't do it. There are snippets of examples downthread
> which might be convincing, but there are objections raised around
> language semantics which I find very hard to follow. The
> fragmentation of the thread really doesn't help.
>
> I would suggest the OP take some of the motivating examples, write up 
> a web-page with the examples and their interpretation, then revisit 
> the topic.  In particular, I strongly suggest anticipating incorrect 
> interpretation/objections and explicitly addressing them.
>
> I'll also note that the use of the term capture w.r.t. a *load*
> downthread makes absolutely no sense to me. Stores capture, not loads.
>
> Philip
>

I agree.

Also, while there certainly is a problem combining GVN with our 
use-def-chain-based pointer-aliasing/provenance semantics, it is in no 
way clear to me how a byte type helps resolve that problem.
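
For concreteness, here is a minimal C sketch of the kind of pattern
usually cited for the GVN/provenance interaction. It is an illustration
only and is not taken from the RFC; the function names store_via_p and
store_via_q are hypothetical, and the second function is an assumption
about what an equality-based substitution would produce, not actual
compiler output.

```c
#include <assert.h>
#include <stddef.h>

/* Original form: inside the guard, the store goes through p. */
int store_via_p(int *p, int *q, int v) {
    assert(p != NULL && q != NULL);
    if (p == q) {
        *p = v;          /* access is "based on" p */
        return 1;
    }
    return 0;
}

/* Hypothetical result of replacing p with q because the guard
 * established p == q. Same address, but the access is now
 * "based on" q as far as a use-def-chain analysis is concerned. */
int store_via_q(int *p, int *q, int v) {
    assert(p != NULL && q != NULL);
    if (p == q) {
        *q = v;          /* access is "based on" q */
        return 1;
    }
    return 0;
}
```

When p and q point into the same object, the two functions behave
identically. The case that causes trouble is p being one past the end
of one object while q points to the start of an adjacent one: the
addresses may compare equal, yet an analysis that follows use-def
chains treats the two pointers as based on different objects, so the
substitution changes what that analysis is entitled to conclude.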

  -Hal
