<div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr">Hi Madhur,</div><div dir="ltr"><div><br></div><div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">I can argue that if b8 is proposed then why not b16 </blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">for half? Why not b32 for some other reason? </blockquote><div>We do propose to have b<N> for different Ns, as per proposal:</div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><span style="color:rgb(0,0,0);font-family:Arial;font-size:14.666666984558105px;white-space:pre-wrap">we denote the byte type as b<N>, where N is the number of bits.</span></blockquote><div>Maybe that was not explicit enough. But note that the only byte bit width produced</div><div>by the frontend is b8 (from char, unsigned char/std::byte). The other bit widths</div><div>come from LLVM optimizations, such as already mentioned memcpy:</div></div><div><br></div><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">%src8 = bitcast i8** %src to i8*</p>

<p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">%dst8 = bitcast i8** %dst to i8*</p>

<p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dst8, i8* %src8, i32 8, i1 false)<span style="font-family:Arial,Helvetica,sans-serif;font-size:small"> </span></p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue""><span style="font-family:Arial,Helvetica,sans-serif;font-size:small"><br></span></p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue""><span style="font-family:Arial,Helvetica,sans-serif;font-size:small">is transformed (by instcombine) into </span></p><div><br></div><div><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">%src64 = bitcast i8** %src to i64*</p>

<p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">%dst64 = bitcast i8** %dst to i64*</p>

<p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">%val = load i64, i64* %src64</p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">store i64 %val, i64* %dst64</p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue""><br></p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">What we propose is to have roughly</p><p style="color:rgb(0,0,0);margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">%src64 = bitcast i8** %src to <b>b64</b>*</p><p style="color:rgb(0,0,0);margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">%dst64 = bitcast i8** %dst to <b>b64</b>*</p><p style="color:rgb(0,0,0);margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">%val = load <b>b64</b>, <b>b64</b>* %src64</p><p style="color:rgb(0,0,0);margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">store <b>b64</b> %val, <b>b64</b>* %dst64</p><p style="color:rgb(0,0,0);margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue""><br></p><p style="color:rgb(0,0,0);margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">Having this just copies memory as-is, and cannot introduce implicit</p><p style="color:rgb(0,0,0);margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">ptr2int/int2ptr casts.</p><p style="color:rgb(0,0,0);margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue""><br></p><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">Given the problem, <b>I'd say we 
should think about a way to annotate<br></b><b>types with attributes, metadata, or flags which optimizations can use<br></b><b>to do a better job.</b> The attribute/metadata could carry the semantic <br>meaning for the type. Frontends can generate this "type attribute/metadata" <br>and optimizations can choose to use this extra information to do a<br>better job. It would be a hint though and not a mandate for optimizations.<br>This approach is very similar to attributes in LLVM IR and just like an IR<br>function can have attributes, a type can also possess attributes/metadata. <br>(Whether it should be an attribute or metadata is a choice but <br>that would not deviate from the purpose).</blockquote><div>I see your point, and it is definitely easier to fix everything with metadata/attributes.</div><div>I am concerned about this approach because it just postpones a bigger problem.</div><div>There is already a lot of metadata and many attributes, and the IR and optimizations have become</div><div>complicated enough. Instead of fixing a problem <i><b>while we can</b></i>, we just add attributes,</div><div>then more attributes... I believe that at some point this will just become legacy and</div><div>impossible to fix. </div><div><br></div><div>Thanks,</div><div>George</div><div><br></div><div></div></div></div></div></div></div></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sun, Jun 6, 2021 at 12:32 PM Madhur Amilkanthwar via cfe-dev <<a href="mailto:cfe-dev@lists.llvm.org">cfe-dev@lists.llvm.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr"><div>Hi George,</div><div><br></div><div>I don't think it is a scalable model to add a new type just to benefit <br></div><div>an analysis and draw specific conclusions from it. 
I can argue that</div><div>if b8 is proposed then why not b16 for half? Why not b32 for some</div><div>other reason? This won't stop just there and one can go beyond</div><div>and introduce types to benefit domain-specific languages.</div><div><br></div><div>Given the problem, <b>I'd say we should think about a way to annotate</b></div><div><b> types with attributes, metadata, or flags which optimizations can use</b></div><div><b>to do a better job.</b> The attribute/metadata could carry the semantic <br></div><div>meaning for the type. Frontends can generate this "type attribute/metadata" <br></div><div>and optimizations can choose to use this extra information to do a</div><div>better job. It would be a hint though and not a mandate for optimizations.</div><div>This approach is very similar to attributes in LLVM IR and just like an IR</div><div>function can have attributes, a type can also possess attributes/metadata. <br></div><div></div><div>(Whether it should be an attribute or metadata is a choice but <br></div><div>that would not deviate from the purpose).<br></div><div><br></div><div>This approach is far more adoptable and convincing than introducing</div><div> a whole new type, which would add massive complexity to the type system.</div><div><br></div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sun, Jun 6, 2021 at 2:32 PM James Courtier-Dutton via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">Also, the comment below is wrong. At this point, arr3 is equivalent to<br>

arr2, which is q.<br>

<br>

 // Now arr3 is equivalent to arr1, which is p.<br>

  int *r;<br>

  memcpy(&r, (unsigned char *)arr3, sizeof(r));<br>

  // Now r is p.<br>

  *p = 1;<br>

  *r = 10;<br>
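<div>For reference, a minimal self-contained sketch of the kind of pointer round trip the snippet above performs. Only <code>p</code> and <code>r</code> mirror the snippet; <code>round_trip_pointer</code>, <code>x</code>, and <code>arr</code> are hypothetical names chosen for illustration, and the surrounding code of the original example is not reproduced here.</div>

```c
#include <assert.h>
#include <string.h>

/* Hypothetical illustration: copy a pointer into an unsigned char
 * buffer and back, then store through both the original and the copy. */
int round_trip_pointer(void) {
    int x = 0;
    int *p = &x;

    /* Save the object representation of p as raw bytes. */
    unsigned char arr[sizeof p];
    memcpy(arr, &p, sizeof p);

    /* Rebuild a pointer from those bytes. */
    int *r;
    memcpy(&r, (unsigned char *)arr, sizeof r);

    /* r was rebuilt from p's bytes, so both name the same object. */
    assert(r == p);

    *p = 1;
    *r = 10;
    return x; /* the last store went through r */
}
```

<div>At the C level the value read back from <code>x</code> is the one stored through <code>r</code>; the thread is about whether the optimizer preserves that once the copy is lowered to integer loads and stores.</div>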

<br>

<br>

<br>

On Sun, 6 Jun 2021 at 08:54, James Courtier-Dutton<br>

<<a href="mailto:james.dutton@gmail.com" target="_blank">james.dutton@gmail.com</a>> wrote:<br>

><br>

> Hi,<br>

><br>

> I would also oppose adding a byte type, but mainly because the bug<br>

> report mentioned (<a href="https://bugs.llvm.org/show_bug.cgi?id=37469" rel="noreferrer" target="_blank">https://bugs.llvm.org/show_bug.cgi?id=37469</a>) is not<br>

> a bug at all.<br>

> The example in the bug report is just badly written C code.<br>

> Specifically:<br>

><br>

> int main() {<br>

>   int A[4], B[4];<br>

>   printf("%p %p\n", A, &B[4]);<br>

>   if ((uintptr_t)A == (uintptr_t)&B[4]) {<br>

>     store_10_to_p(A, &B[4]);<br>

>     printf("%d\n", A[0]);<br>

>   }<br>

>   return 0;<br>

> }<br>

><br>

> "int B[4];" allows values between 0 and 3 only, and referring to 4 in<br>

> &B[4] is undef, so in my view, it is correctly optimised out which is<br>

> why it disappears in -O3.<br>

><br>

> Kind Regards<br>

><br>

> James<br>

><br>

><br>

> On Sun, 6 Jun 2021 at 05:26, Chris Lattner via cfe-dev<br>

> <<a href="mailto:cfe-dev@lists.llvm.org" target="_blank">cfe-dev@lists.llvm.org</a>> wrote:<br>

> ><br>

> > On Jun 4, 2021, at 11:25 AM, John McCall via cfe-dev <<a href="mailto:cfe-dev@lists.llvm.org" target="_blank">cfe-dev@lists.llvm.org</a>> wrote:<br>

> > On 4 Jun 2021, at 11:24, George Mitenkov wrote:<br>

> ><br>

> > Hi all,<br>

> ><br>

> > Together with Nuno Lopes and Juneyoung Lee we propose to add a new byte<br>

> > type to LLVM to fix miscompilations due to load type punning. Please see<br>

> > the proposal below. It would be great to hear the<br>

> > feedback/comments/suggestions!<br>

> ><br>

> ><br>

> > Motivation<br>

> > ==========<br>

> ><br>

> > char and unsigned char are considered to be universal holders in C. They<br>

> > can access raw memory and are used to implement memcpy. i8 is LLVM’s<br>

> > counterpart but it does not have such semantics, which is also not<br>

> > desirable as it would disable many optimizations.<br>
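<div>As a rough illustration of the "universal holder" point, a byte-wise copy can be sketched in C through <code>unsigned char</code>. The names <code>copy_bytes</code>, <code>struct holder</code>, and <code>holder_copy_matches</code> are hypothetical and exist only for this sketch; it is not how any particular libc implements memcpy.</div>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy n bytes from src to dst one unsigned char
 * at a time. C permits reading and writing any object's representation
 * through unsigned char, whatever the object's declared type. */
void *copy_bytes(void *dst, const void *src, size_t n) {
    unsigned char *d = dst;
    const unsigned char *s = src;
    assert(d != NULL && s != NULL);
    for (size_t i = 0; i < n; ++i) {
        d[i] = s[i];
    }
    return dst;
}

/* Hypothetical payload mixing a pointer and an integer. */
struct holder {
    int *ptr;
    int tag;
};

/* Returns 1 if a byte-wise copy preserves both the pointer and the int. */
int holder_copy_matches(void) {
    int v = 7;
    struct holder a = { &v, 42 };
    struct holder b;
    copy_bytes(&b, &a, sizeof a);
    return b.ptr == &v && b.tag == 42 && *b.ptr == 7;
}
```

<div>The copy loop never learns whether the bytes it moves belong to a pointer or to an integer, which is the property the proposal says a plain integer type in the IR does not capture.</div>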

> ><br>

> > I don’t believe this is correct. LLVM does not have an innate<br>

> > concept of typed memory. The type of a global or local allocation<br>

> > is just a roundabout way of giving it a size and default alignment,<br>

> > and similarly the type of a load or store just determines the width<br>

> > and default alignment of the access. There are no restrictions on<br>

> > what types can be used to load or store from certain objects.<br>

> ><br>

> > C-style type aliasing restrictions are imposed using tbaa<br>

> > metadata, which are unrelated to the IR type of the access.<br>

> ><br>

> > I completely agree with John.  “i8” in LLVM doesn’t carry any implications about aliasing (in fact, LLVM pointers are going towards being typeless).  Any such implications arise at the accesses and are part of TBAA.<br>

> ><br>

> > I’m opposed to adding a byte type to LLVM, as such semantics-carrying types are entirely unprecedented and would add tremendous complexity to the entire system.<br>

> ><br>

> > -Chris<br>

> ><br>

> > _______________________________________________<br>

> > cfe-dev mailing list<br>

> > <a href="mailto:cfe-dev@lists.llvm.org" target="_blank">cfe-dev@lists.llvm.org</a><br>

> > <a href="https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev" rel="noreferrer" target="_blank">https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev</a><br>

_______________________________________________<br>

LLVM Developers mailing list<br>

<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>

<a href="https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br>

</blockquote></div><br clear="all"><br>-- <br><div dir="ltr"><div dir="ltr"><div><i style="font-size:12.8px">Disclaimer: Views, concerns, thoughts, questions, ideas expressed in this mail are of my own and my employer has no take in it. </i><br></div><div>Thank You.<br>Madhur D. Amilkanthwar<br><br></div></div></div></div>


</blockquote></div>