<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40"><head><meta http-equiv=Content-Type content="text/html; charset=us-ascii"><meta name=Generator content="Microsoft Word 15 (filtered medium)"><style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:Consolas;
panose-1:2 11 6 9 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:#0563C1;
text-decoration:underline;}
p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
{mso-style-priority:34;
margin-top:0cm;
margin-right:0cm;
margin-bottom:0cm;
margin-left:36.0pt;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
span.EmailStyle21
{mso-style-type:personal-reply;
font-family:"Calibri",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:612.0pt 792.0pt;
margin:72.0pt 72.0pt 72.0pt 72.0pt;}
div.WordSection1
{page:WordSection1;}
/* List Definitions */
@list l0
{mso-list-id:587278603;
mso-list-template-ids:-1857636340;}
@list l0:level1
{mso-level-start-at:3;
mso-level-tab-stop:36.0pt;
mso-level-number-position:left;
text-indent:-18.0pt;}
@list l1
{mso-list-id:1039668669;
mso-list-template-ids:1532918370;}
@list l2
{mso-list-id:1523741464;
mso-list-type:hybrid;
mso-list-template-ids:1110862066 67698705 67698713 67698715 67698703 67698713 67698715 67698703 67698713 67698715;}
@list l2:level1
{mso-level-text:"%1\)";
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;}
@list l2:level2
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;}
@list l2:level3
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
@list l2:level4
{mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;}
@list l2:level5
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;}
@list l2:level6
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
@list l2:level7
{mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;}
@list l2:level8
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;}
@list l2:level9
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
@list l3
{mso-list-id:1547064497;
mso-list-template-ids:-1915295888;}
@list l3:level1
{mso-level-start-at:2;
mso-level-tab-stop:36.0pt;
mso-level-number-position:left;
text-indent:-18.0pt;}
ol
{margin-bottom:0cm;}
ul
{margin-bottom:0cm;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]--></head><body lang=EN-US link="#0563C1" vlink="#954F72" style='word-wrap:break-word'><div class=WordSection1><p class=MsoNormal>As far as I know, there<span style='font-family:"Times New Roman",serif'>’</span>s no one working on this.<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>The plan is to restrict the <span style='font-family:"Times New Roman",serif'>“</span>inttoptr(ptrtoint(x)) -> x” optimization to the safe cases, then make alias analysis less conservative when dealing with inttoptr (instead of always giving up), and also make sure optimizations don’t produce inttoptr (this bit has improved *<b>a lot</b>* in the past year).<o:p></o:p></p><p class=MsoNormal>There isn’t much pressure to fix all this, as inttoptr is not very common (in most C/C++ programs, at least).<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>That said, this year we got many Google Summer of Code applications from students wanting to fix bugs, so this one came to mind. Let’s see how many slots we get.<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>Nuno<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal><o:p> </o:p></p><div><div style='border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0cm 0cm 0cm'><p class=MsoNormal><b>From:</b> Joseph Tremoulet <jotrem@microsoft.com> <br><b>Sent:</b> 16 April 2021 18:20<br><b>To:</b> Nuno Lopes <nunoplopes@sapo.pt><br><b>Cc:</b> llvm-dev@lists.llvm.org<br><b>Subject:</b> RE: [EXTERNAL] RE: [llvm-dev] inttoptr and noalias returns<o:p></o:p></p></div></div><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>Thank you, that’s super helpful.<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>Do we have plans/proposals for how to avoid this? 
I gather it involves stopping optimizations from blindly folding inttoptr(ptrtoint(x)) -> x, and you’ve mentioned making sure we avoid introducing inttoptr+ptrtoint unnecessarily; is the plan just those things? You also mentioned augmenting inttoptr with inbounds and that the folding is “correct in some cases”; does that mean we have plans (or at least a desire) to formulate refined rules for when the folding is possible, which would allow more optimization? The slide deck discusses separating the notions of logical and physical pointers; is anybody working on changing the code to model that?<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>For context, I’m working on a front-end for a language that I can’t change whose type system doesn’t really distinguish between native pointers and pointer-sized integers, so I can only do so much to avoid creating ptrtoint/inttoptr in the first place. But there are some constructs we can recognize as allocations, and I’m hoping to iteratively re-type arithmetic trees rooted at those as pointers/GEPs.<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>Thanks,<o:p></o:p></p><p class=MsoNormal>-Joseph<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal><o:p> </o:p></p><div><div style='border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0cm 0cm 0cm'><p class=MsoNormal><b>From:</b> Nuno Lopes <<a href="mailto:nunoplopes@sapo.pt">nunoplopes@sapo.pt</a>> <br><b>Sent:</b> Friday, April 16, 2021 12:48 PM<br><b>To:</b> Joseph Tremoulet <<a href="mailto:jotrem@microsoft.com">jotrem@microsoft.com</a>><br><b>Cc:</b> <a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a><br><b>Subject:</b> [EXTERNAL] RE: [llvm-dev] inttoptr and noalias returns<o:p></o:p></p></div></div><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>That<span style='font-family:"Times New 
Roman",serif'>’</span>s a very long story. Let me try to summarize why you can<span style='font-family:"Times New Roman",serif'>’</span>t do <span style='font-family:"Times New Roman",serif'>“</span>inttoptr(ptrtoint(x)) -> x” *<b>blindly</b>* (it’s correct in some cases).<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><ol style='margin-top:0cm' start=1 type=1><li class=MsoListParagraph style='margin-left:0cm;mso-list:l2 level1 lfo3'>Integers carry no provenance information and can be interchanged at will.<o:p></o:p></li></ol><p class=MsoNormal>This means that the following transformation is always correct when x and y are integers:<o:p></o:p></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:Consolas'>if (x == y)<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:Consolas'> f(x);<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:Consolas'>=><o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:Consolas'>if (x == y)<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:Consolas'> f(y);<o:p></o:p></span></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal><o:p> </o:p></p><ol style='margin-top:0cm' start=2 type=1><li class=MsoListParagraph style='margin-left:0cm;mso-list:l2 level1 lfo3'>Different pointers can have the same address. For example:<o:p></o:p></li></ol><p class=MsoNormal><span style='font-size:10.0pt;font-family:Consolas'>char p[n];<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:Consolas'>char q[m];<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:Consolas'>char r[3];<o:p></o:p></span></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>If the arrays are laid out contiguously, we may have (int)(p+n) == (int)q == (int)(r-m).<o:p></o:p></p><p class=MsoNormal>Even if we focus just on inbounds pointers (because we e.g. 
augmented inttoptr to have an inbounds tag), we can still have 2 pointers with the same address: p+n & q.<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal><o:p> </o:p></p><ol style='margin-top:0cm' start=3 type=1><li class=MsoListParagraph style='margin-left:0cm;mso-list:l2 level1 lfo3'>Pointers have provenance. You can<span style='font-family:"Times New Roman",serif'>’</span>t use p+n to change memory of q.<o:p></o:p></li></ol><p class=MsoNormal>p[n] = 42; // UB, to make the life of the alias analysis easier<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>If we put the three pieces together, we get that it<span style='font-family:"Times New Roman",serif'>’</span>s possible for the compiler to swap a ptrtoint of a dereferenceable pointer with something else and then if you blindly fold the ptrtoint/inttoptr chain, you get a wrong pointer. Something like:<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:Consolas'>int x = p + n;<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:Consolas'>int y = q;<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:Consolas'>if (x == y)<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:Consolas'> *(char*)y = 3;<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:Consolas'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:Consolas'>=> (GVN)<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:Consolas'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:Consolas'>int x = p + n;<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:Consolas'>int y = q;<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:Consolas'>if (x == 
y)<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:Consolas'> *(char*)x = 3;<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:Consolas'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:Consolas'>=> (invalid fold of inttoptr/ptrtoint chain)<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:Consolas'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:Consolas'>int x = p + n;<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:Consolas'>int y = q;<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:Consolas'>if (x == y)<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:Consolas'> *(p+n) = 3;<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:Consolas'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:Consolas'>=> (access OOB is UB)<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:Consolas'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:Consolas'>int x = p + n;<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:Consolas'>int y = q;<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:Consolas'>if (x == y)<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:Consolas'> UB;<o:p></o:p></span></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>I<span style='font-family:"Times New Roman",serif'>’</span>ve a few slides on LLVM<span style='font-family:"Times New Roman",serif'>’</span>s AA that may help: <a 
href="https://web.ist.utl.pt/nuno.lopes/pres/pointers-eurollvm18.pptx">https://web.ist.utl.pt/nuno.lopes/pres/pointers-eurollvm18.pptx</a><o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>Nuno<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal><o:p> </o:p></p><div><div style='border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0cm 0cm 0cm'><p class=MsoNormal><b>From:</b> Joseph Tremoulet <<a href="mailto:jotrem@microsoft.com">jotrem@microsoft.com</a>> <br><b>Sent:</b> 16 April 2021 15:48<br><b>To:</b> Nuno Lopes <<a href="mailto:nunoplopes@sapo.pt">nunoplopes@sapo.pt</a>><br><b>Cc:</b> <a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a><br><b>Subject:</b> RE: [EXTERNAL] RE: [llvm-dev] inttoptr and noalias returns<o:p></o:p></p></div></div><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>> otherwise relies on the incorrect transformation <span style='font-family:"Times New Roman",serif'>“</span>inttoptr(ptrtoint(x)) -> x<span style='font-family:"Times New Roman",serif'>”</span><o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>Could you point me to an example/explanation of why that transformation is incorrect? It’s not clear to me from the LangRef.<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>> A big issue with LLVM<span style='font-family:"Times New Roman",serif'>’</span>s static analysis is caching, since everything is done lazily. 
If you want to add something more expensive to BasicAA, you need to make sure that information is cached somehow to avoid recomputing it a thousand times. Compilation time is quite sensitive to the performance of BasicAA.<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>The IsCapturedCache in AAQueryInfo is pretty close to what I’m after, but I don’t really understand why the code in aliasCheck is using the weaker isEscapeSource as opposed to !isNonEscapingLocalObject.<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>> escape pointers just like ptrtoint does<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>Yeah, so if the rule for ptrtoint is simply that the source pointer escapes, then I’d think we could take advantage of the flip side of that and isEscapeSource could return true for inttoptr, without needing expensive analysis/caching. But I know this can be a subtle area, so I’m not sure that’s the rule. 
I see [1] that Ryan Taylor added a discussion of it to the agenda for the February AA conference call; I’m curious what the outcome of that was.<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>Thanks,<o:p></o:p></p><p class=MsoNormal>-Joseph<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>1 - <a href="https://lists.llvm.org/pipermail/llvm-dev/2021-February/148671.html">https://lists.llvm.org/pipermail/llvm-dev/2021-February/148671.html</a><o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal><o:p> </o:p></p><div><div style='border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0cm 0cm 0cm'><p class=MsoNormal><b>From:</b> Nuno Lopes <<a href="mailto:nunoplopes@sapo.pt">nunoplopes@sapo.pt</a>> <br><b>Sent:</b> Thursday, April 15, 2021 6:37 AM<br><b>To:</b> Joseph Tremoulet <<a href="mailto:jotrem@microsoft.com">jotrem@microsoft.com</a>><br><b>Cc:</b> <a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a><br><b>Subject:</b> [EXTERNAL] RE: [llvm-dev] inttoptr and noalias returns<o:p></o:p></p></div></div><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>You<span style='font-family:"Times New Roman",serif'>’</span>re right that LLVM is very conservative in handling inttoptr, and otherwise relies on the incorrect transformation <span style='font-family:"Times New Roman",serif'>“</span>inttoptr(ptrtoint(x)) -> x<span style='font-family:"Times New Roman",serif'>”</span> to get rid of inttoptr.<o:p></o:p></p><p class=MsoNormal>I agree the store should have been removed in your second example. 
I guess inttoptr is not frequently used, and even less after a bunch of fixes to prevent optimizers from creating new ones.<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>BasicAA is quite basic, but that<span style='font-family:"Times New Roman",serif'>’</span>s all LLVM has. The other alias analyses in git are either not useful in practice, unfinished or buggy. (I haven<span style='font-family:"Times New Roman",serif'>’</span>t looked into that dir in a couple of years, so things may have changed in the meantime).<o:p></o:p></p><p class=MsoNormal>A big issue with LLVM<span style='font-family:"Times New Roman",serif'>’</span>s static analysis is caching, since everything is done lazily. If you want to add something more expensive to BasicAA, you need to make sure that information is cached somehow to avoid recomputing it a thousand times. Compilation time is quite sensitive to the performance of BasicAA.<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>Although there<span style='font-family:"Times New Roman",serif'>’</span>s no definitive semantics for pointer comparisons yet (soonish I hope), LLVM<span style='font-family:"Times New Roman",serif'>’</span>s behavior implies that pointer comparisons indeed escape pointers just like ptrtoint does (except if the two pointers being compared are inbounds and point to the same object, and therefore the comparison is only around offsets and thus their address doesn<span style='font-family:"Times New Roman",serif'>’</span>t leak).<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>Nuno<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal><o:p> </o:p></p><div><div style='border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0cm 0cm 0cm'><p class=MsoNormal><b>From:</b> llvm-dev <<a href="mailto:llvm-dev-bounces@lists.llvm.org">llvm-dev-bounces@lists.llvm.org</a>> <b>On Behalf Of </b>Joseph Tremoulet via llvm-dev<br><b>Sent:</b> 02 April 2021 
19:26<br><b>To:</b> llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>><br><b>Subject:</b> Re: [llvm-dev] inttoptr and noalias returns<o:p></o:p></p></div></div><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>Stepping through this in the debugger, I see this code in BasicAliasAnalysis doing a check similar to the one I would have expected to prove NoAlias for this case, but it doesn’t, because (ISTM) it’s being pretty conservative:<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'> // If one pointer is the result of a call/invoke or load and the other is a<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'> // non-escaping local object within the same function, then we know the<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'> // object couldn't escape to a point where the call could return it.<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'> //<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'> // Note that if the pointers are in different functions, there are a<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'> // variety of complications. A call with a nocapture argument may still<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'> // temporary store the nocapture argument's value in a temporary memory<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'> // location if that memory location doesn't escape. 
Or it may pass a<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'> // nocapture value to other functions as long as they don't capture it.<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'> if (isEscapeSource(O1) &&<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'> isNonEscapingLocalObject(O2, &AAQI.IsCapturedCache))<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'> return NoAlias;<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'> if (isEscapeSource(O2) &&<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'> isNonEscapingLocalObject(O1, &AAQI.IsCapturedCache))<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'> return NoAlias;<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'> }</span><o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>and<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'>/// Returns true if the pointer is one which would have been considered an<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'>/// escape by isNonEscapingLocalObject.<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'>static bool isEscapeSource(const Value *V) {<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'> if (isa<CallBase>(V))<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'> return 
true;<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'><o:p> </o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'> if (isa<Argument>(V))<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'> return true;<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'><o:p> </o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'> // The load case works because isNonEscapingLocalObject considers all<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'> // stores to be escapes (it passes true for the StoreCaptures argument<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'> // to PointerMayBeCaptured).<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'> if (isa<LoadInst>(V))<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'> return true;<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'><o:p> </o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'> return false;<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'>}</span><o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>Since we have to look through all the uses of O1/O2 (including certain transitive ones) to prove isNonEscapingLocalObject, an expensive-but-more-precise analysis could just check if O2/O1 is in that set, IIUC. I get why BasicAliasAnalysis isn’t the right place to do that. 
Is there some more expensive alias analysis that I could opt into and get that sort of check?<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>Alternatively, following the logic that we can assume isEscapeSource for loads because we treat stores as escapes, is there room to assume isEscapeSource for inttoptrs because we treat ptrtoints, and things that let you subtly intify pointers such as certain compares, as escapes?<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>Thanks,<o:p></o:p></p><p class=MsoNormal>-Joseph<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal><o:p> </o:p></p><div><div style='border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0cm 0cm 0cm'><p class=MsoNormal><b>From:</b> llvm-dev <<a href="mailto:llvm-dev-bounces@lists.llvm.org">llvm-dev-bounces@lists.llvm.org</a>> <b>On Behalf Of </b>Joseph Tremoulet via llvm-dev<br><b>Sent:</b> Wednesday, March 31, 2021 2:09 PM<br><b>To:</b> llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>><br><b>Subject:</b> [EXTERNAL] [llvm-dev] inttoptr and noalias returns<o:p></o:p></p></div></div><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>Hi,<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>I’m a bit confused about the interaction between inttoptr and noalias, and would like to better understand our model.<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>I realize there’s a bunch of in-flight work around restrict modeling and that ptrtoint was on the agenda for last week’s AA call. I’m interested in understanding both the current state and the thinking/plans for the future. 
And I’m happy for pointers to anywhere this is already written down, I didn’t find it from skimming the AA call minutes or the mailing list archive, but I could easily have overlooked it, and haven’t really dug into the set of restrict patches (nor do I know where to get a list of those).<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>I also realize that with aliasing questions there can always be a gap between what the model says we can infer and how aggressive analyses and optimizations are about actually making use of those inferences. Again I’m interested in both answers (and happy for either).<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>In the LangRef section on pointer aliasing rules [1], I see<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal style='margin-left:36.0pt'>An integer constant other than zero or a pointer value returned from a function not defined within LLVM may be associated with address ranges allocated through mechanisms other than those provided by LLVM. Such ranges shall not overlap with any ranges of addresses allocated by mechanisms provided by LLVM.<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>And I’m curious what “mechanisms provided by LLVM” for allocation means. Alloca, presumably. Global variables? Certain intrinsics? 
Any function with a noalias return value?<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>In the LangRef description of the noalias attribute [2], I see<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal style='margin-left:36.0pt'>This indicates that memory locations accessed via pointer values based on the argument or return value are not also accessed, during the execution of the function, via pointer values not based on the argument or return value … On function return values, the noalias attribute indicates that the function acts like a system memory allocation function, returning a pointer to allocated storage disjoint from the storage for any other object accessible to the caller.<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>The phrase “the storage for any other object accessible to the caller” in the noalias description sounds like a broader category than the phrase “mechanisms provided by LLVM” from the pointer aliasing section, so I would expect that if the pointer returned from a call to a function with return attribute noalias does not escape, then loads/stores through it would not alias loads/stores through a pointer produced by inttoptr. Am I interpreting that correctly?<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>I wrote some snippets [3] to see what the optimizer would do. 
Each case has a store of value 86 via pointer %p that I’d expect dead store elimination to remove if we think it does not alias the subsequent load via pointer %q (because the load is immediately followed by another store to %p).<o:p></o:p></p><p class=MsoNormal>In each case, %q is the result of a call to a function whose return value is annotated noalias.<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>When %p is a pointer parameter, I indeed see the optimizer removing the dead store:<o:p></o:p></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'>define i8 @test1(i8* %p) {<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'><o:p> </o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'> %q = call i8* @allocate()<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'> store i8 86, i8* %p ; <-- this gets removed<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'> %result = load i8, i8* %q<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'> store i8 0, i8* %p<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'> ret i8 %result<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'>}<o:p></o:p></span></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>When %p is the result of inttoptr, I do not see the store being removed, and I’m wondering if this is because of a subtle aliasing rule or an intentional conservatism in the optimizer or just a blind spot in the analysis:<o:p></o:p></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'>define i8 @test2(i64 %p_as_int) {<o:p></o:p></span></p><p class=MsoNormal 
style='margin-left:36.0pt'><span style='font-family:"Courier New"'> %p = inttoptr i64 %p_as_int to i8*<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'><o:p> </o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'> %q = call i8* @allocate()<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'> store i8 86, i8* %p ; <-- this does not get removed<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'> %result = load i8, i8* %q<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'> store i8 0, i8* %p<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'> ret i8 %result<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'>}<o:p></o:p></span></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>When I outline the inttoptr into a separate function, I again see the optimizer remove the dead store, and again I’m wondering whether the difference between this and the previous case is an intentional subtle point or something else.<o:p></o:p></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'>define i8* @launder(i64 %int) noinline {<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'> %ptr = inttoptr i64 %int to i8*<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'> ret i8* %ptr<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'>}<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'><o:p> </o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span 
style='font-family:"Courier New"'>define i8 @test3(i64 %p_as_int) {<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'> %p = call i8* @launder(i64 %p_as_int)<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'><o:p> </o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'> %q = call i8* @allocate()<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'> store i8 86, i8* %p ; <-- this gets removed<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'> %result = load i8, i8* %q<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'> store i8 0, i8* %p<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'> ret i8 %result<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='font-family:"Courier New"'>}</span><o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>Happy for any insights you can share.<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>Thanks,<o:p></o:p></p><p class=MsoNormal>-Joseph<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>1 - <a href="https://llvm.org/docs/LangRef.html#pointeraliasing">https://llvm.org/docs/LangRef.html#pointeraliasing</a><o:p></o:p></p><p class=MsoNormal>2 - <a 
href="https://llvm.org/docs/LangRef.html#parameter-attributes">https://llvm.org/docs/LangRef.html#parameter-attributes</a><o:p></o:p></p><p class=MsoNormal>3 - <a href="https://godbolt.org/z/x8e41G33Y">https://godbolt.org/z/x8e41G33Y</a><o:p></o:p></p></div></body></html>