<!DOCTYPE html>

<html>

<head>

<meta http-equiv="Content-Type" content="text/xhtml; charset=utf-8">

</head>

<body>

<div style="font-family:sans-serif"><div style="white-space:normal">

<p dir="auto">On 22 Jun 2021, at 13:21, Ralf Jung wrote:</p>


</div>

<div style="white-space:normal"><blockquote style="border-left:2px solid #3983C4; color:#3983C4; margin:0 0 5px; padding-left:5px"><blockquote style="border-left:2px solid #3983C4; color:#7CBF0C; margin:0 0 5px; padding-left:5px; border-left-color:#7CBF0C"><p dir="auto">Your proposal generally relies on certain optimizations not applying<br>

to pointers because they mess up provenance as represented in<br>

transitive use-dependencies. If those optimizations can be applied<br>

to integers, you can lose use-dependencies in exactly the same way as<br>

you can with pointers. Not doing the inttoptr(ptrtoint(p)) -> p<br>

reduction doesn’t matter at all, because in either case that value<br>

has a use-dependency on p, whereas the actual problem is that there<br>

is no longer a use-dependency on some other value.</p>

</blockquote><p dir="auto">Note that "provenance" as we use it in this discussion is an *explicit operational artifact* -- it exists as a concrete piece of state in the Abstract Machine. That is very different from something that might just be used internally in some kind of analysis.<br>

<br>

There is no problem with "resetting" that provenance on a "inttoptr", and basically forgetting about where the int comes from. Note that this is a statement about an operation in the Abstract Machine, not merely a statement about some analysis: this is not "forgetting" as in "safely overapproximating the real set of possible behaviors", it is "forgetting" by *forcing* the provenance to be some kind of wildcard/default provenance. All analyses then have to correctly account for that.</p>

</blockquote></div>

<div style="white-space:normal">


<p dir="auto">But it’s not a truly wildcard provenance.  At the very least, it’s<br>

restricted to the set of provenances that have been exposed, and<br>

my understanding is that Juneyoung’s non-determinism rule attempts<br>

to readmit a use-dependency analysis and turn <code>ptrtoint</code> back into<br>

a simple scalar operation.</p>


</div>

<div style="white-space:normal"><blockquote style="border-left:2px solid #3983C4; color:#3983C4; margin:0 0 5px; padding-left:5px"><blockquote style="border-left:2px solid #3983C4; color:#7CBF0C; margin:0 0 5px; padding-left:5px; border-left-color:#7CBF0C"><p dir="auto">For example, you have compellingly argued that it’s problematic to<br>

do the reduction |a == b ? a : b| to |b| for pointer types. Suppose<br>

I instead do this optimization on integers where |a = ptrtoint A|.<br>

The result is now simply |b|. If I |inttoptr| that result and access<br>

the memory, there will be no record that that access may validly<br>

be to |A|. It does not help that the access may be represented<br>

as |inttoptr (ptrtoint B)| for some |B| rather than just directly<br>

to |B|, because there is no use-dependence on |A|. All there is<br>

is an apparently unrelated and unused |ptrtoint A|.</p>

</blockquote><p dir="auto">So that would be "ptrtoint A == ptrtoint B ? ptrtoint A : ptrtoint B" being replaced by "ptrtoint B"? I don't see any problem with that. Do you have a concrete example?</p>

</blockquote></div>

<div style="white-space:normal">


<p dir="auto">I think you can take any example where pointer-type restrictions<br>

would be necessary to protect against miscompilation and turn it<br>

into an example here by just inserting <code>inttoptr</code> and <code>ptrtoint</code><br>

appropriately.  Quick example:</p>


<pre style="border:thin solid gray; margin-left:15px; margin-right:15px; max-width:90vw; overflow-x:auto; padding:5px"><code>int A = 0x1;

int B = 0x2;

long a = (long) (&amp;A + 1);

long b = (long) &amp;B;

long result = (a == b ? a : b);

if (a == b)

  *(((int*) result) - 1) = 0x4;

else

  *((int*) result) = 0x8;

printf("%d %d\n", A, B);

</code></pre>


<p dir="auto">I submit that this program has unspecified but not undefined behavior,<br>

with printing “1 8” and “4 2” being the only two valid outcomes.<br>

But I think an optimizer which changes the fifth line to<br>

<code>long result = b;</code> without leaving any trace behind could easily<br>

compile this to print “1 2” because there would be nothing to<br>

prevent the initialization of <code>A</code> from being forwarded to the<br>

final load.</p>


<p dir="auto">You can prevent this by noting that the provenance of <code>A</code> has been<br>

exposed and allowing the <code>inttoptr</code> of <code>result</code> to alias <code>A</code>, but<br>

that seems inconsistent with treating <code>ptrtoint</code> as a simple scalar<br>

operation and allowing a use-analysis of a <code>ptrtoint</code> to restrict<br>

which <code>inttoptr</code> casts are allowed to recreate provenance for the<br>

<code>ptrtoint</code> operand.</p>
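
<p dir="auto">To make the exposure-based reading concrete, here is a minimal toy model in C. It is only a sketch under stated assumptions: it is not LLVM's semantics, every name in it (<code>heap</code>, <code>model_ptrtoint</code>, <code>model_store</code>, <code>run_folded_example</code>) is hypothetical, and the addresses are made up so that <code>A</code> sits directly before <code>B</code>.</p>

<pre style="border:thin solid gray; margin-left:15px; margin-right:15px; max-width:90vw; overflow-x:auto; padding:5px"><code>/* Toy model, not LLVM semantics: every name below is hypothetical. */
enum { OBJ_A = 0, OBJ_B = 1, NUM_OBJECTS = 2, INT_SIZE = 4 };

struct object {
  long base;    /* address of the first byte */
  long size;    /* size in bytes */
  int value;    /* the int stored in the object */
  int exposed;  /* nonzero once ptrtoint has been applied to it */
};

/* A and B are placed back to back, so one past the end of A is the start of B. */
struct object heap[NUM_OBJECTS] = {
  { 0x1000, INT_SIZE, 0x1, 0 },
  { 0x1004, INT_SIZE, 0x2, 0 },
};

/* ptrtoint: yields an address and, as a side effect, exposes the object. */
long model_ptrtoint(int obj, long offset) {
  heap[obj].exposed = 1;
  return heap[obj].base + offset;
}

/* Store through inttoptr(addr). Returns 1 if some exposed object contains
   addr, and 0 (standing in for undefined behavior) otherwise. */
int model_store(long addr, int value) {
  for (int i = 0; i != NUM_OBJECTS; ++i) {
    if (!heap[i].exposed) continue;
    if (heap[i].base > addr) continue;
    if (addr >= heap[i].base + heap[i].size) continue;
    heap[i].value = value;
    return 1;
  }
  return 0;
}

/* The example after the select has been folded to b: a is compared but
   never flows into result. */
int run_folded_example(void) {
  long a = model_ptrtoint(OBJ_A, INT_SIZE);  /* address one past A */
  long b = model_ptrtoint(OBJ_B, 0);         /* address of B */
  long result = b;
  if (a == b)
    return model_store(result - INT_SIZE, 0x4);
  return model_store(result, 0x8);
}
</code></pre>

<p dir="auto">In this model the store to <code>A</code> is accepted even though <code>result</code> has no use-dependency on <code>a</code>, because the exposure was recorded as state when <code>model_ptrtoint</code> ran. That side effect is the part that does not fit with treating <code>ptrtoint</code> as a simple scalar operation.</p>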


<p dir="auto">If you want to keep treating <code>ptrtoint</code> as a scalar operation and<br>

doing use-analyses on it, I think the most palatable option is to<br>

recognize whenever you’re cutting a use-dependency and<br>

conservatively record in IR that the original value has now been<br>

exposed.  So if you start with this:</p>


<pre style="border:thin solid gray; margin-left:15px; margin-right:15px; max-width:90vw; overflow-x:auto; padding:5px"><code>  %eq = icmp eq i32 %a, %b

  %result = select i1 %eq, i32 %a, i32 %b

</code></pre>


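<p dir="auto">You have to transform it into something like this, where <code>@llvm.expose.i32</code> is a hypothetical intrinsic and <code>%result = %b</code> is shorthand for replacing all uses of <code>%result</code> with <code>%b</code>:</p>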


<pre style="border:thin solid gray; margin-left:15px; margin-right:15px; max-width:90vw; overflow-x:auto; padding:5px"><code>  %result = %b

  call void @llvm.expose.i32(i32 %a)

</code></pre>


<p dir="auto">You should be able to remove these exposure events in a lot of<br>

situations, but conservatively they’ll have to be treated as<br>

escapes.</p>


<p dir="auto">Most optimizations never cut use-dependencies on opaque values<br>

like this and so won’t be affected.</p>


<p dir="auto">John.</p>

</div>

</div>

</body>

</html>