<div dir="ltr"><div dir="ltr">On Tue, 21 Apr 2020 at 16:39, Kees Cook via cfe-dev <<a href="mailto:cfe-dev@lists.llvm.org">cfe-dev@lists.llvm.org</a>> wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On Tue, Apr 21, 2020 at 06:44:44PM -0400, Arthur O'Dwyer wrote:<br>

> On Tue, Apr 21, 2020 at 6:12 PM Richard Smith via cfe-dev <<a href="mailto:cfe-dev@lists.llvm.org" target="_blank">cfe-dev@lists.llvm.org</a>> wrote:<br>

> > What you're proposing is, without question, a language extension. Our<br>

> > policy on language extensions is documented here:<br>

> > <a href="http://clang.llvm.org/get_involved.html" rel="noreferrer" target="_blank">http://clang.llvm.org/get_involved.html</a><br>

> ><br>

> > Right now, this fails at point 4. We do not want to create or encourage<br>

> > the creation of language dialects and non-portable code, so the place to<br>

> > have this discussion is in the C and C++ committees. Both committees have<br>

> > processes for specifying optional features these days, and they might be<br>

> > amenable to using those processes to standardize the behavior you're asking<br>

> > for. (I mean, maybe not, but our policy requires that you at least try.)<br>

<br>

Well, I wasn't intending to re-discuss the presence of the feature. I was<br>

trying to point out that it does exist in Clang already, and it exists<br>

in other compilers too (and is being rapidly embraced by projects). So,<br>

language extension or not, this behavior is present in released C/C++<br>

binaries. Do you mean that _because_ it's present, it should be brought<br>

to the committee for standardization? If so, what's the right path there?<br></blockquote><div><br></div><div>The existence of the --long-ugly-flag-name-that-says-we'll-remove-the-feature is the way we currently try to avoid introducing a language dialect. If we remove that flag as is proposed, then we are effectively relitigating the question of whether to have the feature at all.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

> > However, there is a variant on what you're proposing that might fare<br>

> > better: instead of guaranteeing zero-initialization, we could guarantee<br>

> > that any observation of an uninitialized variable *either* produces<br>

> > zero or results in a trap. That is: it's still undefined to read from<br>

> > uninitialized variables -- we still do not guarantee what will happen if<br>

> > you do, and will warn on uninitialized uses and so on -- but we would bound<br>

> > the damage that can result from such accesses. You would get the security<br>

> > hardening benefits with the modest binary size impact. That approach would<br>

> > not introduce the risk of creating a language dialect (at least, not to the<br>

> > same extent), so our policy on avoiding language extensions would not apply.<br>

<br>

While I like the trap idea, I must say that I don't have a lot of<br>

confidence that it can be done in a way that would actually provide the<br>

same coverage benefits. How would such detection work? There are entire<br>

suites of tools (e.g. KMSan) that exist for this kind of thing and are not<br>

really suited for production use because it's so expensive to implement.<br></blockquote><div><br></div><div>The idea is that we would eventually emit exactly the same code as with the zero-init approach, except with some additional IR-level markers that would allow control flow paths that always read an uninitialized value to be optimized into a trap instruction. (For example: instead of initializing with a literal zero, you initialize with a call to an intrinsic function. If you can see that an instruction that's allowed to fault depends on a value produced by such a function, then you replace it with a trap. And at some stage of the optimization pipeline, you replace all such calls to that intrinsic function with literal zeroes.)</div><div><br></div><div>We might still produce zeroes instead of traps almost all the time. I think that's OK, so long as we trap enough that "you're using a variable that's uninitialized" receives an "oops" response instead of an "it's OK, the compiler will definitely zero it" response.</div><div><br></div><div>And indeed it might even be OK if the initial behavior is that we *always* zero-initialize (as Philip asked), so long as our documentation clearly says that we do not guarantee that the value will be zero (only that we guarantee that *if the program continues*, the value will be zero), and our intent is that we may still produce traps or otherwise abort the computation.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

> I don't understand the point you're making here.  You're saying that if<br>

> Clang provides reliable behavior, that's a language extension and therefore<br>

> impossible; but if Clang provides behavior that unpredictably switches<br>

> between only two possible alternatives (zero or trap), then that's not a<br>

> language extension anymore and therefore is possible?<br>

<br>

Er, is that what was meant? I mean, don't we kind of already have this<br>

state in Clang with "pattern" or "zero"? I'm just hoping to drop<br>

the -enable... flag.<br>

<br>

> I suspect many of the people quoted in the original quote-fest would not be<br>

> happy with "zero *or trap*" as the two behaviors.<br>

> What if you made it "zero *or one*"?  That is, whenever you access an<br>

> uninitialized variable, you are guaranteed to get either all-bits-zero or<br>

> else all-bits-zero-except-for-the-last-bit-which-is-1?  Would that<br>

> selection of two behaviors leave matters sufficiently unspecified so as to<br>

> dodge this "language extension" nonsense?<br>

<br>

Speaking with my kernel security flaw mitigation hat on, no, zero-init is<br>

the desired feature unless trap is equally or less expensive, in which<br>

case trap is great because then the kernel will warn and then set it<br>

to zero.<br>

<br>

-- <br>

Kees Cook<br>


</blockquote></div></div>
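<div><br></div><div>To make the marker idea from my reply above concrete, here is a rough sketch of the two steps as a toy pass over a made-up, array-based IR. Everything in it is hypothetical and for illustration only: the opcodes (OP_UNINIT_MARKER and friends), the struct layout, and the function names trap_on_uninit_use and lower_markers_to_zero are my own inventions, not LLVM's actual IR or API, and a real implementation would have to follow values through more than a single direct use.</div>

```c
/* Toy IR: each instruction defines one value, named by its index in the array.
   All opcodes and names here are hypothetical, not real LLVM constructs. */
enum opcode { OP_UNINIT_MARKER, OP_CONST, OP_ADD_IMM, OP_LOAD, OP_TRAP };

struct inst {
    enum opcode op;
    int operand;  /* index of the instruction producing the operand, or -1 */
    long imm;     /* payload for OP_CONST and OP_ADD_IMM */
};

/* Only loads are allowed to fault in this toy IR. */
static int may_fault(enum opcode op) { return op == OP_LOAD; }

/* Step 1: a may-fault instruction fed directly by the marker becomes a trap. */
void trap_on_uninit_use(struct inst *code, int n) {
    for (int i = 0; i < n; ++i) {
        int src = code[i].operand;
        if (may_fault(code[i].op) && src >= 0 &&
            code[src].op == OP_UNINIT_MARKER) {
            code[i].op = OP_TRAP;
            code[i].operand = -1;
            code[i].imm = 0;
        }
    }
}

/* Step 2: late in the pipeline, every surviving marker becomes a literal zero. */
void lower_markers_to_zero(struct inst *code, int n) {
    for (int i = 0; i < n; ++i) {
        if (code[i].op == OP_UNINIT_MARKER) {
            code[i].op = OP_CONST;
            code[i].imm = 0;
        }
    }
}
```

<div>Running step 1 and then step 2 over a program in which one marker feeds a load and another feeds an add-immediate turns the load into a trap and leaves the add reading a literal zero. That is the "zero or trap" outcome: code that cannot fault gets the same zeroes as the zero-init approach, and only the uses that were already allowed to fault are replaced.</div>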