<div dir="ltr"><div dir="ltr">On Tue, 21 Apr 2020 at 16:39, Kees Cook via cfe-dev <<a href="mailto:cfe-dev@lists.llvm.org">cfe-dev@lists.llvm.org</a>> wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On Tue, Apr 21, 2020 at 06:44:44PM -0400, Arthur O'Dwyer wrote:<br>
> On Tue, Apr 21, 2020 at 6:12 PM Richard Smith via cfe-dev <<a href="mailto:cfe-dev@lists.llvm.org" target="_blank">cfe-dev@lists.llvm.org</a>> wrote:<br>
> > What you're proposing is, without question, a language extension. Our<br>
> > policy on language extensions is documented here:<br>
> > <a href="http://clang.llvm.org/get_involved.html" rel="noreferrer" target="_blank">http://clang.llvm.org/get_involved.html</a><br>
> ><br>
> > Right now, this fails at point 4. We do not want to create or encourage<br>
> > the creation of language dialects and non-portable code, so the place to<br>
> > have this discussion is in the C and C++ committees. Both committees have<br>
> > processes for specifying optional features these days, and they might be<br>
> > amenable to using those processes to standardize the behavior you're asking<br>
> > for. (I mean, maybe not, but our policy requires that you at least try.)<br>
<br>
Well, I wasn't intending to re-discuss the presence of the feature. I was<br>
trying to point out that it does exist in Clang already, and it exists<br>
in other compilers too (and is being rapidly embraced by projects). So,<br>
language extension or not, this behavior is present in released C/C++<br>
binaries. Do you mean that _because_ it's present, it should be brought<br>
to the committee for standardization? If so, what's the right path there?<br></blockquote><div><br></div><div>The existence of the --long-ugly-flag-name-that-says-we'll-remove-the-feature is the way we currently try to avoid introducing a language dialect. If we remove that flag as is proposed, then we are effectively relitigating the question of whether to have the feature at all.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
> > However, there is a variant on what you're proposing that might fare<br>
> > better: instead of guaranteeing zero-initialization, we could guarantee<br>
> > that any observation of an uninitialized variable *either* produces<br>
> > zero or results in a trap. That is: it's still undefined to read from<br>
> > uninitialized variables -- we still do not guarantee what will happen if<br>
> > you do, and will warn on uninitialized uses and so on -- but we would bound<br>
> > the damage that can result from such accesses. You would get the security<br>
> > hardening benefits with the modest binary size impact. That approach would<br>
> > not introduce the risk of creating a language dialect (at least, not to the<br>
> > same extent), so our policy on avoiding language extensions would not apply.<br>
<br>
While I like the trap idea, I must say that I don't have a lot of<br>
confidence that it can be done in a way that would actually provide the<br>
same coverage benefits. How would such detection work? There are entire<br>
suites of tools (e.g. KMSan) that exist for this kind of thing and are not<br>
really suited for production use because it's so expensive to implement.<br></blockquote><div><br></div><div>The idea is that we would eventually emit exactly the same code as with the zero-init approach, except with some additional IR-level markers that would allow control flow paths that always read an uninitialized value to be optimized into a trap instruction. (For example: instead of initializing with a literal zero, you initialize with a call to an intrinsic function. If you can see that an instruction that's allowed to fault depends on a value produced by such a function, then you replace it with a trap. And at some stage of the optimization pipeline, you replace all such calls to that intrinsic function with literal zeroes.)</div><div><br></div><div>We might still produce zeroes instead of traps almost all the time. I think that's OK, so long as we trap enough that "you're using a variable that's uninitialized" receives an "oops" response instead of an "it's OK, the compiler will definitely zero it" response.</div><div><br></div><div>And indeed it might even be OK if the initial behavior is that we *always* zero-initialize (as Philip asked), so long as our documentation clearly says that we do not guarantee that the value will be zero (only that we guarantee that *if the program continues*, the value will be zero), and our intent is that we may still produce traps or otherwise abort the computation.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
> I don't understand the point you're making here. You're saying that if<br>
> Clang provides reliable behavior, that's a language extension and therefore<br>
> impossible; but if Clang provides behavior that unpredictably switches<br>
> between only two possible alternatives (zero or trap), then that's not a<br>
> language extension anymore and therefore is possible?<br>
<br>
Er, is that what was meant? I mean, don't we kind of already have this<br>
state in Clang with "pattern" or "zero"? I'm just hoping to drop<br>
the -enable... flag.<br>
<br>
> I suspect many of the people quoted in the original quote-fest would not be<br>
> happy with "zero *or trap*" as the two behaviors.<br>
> What if you made it "zero *or one*"? That is, whenever you access an<br>
> uninitialized variable, you are guaranteed to get either all-bits-zero or<br>
> else all-bits-zero-except-for-the-last-bit-which-is-1? Would that<br>
> selection of two behaviors leave matters sufficiently unspecified so as to<br>
> dodge this "language extension" nonsense?<br>
<br>
Speaking with my kernel security flaw mitigation hat on, no, zero-init is<br>
the desired feature unless trap is equally or less expensive, in which<br>
case trap is great because then the kernel will warn and then set it<br>
to zero.<br>
<br>
-- <br>
Kees Cook<br>
</blockquote></div></div>
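<div dir="ltr"><div>To make the intrinsic idea in my reply more concrete, here is a toy model of it in Python. It is only a sketch under stated assumptions: the tuple-based "IR", the <code>uninit_zero</code> intrinsic name, and the set of faulting operations are all hypothetical, and nothing here is real LLVM IR or an existing Clang feature. It handles straight-line code only, so "always reads an uninitialized value" reduces to a simple forward scan.</div>
<pre>
```python
# Toy model only: the instruction format, the "uninit_zero" intrinsic, and the
# set of faulting ops are hypothetical, not real LLVM IR or Clang behavior.

FAULTING_OPS = {"load", "store", "div"}  # ops assumed to be allowed to fault


def insert_traps(program):
    """Replace the first faulting op that depends on an uninit_zero value
    with a trap. Each instruction is a (dest, op, args) tuple."""
    tainted = set()  # names of values derived from uninit_zero
    out = []
    for dest, op, args in program:
        uses_uninit = any(a in tainted for a in args)
        if op in FAULTING_OPS and uses_uninit:
            out.append((None, "trap", ()))
            break  # nothing after an unconditional trap is reachable
        if dest is not None and (op == "uninit_zero" or uses_uninit):
            tainted.add(dest)
        out.append((dest, op, args))
    return out


def lower_intrinsics(program):
    """Later pipeline stage: every surviving uninit_zero becomes literal 0."""
    return [
        (dest, "const", (0,)) if op == "uninit_zero" else (dest, op, args)
        for dest, op, args in program
    ]


# Models: int x; return x + 1;   (no faulting use, so plain zero-init)
benign = [
    ("x", "uninit_zero", ()),
    ("y", "add", ("x", 1)),
    (None, "ret", ("y",)),
]

# Models: int *p; return *p;     (the load always depends on uninitialized p)
faulty = [
    ("p", "uninit_zero", ()),
    ("v", "load", ("p",)),
    (None, "ret", ("v",)),
]

print(lower_intrinsics(insert_traps(benign)))
print(lower_intrinsics(insert_traps(faulty)))
```
</pre>
<div>In this model the first program ends up identical to what zero-initialization would produce, while the second has its load replaced by a trap. That matches the intent: zeroes most of the time, with a trap where the compiler can see that a faulting instruction depends on an uninitialized value.</div></div>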