[cfe-dev] making -ftrivial-auto-var-init=zero a first-class option

Tue Apr 21 16:54:25 PDT 2020

On Tue, 21 Apr 2020 at 16:39, Kees Cook via cfe-dev <cfe-dev at lists.llvm.org>
wrote:

> On Tue, Apr 21, 2020 at 06:44:44PM -0400, Arthur O'Dwyer wrote:
> > On Tue, Apr 21, 2020 at 6:12 PM Richard Smith via cfe-dev
> > <cfe-dev at lists.llvm.org> wrote:
> > > What you're proposing is, without question, a language extension. Our
> > > policy on language extensions is documented here:
> > > http://clang.llvm.org/get_involved.html
> > >
> > > Right now, this fails at point 4. We do not want to create or encourage
> > > the creation of language dialects and non-portable code, so the place to
> > > have this discussion is in the C and C++ committees. Both committees have
> > > processes for specifying optional features these days, and they might be
> > > amenable to using those processes to standardize the behavior you're
> > > asking for. (I mean, maybe not, but our policy requires that you at least
> > > try.)
>
> Well, I wasn't intending to re-discuss the presence of the feature. I was
> trying to point out that it does exist in Clang already, and it exists
> in other compilers too (and is being rapidly embraced by projects). So,
> language extension or not, this behavior is present in released C/C++
> binaries. Do you mean that _because_ it's present, it should be brought
> to the committee for standardization? If so, what's the right path there?
>

The existence of the
--long-ugly-flag-name-that-says-we'll-remove-the-feature is the way we
currently try to avoid introducing a language dialect. If we remove that
flag as is proposed, then we are effectively relitigating the question of
whether to have the feature at all.

> > > However, there is a variant on what you're proposing that might fare
> > > better: instead of guaranteeing zero-initialization, we could guarantee
> > > that any observation of an uninitialized variable *either* produces
> > > zero or results in a trap. That is: it's still undefined to read from
> > > uninitialized variables -- we still do not guarantee what will happen if
> > > you do, and will warn on uninitialized uses and so on -- but we would
> > > bound the damage that can result from such accesses. You would get the
> > > security hardening benefits with the modest binary size impact. That
> > > approach would not introduce the risk of creating a language dialect (at
> > > least, not to the same extent), so our policy on avoiding language
> > > extensions would not apply.
>
> While I like the trap idea, I must say that I don't have a lot of
> confidence that it can be done in a way that would actually provide the
> same coverage benefits. How would such detection work? There are entire
> suites of tools (e.g. KMSan) that exist for this kind of thing and are not
> really suited for production use because it's so expensive to implement.
>

The idea is that we would eventually emit exactly the same code as with the
zero-init approach, except with some additional IR-level markers that would
allow control flow paths that always read an uninitialized value to be
optimized into a trap instruction. (For example: instead of initializing
with a literal zero, you initialize with a call to an intrinsic function.
If you can see that an instruction that's allowed to fault depends on a
value produced by such a function, then you replace it with a trap. And at
some stage of the optimization pipeline, you replace all such calls to that
intrinsic function with literal zeroes.)
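
To make that parenthetical concrete, here is a toy C++ model of the two-step
lowering. Everything in it is hypothetical: the Op::AutoInit marker, the Inst
layout, and lowerAutoInit are illustration-only names, not Clang or LLVM APIs,
and the "IR" is a single straight-line block rather than a real control-flow
graph. Treat it as a sketch of the idea under those assumptions, not as a
description of how an actual implementation would be structured.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy straight-line IR. Instruction i defines value %i.
enum class Op {
    AutoInit,  // hypothetical intrinsic: "uninitialized, but hardened" value
    Const,     // literal constant held in imm
    Add,       // %lhs + %rhs; never faults
    Load,      // load through address %lhs; allowed to fault
    Trap       // abort execution
};

struct Inst {
    Op op;
    int lhs;   // index of first operand, or -1 if unused
    int rhs;   // index of second operand, or -1 if unused
    long imm;  // payload for Const
};

using Block = std::vector<Inst>;

// Step 1: track which values derive from AutoInit, and replace any
//         may-fault instruction whose address depends on one with Trap.
// Step 2: replace every remaining AutoInit with a literal zero.
void lowerAutoInit(Block& block) {
    std::vector<bool> fromAutoInit(block.size(), false);

    for (std::size_t i = 0; i < block.size(); ++i) {
        Inst& inst = block[i];
        auto depends = [&](int v) -> bool {
            if (v < 0) return false;
            assert(static_cast<std::size_t>(v) < i &&
                   "operands must be defined earlier");
            return fromAutoInit[v];
        };

        switch (inst.op) {
        case Op::AutoInit:
            fromAutoInit[i] = true;
            break;
        case Op::Add:
            fromAutoInit[i] = depends(inst.lhs) || depends(inst.rhs);
            break;
        case Op::Load:
            if (depends(inst.lhs)) inst = Inst{Op::Trap, -1, -1, 0};
            break;
        case Op::Const:
        case Op::Trap:
            break;
        }
    }

    for (Inst& inst : block)
        if (inst.op == Op::AutoInit) inst = Inst{Op::Const, -1, -1, 0};
}
```

For a block such as "%0 = autoinit; %1 = 8; %2 = add %0, %1; %3 = load %2",
the load becomes a trap and %0 becomes a literal zero. If nothing that can
fault depends on %0, the result is indistinguishable from plain zero-init.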

We might still produce zeroes instead of traps almost all the time. I think
that's OK, so long as we trap enough that "you're using a variable that's
uninitialized" receives an "oops" response instead of an "it's OK, the
compiler will definitely zero it" response.

And indeed it might even be OK if the initial behavior is that we *always*
zero-initialize (as Philip asked), so long as our documentation clearly
says that we do not guarantee that the value will be zero (only that we
guarantee that *if the program continues*, the value will be zero), and our
intent is that we may still produce traps or otherwise abort the
computation.

> > I don't understand the point you're making here.  You're saying that if
> > Clang provides reliable behavior, that's a language extension and
> > therefore impossible; but if Clang provides behavior that unpredictably
> > switches between only two possible alternatives (zero or trap), then
> > that's not a language extension anymore and therefore is possible?
>
> Er, is that what was meant? I mean, don't we kind of have this state in
> Clang already with "pattern" or "zero"? I'm just hoping to drop the
> -enable... flag.
>
> > I suspect many of the people quoted in the original quote-fest would not
> > be happy with "zero *or trap*" as the two behaviors.
> > What if you made it "zero *or one*"?  That is, whenever you access an
> > uninitialized variable, you are guaranteed to get either all-bits-zero or
> > else all-bits-zero-except-for-the-last-bit-which-is-1?  Would that
> > selection of two behaviors leave matters sufficiently unspecified so as
> > to dodge this "language extension" nonsense?
>
> Speaking with my kernel security flaw mitigation hat on, no, zero-init is
> the desired feature unless trap is equally or less expensive, in which
> case trap is great because then the kernel will warn and then set it
> to zero.
>
> --
> Kees Cook