[cfe-dev] [EXTERNAL] Re: making -ftrivial-auto-var-init=zero a first-class option

Richard Smith via cfe-dev cfe-dev at lists.llvm.org
Fri May 1 11:42:57 PDT 2020


On Wed, 29 Apr 2020 at 18:17, JF Bastien via cfe-dev <cfe-dev at lists.llvm.org>
wrote:

> I’ve consulted with folks in security, compilers, and developers of
> security-sensitive codebases. A few points:
>
>
>    - They like that automatic variable initialization provides a security
>    mitigation for a significant percentage of observed zero-day exploits, at
>    extremely low cost, with little chance of regression.
>    - They like that pattern initialization is a “smoking gun” when seen
>    in a crash log. The size+performance costs have decreased in the last year,
>    but they’d like to see it improve further.
>    - They like the lower size+performance cost of zero initialization as
>    well as the safety it offers for e.g. size variables (because a size of
>    0xAA…AA is “infinite” which is bad for bounds checking, whereas zero
>    isn’t). They don’t like that zero is often a valid pointer sentinel value
>    (i.e. initializing pointers to zero can be used in unexpected data flow).
>    They don’t like the silly long compiler flag.
>    - We’ve deployed automatic variable initialization in a significant
>    amount of code. The vast majority of our deployment uses pattern
>    initialization. A small number uses zero, of which you’ll note XNU
>    <https://opensource.apple.com/source/xnu/xnu-6153.11.26/makedefs/MakeInc.def.auto.html>.
>    We’ve only chosen zero in cases where size or performance were measured
>    issues.
>    - Automatic variable initialization which might sometimes trap (as
>    Richard suggests) is a proposal we’d like to see implemented, but we’d like
>    to see it under its own separate flag, something like UBSan does with
>    developers choosing trapping or logging modes. The main reason is that pure
>    trapping with zero-init will make deployment significantly harder (it’s no
>    longer a low-risk mitigation), and it’ll make updating our compiler
>    significantly harder (because it might change where it generates traps over
>    time). We also think that trapping behavior would need good tooling, for
>    example through opt remarks, to help find and fix parts of the code where
>    the compiler added traps. A logging mode would ease some of this burden. As
>    well, we’re not convinced on the size+performance cost of either trapping
>    or logging, complicating the adoption of the mitigation.
>

I threw together an implementation here: https://reviews.llvm.org/D79249

It's pretty quick and dirty but it seems to do the right thing on at least
a small selection of test cases. I've not tried it on any nontrivial
codebases yet. (I'm not sure in what ways it doesn't work, but at least the
InstCombine approach seems likely to fight with other InstCombines.)

Just a few points of my own on the topic of trap-vs-init:

 * Once we agree that we want to harden against uninitialized uses, we're
out of the security space entirely. If we agree that both approaches
provide the same security mitigation, then whether we would prefer a crash
or a program that somehow keeps going after hitting undefined behavior
(absent a security bug) is a software engineering question, not a security
question. So if the only users you asked are security-focused ones, you
have sampling bias.
 * As I understand it, automatic initialization to zero or to
pattern-with-high-bits-set were chosen, in part, because they are very
likely to lead to clean crashes. Given that, it doesn't really make sense
to me to be concerned about the "trap" risk of the mitigation introducing
new crashes, since crashing on bad programs was already part of the goal.
 * The performance of the trapping mode is certainly unproven, but if the
trapping mode doesn't introduce new branches and the "potentially trap"
markers are removed early enough to not get in the way of other
optimizations, it's not obvious to me that there should be any systematic
effect.
 * The size of the trapping mode is likewise unproven, but if it only ever
replaces a branch destination with a trap, it seems plausible to me that it
could *reduce* code size compared to the zeroing mode.
 * In the presence of a bug, "crash early, crash often" is the right answer
in most software domains (empirically and subjectively) -- continuing
after your program's invariants are not met is not sound software
engineering practice. But whether we do continue in such cases is exactly
the difference between "zero" and "zero-or-maybe-trap". Programs that
really need to be robust against things going wrong, and recover in some
way, typically install a SIGSEGV and SIGABRT handler anyway. It's probably
better to jump to those handlers rather than continue with broken
invariants and risk hitting more problems (maybe security problems) later
on.
 * Per
http://blog.llvm.org/2011/05/what-every-c-programmer-should-know_21.html,
warning or producing opt remarks when the optimizer inserts a trap is not
likely to be all that useful. (I think opt remarks would be OK, but I also
think it's going to be hard to expose them in a way that helps the user
understand what happened well enough to know if they're false positives.)
Places where traps are added don't necessarily need to be fixed; the case
in question might be unreachable due to program invariants in a way the
optimizer can't determine, and in those cases, the remarks will just be
non-useful noise. Perhaps a project could keep track of newly-introduced
remarks and ask the developer to take a look at them though?

If we want a separate flag to control whether / how much we use such a
trapping mode, I think that could be reasonable, subject to having a good
answer to the language dialect concern I expressed previously (e.g., maybe
there's never a guarantee that we won't insert a trap, even though with the
flag on its lowest setting that is actually what happens).
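
To make the "zero" versus "zero-or-maybe-trap" distinction concrete, here is
a hand-written C sketch of the two outcomes for one small function. This is
illustrative only: it is not the output of D79249 or of any compiler, the
function names are made up, and __builtin_trap() (a GCC/Clang builtin)
stands in for whatever trap the optimizer would emit.

```c
/* Source as the programmer wrote it. `result` is never assigned when key is
   neither 1 nor 2, so the final read is a use of an uninitialized value.
   It is kept in a comment because executing that read is undefined behavior.

   int lookup(int key) {
       int result;
       if (key == 1) result = 10;
       else if (key == 2) result = 20;
       return result;
   }
*/

/* Roughly what "zero" mode amounts to: the variable silently starts at 0,
   and the buggy path returns 0 and keeps going. */
int lookup_zero(int key) {
    int result = 0;
    if (key == 1) result = 10;
    else if (key == 2) result = 20;
    return result;
}

/* Roughly what a "zero-or-maybe-trap" mode would be permitted to do: the
   path that would read the uninitialized value becomes a trap instead. */
int lookup_trap(int key) {
    if (key == 1) return 10;
    if (key == 2) return 20;
    __builtin_trap();
}
```

Both versions agree on every well-defined input. They differ only on the
buggy path, and in this sketch the trapping version replaces a store and a
join with a single trap, which is the shape behind the code-size argument
above.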

>
>    - We don’t think the design space has been explored enough. We might
>    want to differentiate initialization more than just “floating point is
>    0xFF…FF, everything else is 0xAA…AA”. For example:
>       - We could pattern-init pointers (maybe with compile-time random
>       different patterns), and zero-init scalars. This has a good mix of
>       performance and security upsides.
>       - We could key off heuristics to choose how to initialize, such as
>       variable names, function names, or some machine learning model (and for
>       those who know me: I’m not joking).
>       - We could use a variety of runtime pseudo-random sources to
>       initialize values.
>       - We could add a new IR “undef” type, or different late-stage
>       treatment of “undef”, to apply initialization after optimizations have
>       found interesting facts about the program.
>
>
> We’d like to see work continue on improving this mitigation, optimizations
> around it, and other similar mitigations.
>

+1. =)
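
As a rough illustration of why the quoted list treats sizes and pointers
differently, here is a small C simulation. It does not read any
uninitialized variable; instead it uses memset to produce the byte patterns
mentioned in the thread (0xAA…AA and zero). All function names are
hypothetical, and treating an all-zero bit pattern as the null pointer is an
assumption that holds on mainstream platforms but is not guaranteed by the C
standard.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: a size_t whose bytes are all `byte`, standing in for
   what an auto-initialized but never-assigned size variable would hold. */
size_t filled_size(unsigned char byte) {
    size_t n;
    memset(&n, byte, sizeof n);
    return n;
}

/* Same idea for a pointer-sized value, kept as an integer so that no
   invalid pointer is ever formed or dereferenced. */
uintptr_t filled_pointer_bits(unsigned char byte) {
    uintptr_t p;
    memset(&p, byte, sizeof p);
    return p;
}

/* A typical bounds check: nonzero if idx is within len. */
int index_allowed(size_t idx, size_t len) {
    return idx < len;
}

/* A typical sentinel check (assumes null is the all-zero bit pattern). */
int looks_like_null(uintptr_t bits) {
    return bits == 0;
}
```

With these helpers, index_allowed(1000, filled_size(0xAA)) is nonzero (the
pattern-filled size behaves as "infinite" and the check passes), while
index_allowed(1000, filled_size(0x00)) is zero. Conversely,
looks_like_null(filled_pointer_bits(0x00)) is nonzero, so a zero-filled
pointer is indistinguishable from a deliberate null sentinel, whereas the
0xAA-filled one is not. That asymmetry is what motivates the "pattern-init
pointers, zero-init scalars" mix suggested in the quoted list.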

> On Apr 22, 2020, at 1:55 PM, Kees Cook via cfe-dev <cfe-dev at lists.llvm.org>
> wrote:
>
> > On Wed, Apr 22, 2020 at 01:08:03PM -0700, Richard Smith wrote:
> >
> > > On Wed, 22 Apr 2020 at 10:49, Joe Bialek <jobialek at microsoft.com> wrote:
> > >
> > > > Also not clear to me what the OS is expected to do with this trap. We have
> > > > a number of information leak vulnerabilities where force initialization
> > > > kills the bug silently.
> > >
> > > Do you really mean "kills the bug"? I would certainly believe you have a
> > > number of information leak vulnerabilities where zero-init fixes the
> > > *vulnerability* (and we should definitely provide tools to harden programs
> > > against such vulnerabilities), but the program is still using an
> > > uninitialized value and still has a bug. The idea that this compiler change
> > > fixes or removes the bug is precisely the language dialect problem that I'm
> > > concerned about. Developers must still think that reading an uninitialized
> > > value is a bug (even if it's not a vulnerability any more) or they're
> > > writing a program in a language dialect where doing that is not a bug.
> >
> > Yeah, this is another "different communities mean different things"
> > terminology glitch. For the security folks, "bug" tends to stand in for
> > "security bug" or "security flaw". But yes, as you say, the "bug"
> > (misuse of the C language) is present, but the "security flaw" gets
> > downgraded to "just a bug" in the zero-init case. :)
> >
> > --
> > Kees Cook
> > _______________________________________________
> > cfe-dev mailing list
> > cfe-dev at lists.llvm.org
> > https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>