[cfe-dev] [EXTERNAL] Re: making -ftrivial-auto-var-init=zero a first-class option

Mon May 4 07:52:14 PDT 2020

> On Apr 30, 2020, at 02:16, JF Bastien via cfe-dev <cfe-dev at lists.llvm.org> wrote:
> 
> I’ve consulted with folks in security, compilers, and developers of security-sensitive codebases. A few points:
> 
> They like that automatic variable initialization provides a security mitigation for a significant percentage of observed zero-day exploits, at extremely low cost, with little chance of regression.
> They like that pattern initialization is a “smoking gun” when seen in a crash log. The size+performance costs have decreased in the last year, but they’d like to see it improve further.
> They like the lower size+performance cost of zero initialization as well as the safety it offers for e.g. size variables (because a size of 0xAA…AA is “infinite” which is bad for bounds checking, whereas zero isn’t). They don’t like that zero is often a valid pointer sentinel value (i.e. initializing pointers to zero can be used in unexpected data flow). They don’t like the silly long compiler flag.
> We’ve deployed automatic variable initialization in a significant amount of code. The vast majority of our deployment uses pattern initialization. A small number uses zero, of which you’ll note XNU <https://opensource.apple.com/source/xnu/xnu-6153.11.26/makedefs/MakeInc.def.auto.html>. We’ve only chosen zero in cases where size or performance were measured issues.
> Automatic variable initialization which might sometimes trap (as Richard suggests) is a proposal we’d like to see implemented, but we’d like to see it under its own separate flag, something like UBSan does with developers choosing trapping or logging modes. The main reason is that pure trapping with zero-init will make deployment significantly harder (it’s no longer a low-risk mitigation), and it’ll make updating our compiler significantly harder (because it might change where it generates traps over time). We also think that trapping behavior would need good tooling, for example through opt remarks, to help find and fix parts of the code where the compiler added traps. A logging mode would ease some of this burden. As well, we’re not convinced on the size+performance cost of either tapping nor logging, complicating the adoption of the mitigation.
> We don’t think the design space has been explored enough. We might want to differentiate initialization more than just “floating point is 0xFF…FF, everything else is 0xAA…AA”. For example:
> We could pattern-init pointers (maybe with compile-time random different patterns), and zero-init scalars. This has a good mix of performance and security upsides.
> We could key off heuristics to choose how to initialize, such as variable names, function names, or some machine learning model (and for those who know me: I’m not joking).
> We could use a variety of runtime pseudo-random sources to initialize values.
> We could add a new IR “undef” type, or different late-stage treatment of “undef”, to apply initialization after optimizations have found interesting facts about the program.
> 
> We’d like to see work continue on improving this mitigation, optimizations around it, and other similar mitigations.

On the optimization side, work on cross-basicblock DSE using MemorySSA is currently underway. The goal is to remove a major limitation of the current DSE implementation. A few initial patches landed already, but there’s still plenty of work to do. I’ve created https://bugs.llvm.org/show_bug.cgi?id=45792 to collect information, in case anyone is interested in working on this.

Cheers,
Florian
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20200504/20e0372e/attachment.html>