[cfe-dev] [RFC] automatic variable initialization

Wed Nov 28 15:09:55 PST 2018

On Wed, Nov 28, 2018 at 2:59 PM JF Bastien <jfbastien at apple.com> wrote:

>
> On Nov 28, 2018, at 2:51 PM, David Blaikie <dblaikie at gmail.com> wrote:
>
>
>
> On Wed, Nov 28, 2018 at 2:32 PM JF Bastien <jfbastien at apple.com> wrote:
>
>>
>> On Nov 28, 2018, at 2:28 PM, Kostya Serebryany <kcc at google.com> wrote:
>>
>>
>>
>> On Wed, Nov 28, 2018 at 2:26 PM David Blaikie <dblaikie at gmail.com> wrote:
>>
>>>
>>>
>>> On Wed, Nov 28, 2018 at 2:18 PM JF Bastien via cfe-dev <
>>> cfe-dev at lists.llvm.org> wrote:
>>>
>>>>
>>>>
>>>> On Nov 28, 2018, at 11:08 AM, Kostya Serebryany via cfe-dev <
>>>> cfe-dev at lists.llvm.org> wrote:
>>>>
>>>>
>>>>
>>>> On Tue, Nov 27, 2018 at 6:12 PM Richard Smith <richard at metafoo.co.uk>
>>>> wrote:
>>>>
>>>>> On Tue, 27 Nov 2018 at 11:52, Kostya Serebryany via cfe-dev <
>>>>> cfe-dev at lists.llvm.org> wrote:
>>>>>
>>>>>> On Tue, Nov 27, 2018 at 10:43 AM Sean McBride <
>>>>>> sean at rogue-research.com> wrote:
>>>>>>
>>>>>>> On Tue, 27 Nov 2018 10:19:03 -0800, Kostya Serebryany via cfe-dev
>>>>>>> said:
>>>>>>>
>>>>>>> >One more data point: among the bugs found by MSAN in Chrome over
>>>>>>> the past
>>>>>>> >few years 449 were uninitialized heap and 295 were uninitialized
>>>>>>> stack.
>>>>>>> >So, the proposed functionality would prevent ~40% (i.e. quite a
>>>>>>> bit!) of
>>>>>>> >all UUMs in software like Chrome.
>>>>>>>
>>>>>>> I just lurk here, but I think the proposed functionality would be
>>>>>>> greatly appreciated by C/C++/Obj-C developers on macOS, where
>>>>>>> MemorySanitizer is not supported and valgrind can't even launch TextEdit.
>>>>>>> If I'm not mistaken, it would be the *only* tool on macOS to catch UUMs.
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> It won't catch anything -- but it will prevent the stack UUMs from
>>>>>> hurting you in production.
>>>>>>
>>>>>
>>>>> Well, it will prevent them from resulting in unbounded UB, yes, but
>>>>> that's not the only thing that hurts you.
>>>>>
>>>>
>>>> My statement above is applicable to both zero-initialize and
>>>> pattern-/random-intialize.
>>>>
>>>>
>>>>>
>>>>> A few years back I improved clang's -Wuninitialized and it found a few
>>>>> hundred bugs in one codebase. Of those, in only about half of the cases was
>>>>> the correct fix to zero-initialize; the uninitialized read was very often
>>>>> symptomatic of a logic bug in the function. Now suppose a compiler adds a
>>>>> flag to automatically zero-initialize. This will likely catch on, just like
>>>>> -fno-strict-aliasing did, because it makes it easier to write wrong code
>>>>> that appears to work, and there's a tendency to value code appearing to
>>>>> work more than you value it actually working. And before you know it, ~all
>>>>> large projects need to be built with that flag enabled all the time,
>>>>> because they depend on some code that expects uninitialized variables to be
>>>>> zero-initialized (say, in inline functions or templates). Now we lose an
>>>>> opportunity to catch lots of bugs (either at compile time or runtime), and
>>>>> the benefit is that we define away a similar number of bugs. I don't think
>>>>> it's clear that that's a good tradeoff.
>>>>>
>>>>
>>>> I have absolutely no disagreement with what you say here.
>>>> I'd love to pass the bikeshedding phase and get the performance
>>>> numbers, then come back to the discussion if the numbers show that
>>>> zero-init is much faster.
>>>>
>>>>
>>>>>
>>>>> Also, as others have noted, adding an "initialize to zero" flag will
>>>>> create an incompatible language dialect, just like -fno-strict-aliasing and
>>>>> -fno-exceptions and -fwrapv (etc) did. We have a general policy that we
>>>>> don't want to do that. Initializing to a pseudo-random or
>>>>> intentionally-chosen-to-often-trap bit-pattern seems fine to me, though,
>>>>> and an entirely reasonable security measure.
>>>>>
>>>>> Half-baked idea: what if we made it possible to enable a
>>>>> "zero-initialize all uninitialized variables" mode internally within the
>>>>> compiler but didn't expose it at all? (That way, you could turn this
>>>>> feature on with a compiler plugin, but a stock clang binary can't do it no
>>>>> matter what you write on the command line, unless you have such a plugin,
>>>>> which we don't ship with clang.)
>>>>>
>>>>
>>>> Interesting, but I don't know how easy it is to use a plugin in all
>>>> environments where we need to measure perf.
>>>> IMHO, a scary-named flag that we periodically change (more frequently
>>>> than every release) should work well enough.
>>>>
>>>>
>>>> Honestly this seems simple enough to support and use, and will clearly
>>>> break people relying on it inadvertently. We should assume our users are
>>>> adults and will get the message, and once we remove the option there should
>>>> be no surprise.
>>>>
>>>
>>> I'm not sure it's really a matter of people not being adults, and going
>>> on past experience there does seem to be a real risk that people would grow
>>> a dependence on such behavior.
>>>
>>> Seems easier to me to separate the two pieces - move ahead with the
>>> non-zero options, and separate the discussion on the zero option. You can
>>> present performance numbers from what you can measure without shipping a
>>> compiler with the feature - and if those numbers are sufficiently
>>> compelling compared to the risks of slicing the language, then perhaps we
>>> go that way.
>>>
>>
>> This approach will significantly impair my ability to do the measurements
>> I need.
>>
>>
>> Same, and I don’t intend to publish all the numbers that I can measure
>> for the usual reasons.
>>
>
> Not suggesting you'd need to public all the numbers - though anyone
> driving the feature development would rebuild their compiler, gather their
> own numbers, and work on any places where there are regressions. This
> doesn't seem like it would make a significant blocker to me?
>
>
> For sure, same as for Google building its own clang it’s easy for us to
> set a build flag and enable this option internally. All I’m trying to
> communicate is that my and your code aren’t representative of ~all clang
> uses, and what’s easy for us it’s for others. Sure I wrote this patch for
> Apple but I’d like it to be useful for the LLVM community as a whole (and
> that’s not just Apple + Google!). If the community is OK with “Apple and
> Google are happy with perf, so that’s good enough for us”, then let’s go
> with hiding the options behind a cmake flag.
>

I think maybe a different way to look at this is - there are users (who are
willing to maintain the feature in the compiler, etc) who want the tradeoff
this provides (performance versus the risk their users start depending on
zero init). That's enough to justify the feature, I think - whether or not
there's another related feature (zero init) implemented or not.

>
>
> Further, I expect that other people’s codebases will behave differently,
>> and I want those other people to be able to do their own measurements.
>>
>
> I assume you're at least primarily motivated by your own business needs -
> that tends to be how these things work. So if you implemented this & got to
> the point where, for your needs, non-zero init had sufficient quality -
> wouldn't that be sufficient?
>
>
>> I want to make their life easier, because the faster we iterate and
>> remove performance issues, the faster we can get rid of zero-init because
>> we trust that people’s experiments are representative enough to prove that
>> there’s no significant performance cost left.
>>
>> By making measurements harder we’re hiding the actual cost of this
>> feature, and keeping alive the myth that zero-init is better.
>>
>
> Why would that myth matter? Either the performance cost of the feature is
> worth it or it isn't, right? "here's a flag (non-zero init) that may help
> you avoid some security bugs, it has this performance impact" - people
> evaluate whether that cost is worthwhile regardless of whether there's some
> other hypothetical feature - it either is or it isn't worth it.
>
>
> Because compiler writer are often accused of playing dumb and shipping
> something that they think is good enough,
>

Not sure that's the case here, though - I assume you're implementing this
for your users (internal, external, someone), and so you wouldn't implement
it if it wasn't the right tradeoff for them, right? So it's not playing
dumb, or thinking it's good enough - it actually is good enough for some
users you're implementing it for, right?

> not something that the developer thinks is good. Some people perceive this
> security feature as more onerous than need be, and the only way to convince
> them otherwise is to let them run the numbers. In some cases they’re right,
> and we need data to remedy the situation.
>

Do we need to convince them, though? Generally we implement stuff we each
need - already have specific users/captive audience.

Admittedly Apple's a bit different from Google here, as they have external
customers which do have more of the issues you described (& you may want to
convince them to use the feature to improve the security of your platform,
etc) - and if you felt the tradeoff/risk for your platform was worthwhile,
you could release the Apple Clang builds with the feature enabled. I
wouldn't, but you could & reasonable folks can disagree about that &
continue to get along :)

>
>
> By changing the flag name all the time we’ll make it hard enough to rely
>> on the flag for its semantics, and it’s then trivial to remove the flag.
>>
>
> I'm a sign, not a cop - but I'm also trying to communicate/describe what I
> think other folks have expressed in this thread.
>
> In any case, it still doesn't seem critical that the decision about
> supporting zero-init be a blocker to implementing the non-zero init
> feature. I'd really encourage them to be treated separately, moving ahead
> with whichever variants of non-zero init people want to experiment with (&
> maybe with a less contentious zero-init (eg: one behind build-time
> configuration option) mode - while you can continue to advocate
> for/discuss/work for making that option being on-by-default)
>
> But I'll leave it at that - I feel like everyone understands each others
> points & I'm not going to say (I don't feel I'm in a position of authority
> on this - I'll throw Richard Smith under the bus for that - ultimately it's
> a change to Clang's surface area & Richard owns Clang) you can't do it.
>
> - Dave
>
>
>>
>>
>> Is this the only remaining objection to the patch? I haven’t received
>>>> comments on much otherwise, it would be good to get reviews going.
>>>>
>>>> _______________________________________________
>>>> cfe-dev mailing list
>>>> cfe-dev at lists.llvm.org
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>
>>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20181128/0ff70940/attachment.html>