[cfe-dev] [RFC] automatic variable initialization

Wed Nov 28 14:42:04 PST 2018

On Wed, Nov 28, 2018 at 2:28 PM Kostya Serebryany <kcc at google.com> wrote:

> On Wed, Nov 28, 2018 at 2:26 PM David Blaikie <dblaikie at gmail.com> wrote:
>
>>
>>
>> On Wed, Nov 28, 2018 at 2:18 PM JF Bastien via cfe-dev <
>> cfe-dev at lists.llvm.org> wrote:
>>
>>>
>>>
>>> On Nov 28, 2018, at 11:08 AM, Kostya Serebryany via cfe-dev <
>>> cfe-dev at lists.llvm.org> wrote:
>>>
>>>
>>>
>>> On Tue, Nov 27, 2018 at 6:12 PM Richard Smith <richard at metafoo.co.uk>
>>> wrote:
>>>
>>>> On Tue, 27 Nov 2018 at 11:52, Kostya Serebryany via cfe-dev <
>>>> cfe-dev at lists.llvm.org> wrote:
>>>>
>>>>> On Tue, Nov 27, 2018 at 10:43 AM Sean McBride <sean at rogue-research.com>
>>>>> wrote:
>>>>>
>>>>>> On Tue, 27 Nov 2018 10:19:03 -0800, Kostya Serebryany via cfe-dev
>>>>>> said:
>>>>>>
>>>>>> >One more data point: among the bugs found by MSAN in Chrome over the
>>>>>> >past few years 449 were uninitialized heap and 295 were uninitialized
>>>>>> >stack. So, the proposed functionality would prevent ~40% (i.e. quite a
>>>>>> >bit!) of all UUMs in software like Chrome.
>>>>>>
>>>>>> I just lurk here, but I think the proposed functionality would be
>>>>>> greatly appreciated by C/C++/Obj-C developers on macOS, where
>>>>>> MemorySanitizer is not supported and valgrind can't even launch TextEdit.
>>>>>> If I'm not mistaken, it would be the *only* tool on macOS to catch UUMs.
>>>>>>
>>>>>>
>>>>>
>>>>> It won't catch anything -- but it will prevent the stack UUMs from
>>>>> hurting you in production.
>>>>>
>>>>
>>>> Well, it will prevent them from resulting in unbounded UB, yes, but
>>>> that's not the only thing that hurts you.
>>>>
>>>
>>> My statement above is applicable to both zero-initialize and
>>> pattern-/random-initialize.
>>>
>>>
>>>>
>>>> A few years back I improved clang's -Wuninitialized and it found a few
>>>> hundred bugs in one codebase. Of those, in only about half of the cases was
>>>> the correct fix to zero-initialize; the uninitialized read was very often
>>>> symptomatic of a logic bug in the function. Now suppose a compiler adds a
>>>> flag to automatically zero-initialize. This will likely catch on, just like
>>>> -fno-strict-aliasing did, because it makes it easier to write wrong code
>>>> that appears to work, and there's a tendency to value code appearing to
>>>> work more than you value it actually working. And before you know it, ~all
>>>> large projects need to be built with that flag enabled all the time,
>>>> because they depend on some code that expects uninitialized variables to be
>>>> zero-initialized (say, in inline functions or templates). Now we lose an
>>>> opportunity to catch lots of bugs (either at compile time or runtime), and
>>>> the benefit is that we define away a similar number of bugs. I don't think
>>>> it's clear that that's a good tradeoff.
>>>>
>>>
>>> I have absolutely no disagreement with what you say here.
>>> I'd love to get past the bikeshedding phase and get the performance numbers,
>>> then come back to the discussion if the numbers show that zero-init is much
>>> faster.
>>>
>>>
>>>>
>>>> Also, as others have noted, adding an "initialize to zero" flag will
>>>> create an incompatible language dialect, just like -fno-strict-aliasing and
>>>> -fno-exceptions and -fwrapv (etc) did. We have a general policy that we
>>>> don't want to do that. Initializing to a pseudo-random or
>>>> intentionally-chosen-to-often-trap bit-pattern seems fine to me, though,
>>>> and an entirely reasonable security measure.
>>>>
>>>> Half-baked idea: what if we made it possible to enable a
>>>> "zero-initialize all uninitialized variables" mode internally within the
>>>> compiler but didn't expose it at all? (That way, you could turn this
>>>> feature on with a compiler plugin, but a stock clang binary can't do it no
>>>> matter what you write on the command line, unless you have such a plugin,
>>>> which we don't ship with clang.)
>>>>
>>>
>>> Interesting, but I don't know how easy it is to use a plugin in all
>>> environments where we need to measure perf.
>>> IMHO, a scary-named flag that we periodically change (more frequently
>>> than every release) should work well enough.
>>>
>>>
>>> Honestly this seems simple enough to support and use, and will clearly
>>> break people relying on it inadvertently. We should assume our users are
>>> adults and will get the message, and once we remove the option there should
>>> be no surprise.
>>>
>>
>> I'm not sure it's really a matter of people not being adults, and going
>> on past experience there does seem to be a real risk that people would grow
>> a dependence on such behavior.
>>
>> Seems easier to me to separate the two pieces - move ahead with the
>> non-zero options, and separate the discussion on the zero option. You can
>> present performance numbers from what you can measure without shipping a
>> compiler with the feature - and if those numbers are sufficiently
>> compelling compared to the risks of slicing the language, then perhaps we
>> go that way.
>>
>
> This approach will significantly impair my ability to do the measurements
> I need.
>

I'm aware that what I'm proposing would make it more difficult for some
people to take measurements. That's a tradeoff, to be sure, and one where I
err in this direction.

Specifically for Google, though: would it be that difficult for Google to
opt in to a certain build configuration of LLVM? We/Google do build the
compiler from scratch, and I assume we pick the configuration options we build
with; some of them probably aren't the defaults for a release build of
LLVM. So if it was important that Google's production compiler had these
features enabled (rather than building a test compiler for running some
experiments), that doesn't seem (at least to me, at this moment) especially
prohibitive. Or is it?
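
For concreteness, here is a minimal sketch of the "wrong code that appears
to work" risk quoted above. It is only an illustration: the function names
are hypothetical, and the explicit "= 0" stands in for what a zero-init
mode would insert, so the example itself has no undefined behavior.

```c
#include <stddef.h>

/* Hypothetical example. Imagine the author wrote `int best;` and forgot to
   seed it. The explicit `= 0` below models a compiler-inserted zero-init. */
int max_with_zero_init(const int *values, size_t count) {
    int best = 0; /* stand-in for automatic zero-initialization */
    for (size_t i = 0; i < count; ++i) {
        if (values[i] > best) {
            best = values[i];
        }
    }
    return best;
}

/* The logic fix: seed `best` from the data. Assumes count > 0. */
int max_fixed(const int *values, size_t count) {
    int best = values[0];
    for (size_t i = 1; i < count; ++i) {
        if (values[i] > best) {
            best = values[i];
        }
    }
    return best;
}
```

With mixed-sign input both functions agree, so the zero-init version looks
correct. With all-negative input it returns 0, a value that is not in the
array at all, and no tool flags it any more because the read is now
well-defined.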

>
>
>>
>>
>>> Is this the only remaining objection to the patch? I haven’t received
>>> many other comments; it would be good to get reviews going.
>>>
>>