[cfe-dev] [RFC] automatic variable initialization

Wed Nov 28 14:28:09 PST 2018

On Wed, Nov 28, 2018 at 2:26 PM David Blaikie <dblaikie at gmail.com> wrote:

>
>
> On Wed, Nov 28, 2018 at 2:18 PM JF Bastien via cfe-dev <
> cfe-dev at lists.llvm.org> wrote:
>
>>
>>
>> On Nov 28, 2018, at 11:08 AM, Kostya Serebryany via cfe-dev <
>> cfe-dev at lists.llvm.org> wrote:
>>
>>
>>
>> On Tue, Nov 27, 2018 at 6:12 PM Richard Smith <richard at metafoo.co.uk>
>> wrote:
>>
>>> On Tue, 27 Nov 2018 at 11:52, Kostya Serebryany via cfe-dev <
>>> cfe-dev at lists.llvm.org> wrote:
>>>
>>>> On Tue, Nov 27, 2018 at 10:43 AM Sean McBride <sean at rogue-research.com>
>>>> wrote:
>>>>
>>>>> On Tue, 27 Nov 2018 10:19:03 -0800, Kostya Serebryany via cfe-dev said:
>>>>>
>>>>> >One more data point: among the bugs found by MSAN in Chrome over the
>>>>> >past few years 449 were uninitialized heap and 295 were uninitialized
>>>>> >stack. So, the proposed functionality would prevent ~40% (i.e. quite a
>>>>> >bit!) of all UUMs in software like Chrome.
>>>>>
>>>>> I just lurk here, but I think the proposed functionality would be
>>>>> greatly appreciated by C/C++/Obj-C developers on macOS, where
>>>>> MemorySanitizer is not supported and valgrind can't even launch TextEdit.
>>>>> If I'm not mistaken, it would be the *only* tool on macOS to catch UUMs.
>>>>>
>>>>>
>>>>
>>>> It won't catch anything -- but it will prevent the stack UUMs from
>>>> hurting you in production.
>>>>
>>>
>>> Well, it will prevent them from resulting in unbounded UB, yes, but
>>> that's not the only thing that hurts you.
>>>
>>
>> My statement above is applicable to both zero-initialize and
>> pattern-/random-initialize.
>>
>>
>>>
>>> A few years back I improved clang's -Wuninitialized and it found a few
>>> hundred bugs in one codebase. Of those, in only about half of the cases was
>>> the correct fix to zero-initialize; the uninitialized read was very often
>>> symptomatic of a logic bug in the function. Now suppose a compiler adds a
>>> flag to automatically zero-initialize. This will likely catch on, just like
>>> -fno-strict-aliasing did, because it makes it easier to write wrong code
>>> that appears to work, and there's a tendency to value code appearing to
>>> work more than you value it actually working. And before you know it, ~all
>>> large projects need to be built with that flag enabled all the time,
>>> because they depend on some code that expects uninitialized variables to be
>>> zero-initialized (say, in inline functions or templates). Now we lose an
>>> opportunity to catch lots of bugs (either at compile time or runtime), and
>>> the benefit is that we define away a similar number of bugs. I don't think
>>> it's clear that that's a good tradeoff.
>>>
>>
>> I have absolutely no disagreement with what you say here.
>> I'd love to get past the bikeshedding phase and get the performance numbers,
>> then come back to the discussion if the numbers show that zero-init is much
>> faster.
>>
>>
>>>
>>> Also, as others have noted, adding an "initialize to zero" flag will
>>> create an incompatible language dialect, just like -fno-strict-aliasing and
>>> -fno-exceptions and -fwrapv (etc) did. We have a general policy that we
>>> don't want to do that. Initializing to a pseudo-random or
>>> intentionally-chosen-to-often-trap bit-pattern seems fine to me, though,
>>> and an entirely reasonable security measure.
>>>
>>> Half-baked idea: what if we made it possible to enable a
>>> "zero-initialize all uninitialized variables" mode internally within the
>>> compiler but didn't expose it at all? (That way, you could turn this
>>> feature on with a compiler plugin, but a stock clang binary can't do it no
>>> matter what you write on the command line, unless you have such a plugin,
>>> which we don't ship with clang.)
>>>
>>
>> Interesting, but I don't know how easy it is to use a plugin in all
>> environments where we need to measure perf.
>> IMHO, a scary-named flag that we periodically change (more frequently
>> than every release) should work well enough.
>>
>>
>> Honestly this seems simple enough to support and use, and will clearly
>> break people relying on it inadvertently. We should assume our users are
>> adults and will get the message, and once we remove the option there should
>> be no surprise.
>>
>
> I'm not sure it's really a matter of people not being adults, and going on
> past experience there does seem to be a real risk that people would grow a
> dependence on such behavior.
>
> Seems easier to me to separate the two pieces - move ahead with the
> non-zero options, and separate the discussion on the zero option. You can
> present performance numbers from what you can measure without shipping a
> compiler with the feature - and if those numbers are sufficiently
> compelling compared to the risks of slicing the language, then perhaps we
> go that way.
>

This approach will significantly impair my ability to do the measurements I
need.
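
To make the zero versus pattern trade-off quoted above concrete, here is a
minimal C sketch. It is not taken from the patch: find_index, index_is_valid
and fill are hypothetical names, the memset only stands in for an
initialization the compiler would insert, and the 0xAA byte is an assumed
example pattern rather than a value confirmed in this thread.

```c
#include <string.h>

/* Hypothetical lookup with a logic bug: the not-found path never assigns idx.
 * The memset simulates compiler-inserted automatic initialization so that
 * the example itself stays free of undefined behavior. */
static int find_index(const int *xs, int n, int key, unsigned char fill) {
    int idx;
    memset(&idx, fill, sizeof idx);   /* assumed stand-in for auto-init */
    for (int i = 0; i < n; i++) {
        if (xs[i] == key) {
            idx = i;
        }
    }
    return idx;                       /* bug: may return the fill value */
}

/* Caller-side sanity check: is the returned index usable for an array of n? */
static int index_is_valid(int idx, int n) {
    return idx >= 0 && idx < n;
}
```

With fill set to 0x00, a failed lookup returns 0, which passes
index_is_valid and looks like "found at position 0", so the logic bug stays
hidden. With every byte set to 0xAA the result is negative on
two's-complement targets, the check rejects it, and the bug becomes visible.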

>
>
>> Is this the only remaining objection to the patch? I haven’t received
>> many other comments, and it would be good to get reviews going.
>>
>>
>