<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p><br>
</p>
<div class="moz-cite-prefix">On 4/21/20 3:11 PM, Richard Smith via
cfe-dev wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAOfiQqng0WgkKdoa_oFsSyW3a5cqa_0gC6YAo9P1+3EZNsaOQQ@mail.gmail.com">
<div dir="ltr">What you're proposing is, without question, a
language extension. Our policy on language extensions is
documented here: <a
href="http://clang.llvm.org/get_involved.html"
moz-do-not-send="true">http://clang.llvm.org/get_involved.html</a>
<div><br>
</div>
<div>Right now, this fails at point 4. We do not want to create
or encourage the creation of language dialects and
non-portable code, so the place to have this discussion is in
the C and C++ committees. Both committees have processes for
specifying optional features these days, and they might be
amenable to using those processes to standardize the behavior
you're asking for. (I mean, maybe not, but our policy requires
that you at least try.)</div>
<div><br>
</div>
<div>However, there is a variant on what you're proposing that
might fare better: instead of guaranteeing
zero-initialization, we could guarantee that any observation
of an uninitialized variable *either* produces zero or
results in a trap. That is: it's still undefined to read from
uninitialized variables -- we still do not guarantee what will
happen if you do, and will warn on uninitialized uses and so
on -- but we would bound the damage that can result from such
accesses. You would get the security hardening benefits with
the modest binary size impact. That approach would not
introduce the risk of creating a language dialect (at least,
not to the same extent), so our policy on avoiding language
extensions would not apply.</div>
</div>
</blockquote>
<p>Richard, just to check here: it sounds to me like you're raising
more a point of specification than of implementation, right? That
is, you're not stating that the actual implementation must
sometimes trap (where producing a zero would not), but that the
specification of the flags and docs must leave that possibility
open?</p>
<p>If I'm completely misinterpreting, please just say so. I don't
want to start a tangential discussion here; I just spotted what sounds
like it could be a "quick fix" which lets the OP achieve their
objective, and wanted to call it out if in fact I'd read correctly.</p>
<p>Philip<br>
</p>
<blockquote type="cite"
cite="mid:CAOfiQqng0WgkKdoa_oFsSyW3a5cqa_0gC6YAo9P1+3EZNsaOQQ@mail.gmail.com"><br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Tue, 21 Apr 2020 at 14:21,
Kees Cook via cfe-dev &lt;<a
href="mailto:cfe-dev@lists.llvm.org" moz-do-not-send="true">cfe-dev@lists.llvm.org</a>&gt;
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi,<br>
<br>
tl;dr: I'd like to revisit making -ftrivial-auto-var-init=zero
an expressly<br>
supported option. To do this, I think we need to either
entirely remove<br>
"-enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang"<br>
or rename it to something more directly reflecting the issue,
like<br>
"-enable-trivial-auto-var-init-zero-knowing-it-forks-the-language".<br>
<br>
This is currently open as <a
href="https://bugs.llvm.org/show_bug.cgi?id=45497"
rel="noreferrer" target="_blank" moz-do-not-send="true">https://bugs.llvm.org/show_bug.cgi?id=45497</a><br>
<br>
Here is the situation: -ftrivial-auto-var-init=pattern is
great for<br>
debugging, but -ftrivial-auto-var-init=zero is needed for
production<br>
systems for mainly two reasons, each of which I will try to
express context<br>
for:<br>
<br>
1) performance and size<br>
<br>
As measured by various Google folks across a few projects and
in<br>
various places, there's a fairly significant performance
impact of<br>
using pattern-init over zero-init. I can let other folks chime
in<br>
with their exact numbers, but I can at least share some
measurements<br>
Alexander Potapenko made with the Linux kernel (see
"Performance costs"):<br>
<a
href="https://clangbuiltlinux.github.io/CBL-meetup-2020-slides/glider/Fighting_uninitialized_memory_%40_CBL_Meetup_2020.pdf"
rel="noreferrer" target="_blank" moz-do-not-send="true">https://clangbuiltlinux.github.io/CBL-meetup-2020-slides/glider/Fighting_uninitialized_memory_%40_CBL_Meetup_2020.pdf</a><br>
tl;dr: zero-init tended to be half the cost of pattern-init,
though it<br>
varied based on workload, and binary size impact fell over 95%
going<br>
from pattern-init to zero-init.<br>
<br>
2) security<br>
<br>
Another driving factor (see below from various
vendors/projects) is the<br>
security stance. Putting non-zero values into most variable
types ends<br>
up making them arguably more dangerous than if they were
zero-filled.<br>
Most notably, sizes and indexes are less likely to be used out
of bounds<br>
if they are zero-initialized. The same holds for bool values
that tend<br>
to indicate success instead of failing safe with a false
value. While<br>
pointers in the non-canonical range are nice, zero tends to be
just<br>
as good. There are certainly exceptions here, but the bulk of
the<br>
historical record on how "uninitialized" variables have been
used in<br>
real-world exploitation involves their being non-zero, and
analysis of<br>
those bugs supports that conclusion.<br>
<br>
<br>
Various positions from vendors and projects:<br>
<br>
Google (Android, Chrome OS)<br>
<br>
Both Android and Chrome OS initially started using
pattern-init, but due<br>
to each of: the performance characteristics, the binary size
changes, and<br>
the less robust security stance, both projects have recently
committed<br>
to switching to zero-init.<br>
<br>
<br>
Microsoft (Windows)<br>
<br>
I'm repeating what Joe Bialek has told me, so he can clarify
if I'm not<br>
representing this correctly... While not using Clang/LLVM,
Microsoft is<br>
part of the larger C/C++ ecosystem and has implemented both
zero-init<br>
(for production builds) and pattern-init (for debug builds) in
their<br>
compiler too. They also chose zero-init for production
expressly due<br>
to the security benefits.<br>
<br>
Some details of their work:<br>
<a
href="https://github.com/microsoft/MSRC-Security-Research/blob/master/presentations/2019_09_CppCon/CppCon2019%20-%20Killing%20Uninitialized%20Memory.pdf"
rel="noreferrer" target="_blank" moz-do-not-send="true">https://github.com/microsoft/MSRC-Security-Research/blob/master/presentations/2019_09_CppCon/CppCon2019%20-%20Killing%20Uninitialized%20Memory.pdf</a><br>
<br>
<br>
Upstream Linux kernel<br>
<br>
Linus Torvalds has directly stated that he wants zero-init:<br>
"So I'd like the zeroing of local variables to be a native
compiler<br>
option..."<br>
"This, btw, is why I also think that the "initialize with
poison" is<br>
pointless and wrong."<br>
<a
href="https://lore.kernel.org/lkml/CAHk-=wgTM+cN7zyUZacGQDv3DuuoA4LORNPWgb1Y_Z1p4iedNQ@mail.gmail.com/"
rel="noreferrer" target="_blank" moz-do-not-send="true">https://lore.kernel.org/lkml/CAHk-=wgTM+cN7zyUZacGQDv3DuuoA4LORNPWgb1Y_Z1p4iedNQ@mail.gmail.com/</a><br>
Unsurprisingly, I strongly agree. ;)<br>
<br>
<br>
GrapheneOS is using zero-init (rather than patching Clang as
it used to, to get<br>
the same result):<br>
<a
href="https://twitter.com/DanielMicay/status/1248384468181643272"
rel="noreferrer" target="_blank" moz-do-not-send="true">https://twitter.com/DanielMicay/status/1248384468181643272</a><br>
<br>
<br>
GCC<br>
There's been mostly silence on the entire topic of automatic
variable<br>
initialization, though there have been patches proposed in the
past for<br>
zero-init:<br>
<a
href="https://gcc.gnu.org/legacy-ml/gcc-patches/2014-06/msg00615.html"
rel="noreferrer" target="_blank" moz-do-not-send="true">https://gcc.gnu.org/legacy-ml/gcc-patches/2014-06/msg00615.html</a><br>
<br>
<br>
Apple<br>
<br>
I can't speak meaningfully here, but I've heard rumors that
they are<br>
depending on zero-init as well. Perhaps someone there can
clarify how<br>
they are using these features?<br>
<br>
<br>
<br>
So, while I understand the earlier objections to zero-init
from a<br>
"language fork" concern, I think this isn't a position that
can really<br>
stand up to the reality of how many projects are using the
feature (even<br>
via non-Clang compilers). Given that so much code is going to
be built<br>
using zero-init, what's the best way for Clang to adapt here?
I would<br>
prefer to just drop the -enable... option entirely, but I
think just<br>
renaming it would be fine too.<br>
<br>
Thoughts/flames? ;)<br>
<br>
-- <br>
Kees Cook<br>
_______________________________________________<br>
cfe-dev mailing list<br>
<a href="mailto:cfe-dev@lists.llvm.org" target="_blank"
moz-do-not-send="true">cfe-dev@lists.llvm.org</a><br>
<a
href="https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev"
rel="noreferrer" target="_blank" moz-do-not-send="true">https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev</a><br>
</blockquote>
</div>
<br>
</blockquote>
</body>
</html>