<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p><br>
</p>
<div class="moz-cite-prefix">On 4/21/20 4:59 PM, Richard Smith
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAOfiQqmJFzpRtmLGjUZmE4g5z3qCNkcvHEdp3g5NMde9mv4BPA@mail.gmail.com">
<div dir="ltr">
<div dir="ltr">On Tue, 21 Apr 2020 at 16:49, Philip Reames via
cfe-dev &lt;<a href="mailto:cfe-dev@lists.llvm.org"
moz-do-not-send="true">cfe-dev@lists.llvm.org</a>&gt; wrote:<br>
</div>
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div>
<p>On 4/21/20 3:11 PM, Richard Smith via cfe-dev wrote:<br>
</p>
<blockquote type="cite">
<div dir="ltr">What you're proposing is, without
question, a language extension. Our policy on language
extensions is documented here: <a
href="http://clang.llvm.org/get_involved.html"
target="_blank" moz-do-not-send="true">http://clang.llvm.org/get_involved.html</a>
<div><br>
</div>
<div>Right now, this fails at point 4. We do not want
to create or encourage the creation of language
dialects and non-portable code, so the place to have
this discussion is in the C and C++ committees. Both
committees have processes for specifying optional
features these days, and they might be amenable to
using those processes to standardize the behavior
you're asking for. (I mean, maybe not, but our
policy requires that you at least try.)</div>
<div><br>
</div>
<div>However, there is a variant on what you're
proposing that might fare better: instead of
guaranteeing zero-initialization, we could guarantee
that any observation of an uninitialized variable
*either* produces zero or results in a trap.
That is: it's still undefined to read from
uninitialized variables -- we still do not guarantee
what will happen if you do, and will warn on
uninitialized uses and so on -- but we would bound
the damage that can result from such accesses. You
would get the security hardening benefits with the
modest binary size impact. That approach would not
introduce the risk of creating a language dialect
(at least, not to the same extent), so our policy on
avoiding language extensions would not apply.</div>
</div>
</blockquote>
<p>Richard, just to check here, it sounds to me like
you're raising more a point of specification than of
implementation, right? That is, you're not stating that
the actual implementation must sometimes trap (when
producing a zero wouldn't), but that the specification
of the flags and docs must leave that possibility
open?</p>
</div>
</blockquote>
<div>Well, I think it's not sufficient to merely say that we
might do something like trap, if our intent is that we never
will. We would need to reasonably agree that (for example)
if someone came forward with a patch that actually
implemented said trapping behavior and didn't introduce any
significant code size or performance impact, that we would
consider such a change to be a quality of implementation
improvement. But I don't think we need anyone to have
actually committed themselves to producing such a patch, or
any timeline or expectation of when (or indeed whether) it
would be done. Sorry if this is splitting a hair, but I
think it's an important hair to split. <br>
</div>
</div>
</div>
</blockquote>
Hair successfully split. I agree it is a key distinction.<br>
<blockquote type="cite"
cite="mid:CAOfiQqmJFzpRtmLGjUZmE4g5z3qCNkcvHEdp3g5NMde9mv4BPA@mail.gmail.com">
<div dir="ltr">
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div>
<p>If I'm completely misinterpreting, please just say so. I
don't want to start a tangent discussion here, I just
spotted what sounds like it could be a "quick fix" which
lets the OP achieve their objective and wanted to call it
out if in fact I'd read correctly.</p>
<p>Philip<br>
</p>
<blockquote type="cite"><br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Tue, 21 Apr 2020
at 14:21, Kees Cook via cfe-dev &lt;<a
href="mailto:cfe-dev@lists.llvm.org"
target="_blank" moz-do-not-send="true">cfe-dev@lists.llvm.org</a>&gt;
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px
0px 0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">Hi,<br>
<br>
tl;dr: I'd like to revisit making
-ftrivial-auto-var-init=zero an expressly<br>
supported option. To do this, I think we need to
either entirely remove<br>
"-enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang"<br>
or rename it to something more directly reflecting
the issue, like<br>
"-enable-trivial-auto-var-init-zero-knowing-it-forks-the-language".<br>
<br>
This is currently open as <a
href="https://bugs.llvm.org/show_bug.cgi?id=45497"
rel="noreferrer" target="_blank"
moz-do-not-send="true">https://bugs.llvm.org/show_bug.cgi?id=45497</a><br>
<br>
Here is the situation:
-ftrivial-auto-var-init=pattern is great for<br>
debugging, but -ftrivial-auto-var-init=zero is
needed for production<br>
systems for mainly two reasons, each of which I will
try to express context<br>
for:<br>
<br>
1) performance and size<br>
<br>
As measured by various Google folks across a few
projects and in<br>
various places, there's a fairly significant
performance impact of<br>
using pattern-init over zero-init. I can let other
folks chime in<br>
with their exact numbers, but I can at least share
some measurements<br>
Alexander Potapenko made with the Linux kernel (see
"Performance costs"):<br>
<a
href="https://clangbuiltlinux.github.io/CBL-meetup-2020-slides/glider/Fighting_uninitialized_memory_%40_CBL_Meetup_2020.pdf"
rel="noreferrer" target="_blank"
moz-do-not-send="true">https://clangbuiltlinux.github.io/CBL-meetup-2020-slides/glider/Fighting_uninitialized_memory_%40_CBL_Meetup_2020.pdf</a><br>
tl;dr: zero-init tended to be half the cost of
pattern-init, though it<br>
varied based on workload, and binary size impact
fell over 95% going<br>
from pattern-init to zero-init.<br>
<br>
2) security<br>
<br>
Another driving factor (see below from various
vendors/projects), is the<br>
security stance. Putting non-zero values into most
variable types ends<br>
up making them arguably more dangerous than if they
were zero-filled.<br>
Most notably, sizes and indexes are less likely to
be used out of bounds<br>
if they are zero-initialized. The same holds for
bool values that tend<br>
to indicate success instead of failing safe with a
false value. While<br>
pointers in the non-canonical range are nice, zero
tends to be just<br>
as good. There are certainly exceptions here, but
the bulk of the<br>
historical record on how "uninitialized" variables
have been used in<br>
real-world exploitation involves their being
non-zero, and analysis of<br>
those bugs supports that conclusion.<br>
<br>
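A minimal C sketch of the fail-safe argument above, not taken from any of the
projects mentioned. Reading a truly uninitialized variable is undefined, so the<br>
stack slot is only simulated here by filling the object's bytes. The struct
request type and the helper names are hypothetical, and 0xAA is used purely as<br>
an illustrative pattern byte:<br>
<pre>
```c
/* Hypothetical record that a buggy caller forgot to initialize. */
struct request {
    unsigned long len; /* number of bytes the caller wants copied */
    int allowed;       /* nonzero means the access check passed */
};

/* Simulate an auto-initialized stack slot: every byte of the object is set
 * to `fill` (0x00 models zero-init; 0xAA is an illustrative pattern byte). */
static struct request simulated_uninit(unsigned char fill) {
    union {
        struct request r;
        unsigned char bytes[sizeof(struct request)];
    } u;
    unsigned long i;

    for (i = 0; i != sizeof u.bytes; i++)
        u.bytes[i] = fill;
    return u.r;
}

/* Returns 1 if copying r.len bytes would run past a buf_size-byte buffer. */
static int copy_overflows(struct request r, unsigned long buf_size) {
    return r.len > buf_size;
}

/* Returns 1 if the never-assigned flag would be read as "access granted". */
static int grants_access(struct request r) {
    return r.allowed != 0;
}
```
</pre>
With a fill of 0x00, len is 0 and allowed reads as false, so a forgotten
assignment fails safe. With a fill of 0xAA, len is far larger than any<br>
realistic buffer and allowed reads as true.<br>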
<br>
Various positions from vendors and projects:<br>
<br>
Google (Android, Chrome OS)<br>
<br>
Both Android and Chrome OS initially started using
pattern-init, but due<br>
to each of: the performance characteristics, the
binary size changes, and<br>
the less robust security stance, both projects have
recently committed<br>
to switching to zero-init.<br>
<br>
<br>
Microsoft (Windows)<br>
<br>
I'm repeating what Joe Bialek has told me, so he can
clarify if I'm not<br>
representing this correctly... While not using
Clang/LLVM, Microsoft is<br>
part of the larger C/C++ ecosystem and has
implemented both zero-init<br>
(for production builds) and pattern-init (for debug
builds) in their<br>
compiler too. They also chose zero-init for
production expressly due<br>
to the security benefits.<br>
<br>
Some details of their work:<br>
<a
href="https://github.com/microsoft/MSRC-Security-Research/blob/master/presentations/2019_09_CppCon/CppCon2019%20-%20Killing%20Uninitialized%20Memory.pdf"
rel="noreferrer" target="_blank"
moz-do-not-send="true">https://github.com/microsoft/MSRC-Security-Research/blob/master/presentations/2019_09_CppCon/CppCon2019%20-%20Killing%20Uninitialized%20Memory.pdf</a><br>
<br>
<br>
Upstream Linux kernel<br>
<br>
Linus Torvalds has directly stated that he wants
zero-init:<br>
"So I'd like the zeroing of local variables to be a
native compiler<br>
option..."<br>
"This, btw, is why I also think that the "initialize
with poison" is<br>
pointless and wrong."<br>
<a
href="https://lore.kernel.org/lkml/CAHk-=wgTM+cN7zyUZacGQDv3DuuoA4LORNPWgb1Y_Z1p4iedNQ@mail.gmail.com/"
rel="noreferrer" target="_blank"
moz-do-not-send="true">https://lore.kernel.org/lkml/CAHk-=wgTM+cN7zyUZacGQDv3DuuoA4LORNPWgb1Y_Z1p4iedNQ@mail.gmail.com/</a><br>
Unsurprisingly, I strongly agree. ;)<br>
<br>
<br>
GrapheneOS is using zero-init (rather than patching
Clang as it used to, to get<br>
the same result):<br>
<a
href="https://twitter.com/DanielMicay/status/1248384468181643272"
rel="noreferrer" target="_blank"
moz-do-not-send="true">https://twitter.com/DanielMicay/status/1248384468181643272</a><br>
<br>
<br>
GCC<br>
There's been mostly silence on the entire topic of
automatic variable<br>
initialization, though there have been patches
proposed in the past for<br>
zero-init:<br>
<a
href="https://gcc.gnu.org/legacy-ml/gcc-patches/2014-06/msg00615.html"
rel="noreferrer" target="_blank"
moz-do-not-send="true">https://gcc.gnu.org/legacy-ml/gcc-patches/2014-06/msg00615.html</a><br>
<br>
<br>
Apple<br>
<br>
I can't speak meaningfully here, but I've heard
rumors that they are<br>
depending on zero-init as well. Perhaps someone
there can clarify how<br>
they are using these features?<br>
<br>
<br>
<br>
So, while I understand the earlier objections to
zero-init from a<br>
"language fork" concern, I think this isn't a
position that can really<br>
stand up to the reality of how many projects are
using the feature (even<br>
via non-Clang compilers). Given that so much code is
going to be built<br>
using zero-init, what's the best way for Clang to
adapt here? I would<br>
prefer to just drop the -enable... option entirely,
but I think just<br>
renaming it would be fine too.<br>
<br>
Thoughts/flames? ;)<br>
<br>
-- <br>
Kees Cook<br>
_______________________________________________<br>
cfe-dev mailing list<br>
<a href="mailto:cfe-dev@lists.llvm.org"
target="_blank" moz-do-not-send="true">cfe-dev@lists.llvm.org</a><br>
<a
href="https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev"
rel="noreferrer" target="_blank"
moz-do-not-send="true">https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev</a><br>
</blockquote>
</div>
<br>
</blockquote>
</div>
</blockquote>
</div>
</div>
</blockquote>
</body>
</html>