[cfe-dev] [RFC] automatic variable initialization

Tue Dec 11 15:18:30 PST 2018

We went with zero initialization for production builds in Visual Studio because we anticipate zero being the fastest, smallest code gen, and safest value for the following cases:

  *   Pointer: A non-zero pointer could potentially point at valid code. On Windows x64, the first 64kb of the virtual address space is guaranteed to not be mappable and the last 512GB of the virtual address space (today) has the space guarantee. So null pointer or near-null pointer dereferences are denial of service at worst.
  *   Size: If you forget to set a size as an out parameter we’d rather it’s set to 0 so you don’t index out of bounds.
  *   Index: Same as size.

We are using this mitigation to downgrade vulnerabilities from remote code execution, elevation of privilege, or information disclosure, down to denial of service at worst.

Assuming the denial of service isn’t for a scenario where it matters (there are many types of DOS we don’t really worry about for security), we will not bother servicing the vulnerabilities down-level and instead only fix them for new code.

It is helpful for us to have deterministic behavior to help accomplish this goal. If an engineer knows a value is always set to 0, they can quickly determine if the bug report is actually exploitable or not. If it is initialized to a random or compile time constant value, we’ll need to do more work to determine if the bug is exploitable.

We are not zero initializing CHK builds to prevent folks from taking a dependency on this feature.

Joe

From: Kostya Serebryany <kcc at google.com>
Sent: Tuesday, December 11, 2018 3:04 PM
To: Mehdi AMINI <joker.eph at gmail.com>; Jann Horn <jannh at google.com>
Cc: David Blaikie <dblaikie at gmail.com>; Richard Smith <richard at metafoo.co.uk>; Clang Dev <cfe-dev at lists.llvm.org>; Joe Bialek <jobialek at microsoft.com>
Subject: Re: [cfe-dev] [RFC] automatic variable initialization

One more dimension in this discussion (zero-init vs pattern-init) is what will security get from those options.
My second-hand knowledge here suggests that zero-init may have *on average* better security guarantees than non-zero init.
For example if the uninitialized data is interpreted as a length of something, it better be zero than any large number or, worse, a negative signed number.
+Jann Horn<mailto:jannh at google.com> and Joe Bialek who have first-hand knowledge here.

--kcc

On Thu, Dec 6, 2018 at 3:58 PM Kostya Serebryany <kcc at google.com<mailto:kcc at google.com>> wrote:

On Thu, Dec 6, 2018 at 3:55 PM Mehdi AMINI <joker.eph at gmail.com<mailto:joker.eph at gmail.com>> wrote:

On Thu, Dec 6, 2018 at 3:43 PM Kostya Serebryany <kcc at google.com<mailto:kcc at google.com>> wrote:

On Thu, Dec 6, 2018 at 3:03 PM Mehdi AMINI <joker.eph at gmail.com<mailto:joker.eph at gmail.com>> wrote:

On Thu, Dec 6, 2018 at 1:01 PM Kostya Serebryany <kcc at google.com<mailto:kcc at google.com>> wrote:

On Wed, Dec 5, 2018 at 6:07 PM Mehdi AMINI via cfe-dev <cfe-dev at lists.llvm.org<mailto:cfe-dev at lists.llvm.org>> wrote:
I have the same question as David: I don't understand why this isn't just an experimental build option disabled by default? (Google, Apple, etc. can just enable it in their build script, no need to patch the source).

I will need to rebuild half a dozen compiler binaries to do the measurements I need.
This is going to double the cost of the effort for me because it adds too many extra moving pieces.

Not sure I follow, you have to build the compiler anyway to get it after the code is patched?
The only thing you would have to do is one CL to enable the build flag (that expose the command line flag) inside your codebase and then you get your toolchain as usual?

Once the JF's patch is in, I'll have it in most production compilers I care about in 2-6 weeks, w/o any effort on my side.
None of those builders (easily) support adding custom patches, and even a build flag is going to be very non-trivial.
I can deal with one or two of those builds relatively easily, but not with the rest.

What builders are you referring to? So far I was assuming we were talking about your internal infrastructure that produces your production compilers.

I am talking about our internal infra.
But we have multiple independent builders from multiple independent compiler users (e.g. Chrome and Android are using different compiler builds).

--
Mehdi

The only argument I read in the thread was about allowing people to make performance measurement without rebuilding the compiler, but I have a hard time reconciliation this with the fact that we're talking about not shipping this before performing the actual measurements?

I expect that anyone that cares enough about the perf impact of this to influence the development of the feature in clang should already be rebuilding their compiler today.

--
Mehdi

On Mon, Dec 3, 2018 at 5:58 PM David Blaikie via cfe-dev <cfe-dev at lists.llvm.org<mailto:cfe-dev at lists.llvm.org>> wrote:
Just out of curiosity - what's the major difference between a build-time off-by-default feature and a build-time on-by-default-but-off-in-release branches feature? If we're only targeting groups that build/release the compiler themselves, then they're likely able to opt-in to a build-time feature easily enough, I'd think? & then there's no need to make our releases different from day-to-day builds?

But sounds like folks are in general agreement of a way forward, so I don't want to disrupt/delay that.
On Wed, Nov 28, 2018 at 11:14 PM Chandler Carruth <chandlerc at gmail.com<mailto:chandlerc at gmail.com>> wrote:
Suggested compromise technique to at least get an initial set of numbers:

1) Require a special, long, ugly flag name.
2) Make it a CC1 flag, requiring -Xclang ... to use.
3) Emit a warning by default (that cannot be suppressed with a -Wno-... flag) when this flag is enabled.
4) Commit to never including this flag in any upstream release. Either we remove it before the next release branches or we revert it on the branch.

Most of the folks we're hoping to get performance data with are willing to use a not-yet-released build of Clang. They won't have to actually patch it in any way. They will have strong reminders to not deploy this in any way due to the warning.

Thoughts?

On Wed, Nov 28, 2018 at 4:34 PM Kostya Serebryany via cfe-dev <cfe-dev at lists.llvm.org<mailto:cfe-dev at lists.llvm.org>> wrote:
On Wed, Nov 28, 2018 at 3:28 PM David Blaikie <dblaikie at gmail.com<mailto:dblaikie at gmail.com>> wrote:

On Wed, Nov 28, 2018 at 3:17 PM Kostya Serebryany <kcc at google.com<mailto:kcc at google.com>> wrote:

Seems easier to me to separate the two pieces - move ahead with the non-zero options, and separate the discussion on the zero option. You can present performance numbers from what you can measure without shipping a compiler with the feature - and if those numbers are sufficiently compelling compared to the risks of slicing the language, then perhaps we go that way.

This approach will significantly impair my ability to do the measurements I need.

I'm aware waht I'm proposing would make it more difficult for some people to take measurements - that's a tradeoff to be sure - one where I err in this direction.

Specifically for Google though - would it be that difficult for Google to opt-in to a certain build configuration of LLVM?

Absolutely yes.
Google is not just a single monolithic code base<https://nam06.safelinks.protection.outlook.com/?url=http%3A%2F%2Fdelivery.acm.org%2F10.1145%2F2860000%2F2854146%2Fp78-potvin.pdf%3Fip%3D104.133.8.94%26id%3D2854146%26acc%3DOA%26key%3D4D4702B0C3E38B35%252E4D4702B0C3E38B35%252E4D4702B0C3E38B35%252E5945DC2EABF3343C%26__acm__%3D1543446999_3aadcd36f657e2297430c38bee93f16c&data=02%7C01%7Cjobialek%40microsoft.com%7C533e0fe1444d4b5a3a6b08d65fbcf479%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C636801662510848537&sdata=dd4I1xQ5qfawzLwsnBM3iqsypqZ%2BlFswj5IgKa%2FLHbw%3D&reserved=0>.

Couldn't access this link ("An error occurred while processing your request" - but yeah, I understand there's a bunch of different pieces of Google beyond the "stuff that runs in data centers" piece we mostly support.

Besides that monolithic thing, we have Android, Chrome, ChromeOS, Fuchsia, and a bazillion of smaller efforts that use their own toolchains.

Still, most/all of these build their own compilers, I think? But yeah, that adds an opt-in overhead to each project, for sure.

In some cases the most reliable and complete way of measuring performance changes is to submit the changes to revision control,
and let the performance bots shew it for a couple of days. That's how we iterated with the LLVM's CFI in Chrome.
We will also need to work with the upstream Linux kernel -- it's hard enough for them to use clang and a modified clang will cost us much more effort.

Yeah, I can imagine that one's a bit trickier - how's performance evaluation of the kernel done?

I don't think anyone knows that. :-|
And requiring a compiler patch will shift the problem from "hard" to "I'd better do something else".

(though, again, I imagine a fair amount of progress could be made without the zero-init feature - perhaps enough to say "hey, here are all the places we have run tests & seen the performance tradeoff is worthwhile for us (& possibly that it's close to the zero-init case, but that's sort of orthogonal, imho - that it's worthwhile is the main thing) - perhaps other folks would be willing to test it (non-zero init) & see if it's worthwhile to them - and if it isn't/they're interested in more performance, maybe all that evidence we can gain from the places where it's easy for us to rebuild compilers, etc, would make it interesting enough to motivate someone to do build the kernel with a custom compiler & do some performance measurements, etc...

Sorry that was a bit rambly, anyway.

- Dave

We/Google do build the compiler from scratch, I assume we pick the configuration options we build with & some of them probably aren't the defaults for a release build of LLVM. So if it was important that Google's production compiler had these features enabled (rather than building a test compiler for running some experiments), that doesn't seem (at least to me, at this moment) especially prohibitive, is it?

_______________________________________________

cfe-dev mailing list
cfe-dev at lists.llvm.org<mailto:cfe-dev at lists.llvm.org>
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev<https://nam06.safelinks.protection.outlook.com/?url=http%3A%2F%2Flists.llvm.org%2Fcgi-bin%2Fmailman%2Flistinfo%2Fcfe-dev&data=02%7C01%7Cjobialek%40microsoft.com%7C533e0fe1444d4b5a3a6b08d65fbcf479%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C636801662510858549&sdata=J1qIBl35xJOU4LXt2pmJKrl7Jo3RzuG04e7mWmXdULo%3D&reserved=0>
_______________________________________________
cfe-dev mailing list
cfe-dev at lists.llvm.org<mailto:cfe-dev at lists.llvm.org>
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev<https://nam06.safelinks.protection.outlook.com/?url=http%3A%2F%2Flists.llvm.org%2Fcgi-bin%2Fmailman%2Flistinfo%2Fcfe-dev&data=02%7C01%7Cjobialek%40microsoft.com%7C533e0fe1444d4b5a3a6b08d65fbcf479%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C636801662510868553&sdata=wC0zWaGYC6YVWA3W7X3PEUjbePjhZdOoKVy5vzD6y2I%3D&reserved=0>
_______________________________________________
cfe-dev mailing list
cfe-dev at lists.llvm.org<mailto:cfe-dev at lists.llvm.org>
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev<https://nam06.safelinks.protection.outlook.com/?url=http%3A%2F%2Flists.llvm.org%2Fcgi-bin%2Fmailman%2Flistinfo%2Fcfe-dev&data=02%7C01%7Cjobialek%40microsoft.com%7C533e0fe1444d4b5a3a6b08d65fbcf479%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C636801662510868553&sdata=wC0zWaGYC6YVWA3W7X3PEUjbePjhZdOoKVy5vzD6y2I%3D&reserved=0>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20181211/54091a82/attachment.html>