<div dir="ltr">So, if I understand correctly, eBPF requires all memory accesses to be statically proven inbounds, so that it's impossible to trigger a runtime-abort condition in the resulting execution via a bad memory dereference operation. In fact, there is no 'exceptional abort' even possible in eBPF: all programs will execute all the way through, and return a result code. <div><br><div>However, the desire is to allow writing "normal" C code, compiled with a "normal" C compiler, so the source language type system does not actually express these constraints directly. This is not made part of the type system, nor is there any explicit "bounds check" semantics as far as the compiler is concerned. Users expect to write typical C code, full of usual C pointer arithmetic, and math and comparisons. Then, they compile their code. Finally, they try to load it, and only then does anything start checking whether it's well-formed, according to these hidden rules of well-formedness. The verifier goes to a lot of trouble to attempt to track down all possible states that program execution may result in, to show that all of them will result in only inbounds pointer accesses, and that loops terminate, etc. (If you've done it wrong, the only place you get an error is from the kernel verifier failing to load the program, never from the compiler.)</div><div><br></div><div>That's an interesting model, and I guess it almost works in practice, but I'm really not sure it can feasibly be made sound in a theoretical sense. Having the compiler know something about the verifier doesn't seem likely to be sufficient. Given that the verifier tracks potential states cross-function/cross-TU, basically any computation could be in-scope for being potentially important to prove bounds from the compiler's POV. It may be the best you can really do is to hack up problematic optimizations as they are discovered to try to work around the issues.</div><div><br></div><div>On the other hand, if bounded-pointer types and constrained range integers were tracked as first-class concepts at the source level, this would seem more feasible. The compiler could ensure that there were no unchecked bounds possible, instead of making you wait for a verifier error. And by proving the boundedness at a high level, the compiler could explicitly ensure the operations contributing to the boundedness-check are emitted in a simplistic manner, without affecting optimization of the rest of the program. Alternatively, the backend could ensure that if any optimization passes do result in an instruction stream which no longer proves the bounds in a verifier-understandable way, it could insert an extraneous min/max to constrain to the bounds it had already proven to itself.</div><div><br></div><div>But making such a fundamental shift in strategy would seem quite tricky. Tricky to specify, probably tricky to implement, and would result in an obviously unusual dialect of C. (Instead of the current secretly unusual dialect which pretends to be normal C until you try to run the program.) There is some pre-existing research in this area, like Microsoft's Checked-C (<a href="https://www.microsoft.com/en-us/research/project/checked-c/">https://www.microsoft.com/en-us/research/project/checked-c/</a>). But I don't know how closely that aligns.</div></div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Jun 23, 2020 at 2:56 PM Hal Finkel via cfe-dev <<a href="mailto:cfe-dev@lists.llvm.org">cfe-dev@lists.llvm.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br>
On 6/23/20 12:28 AM, Y Song wrote:<br>
> On Mon, Jun 22, 2020 at 3:42 PM Hal Finkel <<a href="mailto:hfinkel@anl.gov" target="_blank">hfinkel@anl.gov</a>> wrote:<br>
>> On 6/21/20 2:30 AM, Y Song wrote:<br>
>>> On Fri, Jun 19, 2020 at 6:20 PM Hal Finkel <<a href="mailto:hfinkel@anl.gov" target="_blank">hfinkel@anl.gov</a>> wrote:<br>
>>>> On 6/19/20 3:08 PM, Eli Friedman wrote:<br>
>>>>> (Reply inline)<br>
>>>>><br>
>>>>>> -----Original Message-----<br>
>>>>>> From: cfe-dev <<a href="mailto:cfe-dev-bounces@lists.llvm.org" target="_blank">cfe-dev-bounces@lists.llvm.org</a>> On Behalf Of Y Song via cfe-<br>
>>>>>> dev<br>
>>>>>> Sent: Friday, June 19, 2020 11:40 AM<br>
>>>>>> To: Hal Finkel <<a href="mailto:hfinkel@anl.gov" target="_blank">hfinkel@anl.gov</a>><br>
>>>>>> Cc: Andrii Nakryiko <<a href="mailto:andrii.nakryiko@gmail.com" target="_blank">andrii.nakryiko@gmail.com</a>>; Clang Dev <cfe-<br>
>>>>>> <a href="mailto:dev@lists.llvm.org" target="_blank">dev@lists.llvm.org</a>>; Alexei Starovoitov <<a href="mailto:ast@kernel.org" target="_blank">ast@kernel.org</a>><br>
>>>>>> Subject: [EXT] Re: [cfe-dev] Disable certain llvm optimizations at clang<br>
>>>>>> frontend<br>
>>>>>><br>
>>>>>> Just to be more specific about what transformations we want to disable:<br>
>>>>>> (1). Undo a transformation in InstCombine/InstCombineAndOrXor.cpp<br>
>>>>>> (<a href="https://reviews.llvm.org/D72787" rel="noreferrer" target="_blank">https://reviews.llvm.org/D72787</a>)<br>
>>>>>> This transformation created a new variable "var.off" for comparison but<br>
>>>>>> using the original variable "var" later on. The kernel verifier does not<br>
>>>>>> have a better range of "var" at its use place and this may cause<br>
>>>>>> verification failure.<br>
>>>>>><br>
>>>>>> To generalize, BPF prefers the single variable is refined and used later<br>
>>>>>> for each verification. New variable can be created and used for further<br>
>>>>>> value range refinement, but all later usage of old variable should be<br>
>>>>>> replaced with the new variable.<br>
>>>>>> (2). Prevent speculative code motion<br>
>>>>>> (<a href="https://reviews.llvm.org/D82112" rel="noreferrer" target="_blank">https://reviews.llvm.org/D82112</a>, <a href="https://reviews.llvm.org/D81859" rel="noreferrer" target="_blank">https://reviews.llvm.org/D81859</a>)<br>
>>>>>> If the code in the original program may not execute under<br>
>>>>>> certain conditions,<br>
>>>>>> we could like the code not to execute in the final byte code<br>
>>>>>> under the same<br>
>>>>>> condition, regardless of whether it is safe to execute or not.<br>
>>>>>><br>
>>>>>> I guess we could have two attributes here:<br>
>>>>>> - "refine-value-range-with-original-var" (false by default, true for BPF)<br>
>>>>>> - "no-speculative-code-motion" (false by default, true for BPF)<br>
>>>>> Looking at this, I'm worried that there's a bigger disconnect between the way LLVM reasons about IR, vs. what the BPF verifier can actually handle. It isn't reasonable to blindly disable code motion; that's chasing after the symptoms of the problem, not solving the underlying issue.<br>
>>>>><br>
>>>>> If I understand correctly, the underlying issue is that BPF has a notion of a "CFG guard": you can have a condition and a branch, and instructions inside the region guarded by that branch have different rules from instructions outside that region, based on that condition. Both clang and LLVM optimizations are completely<br>
>>> bpf verifier is briefly described in the comments here:<br>
>>> <a href="https://github.com/torvalds/linux/blob/master/kernel/bpf/verifier.c#L38" rel="noreferrer" target="_blank">https://github.com/torvalds/linux/blob/master/kernel/bpf/verifier.c#L38</a><br>
>>><br>
>>> It is a path sensitive verifier. It will go through all possible paths<br>
>>> to ensure all memory accesses are safe.<br>
>>> The verifier is able to carry refined info across function calls.<br>
>>> The condition may refine a value range, e.g.,<br>
>>> unsigned a = foo(); /* " a" is an unsigned int, could be any value<br>
>>> 0 <= "a" <= UINT_MAX */<br>
>>> if (a < 128) {<br>
>>> /* varifier concludes "a" range now is [0, 127]. */<br>
>>> ... * (p + a) ... /* test whether *(p + [0, 127]) is legal or not.<br>
>>> }<br>
>>><br>
>>> But verifier is not smart enough to do backtracking now, so if you have code<br>
>>> unsigned a = foo();<br>
>>> b = a + 10;<br>
>>> if (b < 128) {<br>
>>> ... * (p + a) ... /* b will be [10, 127], but a is [0, UINT_MAX]. */<br>
>>> }<br>
>>> The verifier may reject the above code as (p + a) may be out of bound<br>
>>> of legal permitted memory range.<br>
>>><br>
>>> Why verifier did not then implement backtracking? There are a few reasons:<br>
>>> - This is actually not very common. As long as the program does not<br>
>>> have control flow like the above triggering a particular instcombine<br>
>>> transformation, the compiler actually did a good job avoiding the<br>
>>> above pattern to avoid a second copy of the same variable. But if this<br>
>>> actually happens, it often cause a lot of<br>
>>> frustration for developers as their codes are perfectly okay and they<br>
>>> often do not know how to please<br>
>>> the verifier.<br>
>>> - verifier is already very complex. Adding generic backtracking<br>
>>> will add a lot of complexity to verifier.<br>
>>> More states may need to be kept during verification process and this<br>
>>> will require more kernel memory. The verifier will become slower too.<br>
>>><br>
>>> The same for speculative code motion. Yes, the current verifier may<br>
>>> reject a pointer arithmetic (say pkt_ptr += UINT_MAX). But in really,<br>
>>> this pkt_ptr += UINT_MAX may just a speculative move and later on<br>
>>> pkt_ptr = original_legal_pkt_ptr + 10 will assign proper value before<br>
>>> actual memory access. In this case, the verifier will be wrong to<br>
>>> reject the program.<br>
>>><br>
>>> Not all speculative code motion will cause problems. Some scalar<br>
>>> speculative code motion totally fine. Or if the speculative code<br>
>>> motion still within ptr valid range, it will be fine too.<br>
>>><br>
>>> My original proposal to disable certain optimizations are a big<br>
>>> hammer, just to get it work so the developer can move on. I totally<br>
>>> agree that using func attribute to disable certain specific<br>
>>> optimizations may happen to work for some cases, but it may not work<br>
>>> for all cases since other optimizations may perform optimizations.<br>
>><br>
>> Thanks, this is helpful. One question: Is the information used by the<br>
>> verifier available at compile time? It seems like what you want here is<br>
> We have identified some compiler optimizations in instcombine and<br>
> simplifyCFG may generate verifier unfriendly code. Yes, we could do<br>
> IR analysis to identify these patterns, may not be 100% but should be<br>
> close.<br>
><br>
> Completely modelling what verifier did in llvm is complex. Verifier<br>
> interacts with kernel subsystem with helper functions and these<br>
> helper functions having some semantics utilized with the verifier.<br>
> Bringing kernel helpers to llvm does not sound a good idea.<br>
> Verifier implements a path sensitive verification framework. To replicate<br>
> in llvm probably not a good idea as there are constant changes to<br>
> verifier and it is very hard to keep them sync. So here, in llvm the<br>
> best probably<br>
> to identify known IR patterns where verifier unfriendly code may be<br>
> generated and prevent them from generating (preferred) or undo them<br>
> if already generated.<br>
<br>
<br>
Honestly, I don't think that you're thinking about this in the <br>
most-useful way. Obviously, we strive for code reuse and reduced <br>
duplication. However, imagine if this were, not a VM with a <br>
software-programmed loader, but a hardware architecture. The verifier <br>
would be part of the hardware, and it's semantics would be part of the <br>
hardware ISA. In this case, updates to the verifier would, without <br>
particular debate, necessitate changes to the compiler. When the <br>
hardware is updated, the compiler needs to be updated. Keeping the two <br>
in sync is just part of the normal compiler-development process. Will <br>
this be difficult? Maybe. But, as Eli mentioned, you can use <br>
conservative approximations where modeling the details doesn't matter <br>
too much. Regardless, at a high level, yes, you should duplicate the <br>
relevant parts of the verifier logic in the backend. You can then use <br>
that to move code that is correct only assuming speculation is <br>
side-effect free (consistent with LLVM's semantics) into the blocks with <br>
the eventual users (which is required by the verifier).<br>
<br>
The idea that you're doing to disable particular optimizations to fix <br>
this problem is not robust or sound. Any number of other transformations <br>
could introduce similar issues. Are you going to constantly audit <br>
everything? The input source-language code could have the same problem. <br>
You can, of course, say that the source-language inputs are not really <br>
C/C++ code, but some similar-looking language with different semantics, <br>
but frankly, you'll be fighting an uphill battle the whole way. I <br>
understand that you already do this to some extent (e.g., by disallowing <br>
unbounded loops), but these semantics are far more pervasive (it seems <br>
to me, anyway, as qualitatively different). As you can probably tell, I <br>
really don't recommend trying to go down this path.<br>
<br>
Thanks again,<br>
<br>
Hal<br>
<br>
<br>
><br>
><br>
>> for the backend to iterate over the IR, perform the verification, and in<br>
>> cases where it would fail, push the relevant parts of the use-def chain<br>
>> down into the blocks with their eventual users.<br>
> This is a good idea. The issue is instCombine and SimplifyCFG happens<br>
> before any backend IR passes. So backend will have to undo some of<br>
> these optimizations. For speculative code motion, the backend probably<br>
> hard to do since it does not even know the code is speculative generated.<br>
><br>
> Can we prevent these optimizations happening in the first place?<br>
><br>
> In addition to my previous three ideas:<br>
> - function attribute to state intention,<br>
> - clang IR generation with certain variable barriers.<br>
> - new TLI functions used by bpf backend and additional bpf backend<br>
> IR phases.<br>
><br>
> Two more ideas here:<br>
> (1). Is it possible to implement an generic IR pass (before InstCombine<br>
> and SimplifyCFG) which takes some hints from backend to modify IR<br>
> to favor BPF verifier?<br>
> (2). is it possible that clang frontend provides a hook for target to<br>
> provide a pass manager or at least having a say<br>
> to the pass manager? This way, target may disable certain passes?<br>
> I realize this is a big hammer approach. But Considering here that our<br>
> main goal is to improve development experience and performance<br>
> is not a goal, maybe it is still acceptable?<br>
><br>
>> -Hal<br>
>><br>
>> ...<br>
>><br>
>><br>
>>>> -Hal<br>
>>>><br>
>>>><br>
>>>> --<br>
>>>> Hal Finkel<br>
>>>> Lead, Compiler Technology and Programming Languages<br>
>>>> Leadership Computing Facility<br>
>>>> Argonne National Laboratory<br>
>>>><br>
>> --<br>
>> Hal Finkel<br>
>> Lead, Compiler Technology and Programming Languages<br>
>> Leadership Computing Facility<br>
>> Argonne National Laboratory<br>
>><br>
-- <br>
Hal Finkel<br>
Lead, Compiler Technology and Programming Languages<br>
Leadership Computing Facility<br>
Argonne National Laboratory<br>
<br>
_______________________________________________<br>
cfe-dev mailing list<br>
<a href="mailto:cfe-dev@lists.llvm.org" target="_blank">cfe-dev@lists.llvm.org</a><br>
<a href="https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev" rel="noreferrer" target="_blank">https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev</a><br>
</blockquote></div>