[cfe-dev] Disable certain llvm optimizations at clang frontend

Y Song via cfe-dev cfe-dev at lists.llvm.org
Sun Jun 21 00:30:49 PDT 2020


On Fri, Jun 19, 2020 at 6:20 PM Hal Finkel <hfinkel at anl.gov> wrote:
>
>
> On 6/19/20 3:08 PM, Eli Friedman wrote:
> > (Reply inline)
> >
> >> -----Original Message-----
> >> From: cfe-dev <cfe-dev-bounces at lists.llvm.org> On Behalf Of Y Song via cfe-
> >> dev
> >> Sent: Friday, June 19, 2020 11:40 AM
> >> To: Hal Finkel <hfinkel at anl.gov>
> >> Cc: Andrii Nakryiko <andrii.nakryiko at gmail.com>; Clang Dev <cfe-
> >> dev at lists.llvm.org>; Alexei Starovoitov <ast at kernel.org>
> >> Subject: [EXT] Re: [cfe-dev] Disable certain llvm optimizations at clang
> >> frontend
> >>
> >> Just to be more specific about what transformations we want to disable:
> >>    (1). Undo a transformation in InstCombine/InstCombineAndOrXor.cpp
> >>          (https://reviews.llvm.org/D72787)
> >>          This transformation creates a new variable "var.off" for the
> >>          comparison but keeps using the original variable "var" later on.
> >>          The kernel verifier does not have a refined range for "var" at
> >>          its use site, and this may cause verification failure.
> >>
> >>          To generalize, BPF prefers that a single variable is refined and
> >>          then used later for verification. A new variable can be created
> >>          and used for further value range refinement, but all later uses
> >>          of the old variable should then be replaced with the new variable.
> >>     (2). Prevent speculative code motion
> >>            (https://reviews.llvm.org/D82112, https://reviews.llvm.org/D81859)
> >>           If the code in the original program may not execute under
> >>           certain conditions, we would like it not to execute in the
> >>           final byte code under the same conditions, regardless of
> >>           whether it is safe to execute or not.
> >>
> >>      I guess we could have two attributes here:
> >>         - "refine-value-range-with-original-var" (false by default, true for BPF)
> >>         - "no-speculative-code-motion" (false by default, true for BPF)
> > Looking at this, I'm worried that there's a bigger disconnect between the way LLVM reasons about IR, vs. what the BPF verifier can actually handle.  It isn't reasonable to blindly disable code motion; that's chasing after the symptoms of the problem, not solving the underlying issue.
> >
> > If I understand correctly, the underlying issue is that BPF has a notion of a "CFG guard": you can have a condition and a branch, and instructions inside the region guarded by that branch have different rules from instructions outside that region, based on that condition.  Both clang and LLVM optimizations are completely

The bpf verifier is briefly described in the comments here:
  https://github.com/torvalds/linux/blob/master/kernel/bpf/verifier.c#L38

It is a path-sensitive verifier: it walks all possible paths to ensure
all memory accesses are safe, and it can carry refined value
information across function calls.
A condition may refine a value range, e.g.,
   unsigned a = foo();  /* "a" is an unsigned int, could be any value
                           0 <= "a" <= UINT_MAX */
   if (a < 128) {
     /* verifier concludes "a" range now is [0, 127]. */
     ... *(p + a) ...    /* test whether *(p + [0, 127]) is legal or not. */
   }

But the verifier is not smart enough to do backtracking, so if you have code like
   unsigned a = foo();
   unsigned b = a + 10;
   if (b < 128) {
       ... *(p + a) ...    /* b will be [10, 127], but a is [0, UINT_MAX]. */
   }
the verifier may reject the above code, since (p + a) may be out of
the bounds of the permitted memory range.
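
To illustrate where such a shape can come from: the InstCombine
transformation mentioned in (1) above folds a pair of range checks into
a single unsigned compare on a fresh value (the "var.off" variable). A
rough sketch (the constants and signedness are illustrative, not taken
from D72787):
   int a = foo();
   if (a >= 10 && a <= 137) {   /* two compares on "a" in the source */
       ... *(p + a) ...         /* memory access still uses "a"      */
   }
   /* After InstCombine this may effectively become
    *    a.off = a - 10;
    *    if (a.off <= 127)      // one unsigned compare on "a.off"
    *        ... *(p + a) ...   // but "a" itself is still used here
    * so the verifier only learns a range for "a.off", not for "a".
    */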

Why doesn't the verifier implement backtracking? There are a few reasons:
   - It is actually not very common. As long as the program does not
have control flow like the above that triggers this particular
instcombine transformation, the compiler generally does a good job of
avoiding a second copy of the same variable. But when it does happen,
it causes a lot of frustration for developers: their code is perfectly
okay and they often do not know how to please the verifier.
   - The verifier is already very complex. Adding generic backtracking
would add a lot of complexity to it. More states would need to be kept
during the verification process, which requires more kernel memory,
and the verifier would become slower too.

The same goes for speculative code motion. Yes, the current verifier
may reject a pointer arithmetic (say pkt_ptr += UINT_MAX). But in
reality, this pkt_ptr += UINT_MAX may just be a speculatively moved
operation, and later on pkt_ptr = original_legal_pkt_ptr + 10 will
assign the proper value before the actual memory access. In this case,
the verifier is wrong to reject the program.
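
To make this concrete, here is a hypothetical sketch (the names and
constants are made up) of how speculative code motion can present the
verifier with pointer arithmetic that never executes out of range in
the original program:
   /* original program: the add only happens when "off" is in range */
   if (off <= 60)
       p = pkt_ptr + off;
   else
       p = pkt_ptr + 10;
   ... *p ...
   /* After speculative code motion, both adds may be computed up front
    * and one result selected afterwards:
    *    p1 = pkt_ptr + off;          // verifier sees this on every path
    *    p2 = pkt_ptr + 10;
    *    p  = (off <= 60) ? p1 : p2;
    * The program behaves the same, but the verifier now sees a pointer
    * add with a possibly unbounded "off" unconditionally and may
    * reject it.
    */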

Not all speculative code motion causes problems. Some scalar
speculative code motion is totally fine, and if the speculative code
motion stays within the pointer's valid range, it is fine too.

My original proposal to disable certain optimizations is a big
hammer, just to get things working so the developer can move on. I
totally agree that using a function attribute to disable certain
specific optimizations may happen to work for some cases, but it may
not work for all cases, since other passes may perform similar
transformations.

Option 1:
=======

One way to do this is to provide backend hooks to influence IR
optimizations. We do not need to block all control flow optimization
or speculative code motion, but it would be great if the backend could
have a say. Currently, getUserCost() is a hook BPF can utilize. We
need another one for the speculative code motion part
(https://reviews.llvm.org/D81859). Hopefully this will be enough.
We do not have a hook for InstCombine, so the BPF backend needs to do
pattern matching and undo the transformation.

This is probably the simplest approach. It should cover most cases I
am aware of, but yes, when llvm optimizations change in the future,
the BPF backend may need to change too.

Option 2:
=======

Now coming to a proper IR design to encode the constraints desired by
the BPF verifier. What we really want is to "serialize" variable uses
to avoid backtracking and speculative code motion. The following is a
specific example:
   a = bar();
   if (a >= 0) {
      if (a <= 16) {
          ...
      }
   }
We would like to generate code like
// var_barrier(a) defines and uses a
#define var_barrier(a) asm volatile ("" : "=r"(a) : "0"(a))
    a = bar();
    var_barrier(a);
    if (a >= 0) {
        var_barrier(a);
        if (a <= 16) {
            ...
        }
    }
The above code effectively disables control flow optimization across
the barriers and generates verifier-friendly code: because the empty
inline asm both takes "a" as input and produces it as output, the
optimizer has to assume "a" may change at each barrier, so it cannot
merge the two compares or move code across them.

For the speculative code motion part,
  a = bar();
  if (a < 10)  {
      ... *(p + a) ...
  }
We can do
   a = bar();
   var_barrier(a);
   if (a < 10) {
       var_barrier(a);
       ... *(p + a) ...
   }
This will prevent speculative code motion as well.

The above is one way a bpf programmer can prevent certain llvm
optimizations by hand. I am thinking about how we could do this
automatically and encode it in the IR. Probably during clang code
generation, the BPF target can insert inline asm code like the above,
as directed by some bpf target options. This way, downstream llvm
optimizations will not need any change. BPF may provide two target
options: one to "serialize" all local scalar variables, and one to
"serialize" only some local scalar variables based on an attribute like
   int a __attribute__((no_reordering));
"no_reordering" is probably not a good name based on the above example...

> > unaware of this.  Currently, you're working around the problem in various ways, through source-code hacks and pattern-matching hacks.  But still, nothing is really aware this is happening.
> >
> > I'm not sure exactly what the solution looks like here, but a "no-speculative-code-motion" attribute is not heading in the right direction.  If we want to actually make code generation reliable, we need to have rules in LangRef we can point to that describe what operations are legal/illegal.  And they need to be phrased in terms of the IR itself, independent of any specific transform.  This probably involves modifying the LLVM IR generated by clang in a more fundamental way than just adding new attributes.
> >
> > -Eli
>
>
> +1 to all of this. We need to have well-defined semantics at the IR level.
>
> A lot of these problems seem to come from the fact that the verifier is
> verifying each value at the point of calculation instead of at the point
> each value contributes to an operation with side effects. If the

For speculative code motion, yes, see my above explanation.

> verifier cannot be improved in this regard, it seems like you need to

See my explanation above for why we want to avoid changing the
verifier and instead seek llvm's help to generate verifier-friendly code.

> represent these side effects at the IR level, and perhaps other aspects
> of the state of the verifier. For example, we might essentially add the
> inaccessiblememonly attribute to condition calculations and other
> relevant intrinsics. Is the verifier worried about all arithmetic
> somehow, or only pointer arithmetic (from GEPs)?

I have some more detailed explanation above; hopefully it helps with a
proper IR design. Pointer arithmetic is the verifier's checking point.
The offset calculation, which may be all scalar arithmetic, may
influence the pointer arithmetic verification depending on its value range.

>
>   -Hal
>
>
> --
> Hal Finkel
> Lead, Compiler Technology and Programming Languages
> Leadership Computing Facility
> Argonne National Laboratory
>

