[PATCH] D147078: [LICM][BPF] Disable hoistMinMax() optimization for BPF target

Wed Mar 29 12:07:04 PDT 2023

anakryiko added a comment.

In D147078#4231608 <https://reviews.llvm.org/D147078#4231608>, @nikic wrote:

> In D147078#4231270 <https://reviews.llvm.org/D147078#4231270>, @anakryiko wrote:
>
>>> If so, I think it would make the most sense to add an undo transform in the BPF backend.
>>
>> Genuine curiosity (and keep in mind I'm no compiler expert at all), but why doing transformation and then customly undoing it in specific BPF backend would be a preferable solution to allowing backends to turn off some known-to-hurt transformations in the first place? It seems more complicated and error-prone to do the dance of undoing, rather than have a generic interface to prevent transformation in the first place.
>>
>> Note, this is not the only transformation that hurts BPF verification, @yonghong-song previously added "undo transformation" logic, but it seems like a not very sustainable and very convoluted solution. So we are asking if we can have a simpler and more reliable opt-out for some small set of transformations.
>
> Just blocking a transform is generally insufficient, because it does not handle the case where the code uses the problematic pattern in the first place. If the input code already contained the `x < min(y, z)` pattern, it would fail the verifier and converting it into `x < y && y < z` would be necessary to make it pass.

This is normally the problem, because people writing BPF code are aware of such limitations and do write input C code that should pass BPF verification (e.g., no one would do `x < min(y,z)`, it is always written as `x < y && x < z`, where either y or z is a constant; that makes this code verifiable to BPF verifier in kernel).

So the issue here is that Clang transforms original code to something that's not verifiable, with user (programmer) having no control. And that's what we are trying to address.

We have or had similar issues with few other transformations. E.g., if I remember correctly, typical (and, if not transformed, correctly verifiable) code:

  if (x >= y && x < z) { /* use x */ }

would be transformed to

  x` = x - y;
  if (x` < (z - y)) { /* use **original** x */ }

which breaks user code. Users have to do all sorts of tricks just to prevent this transformation.

Another one that constantly causes problems compiler transforming this:

  struct some_type *t = ...;
  if (t != NULL)
    x = t->some_field

to reordered low-level assembly that does offset adjustment before the NULL check, again breaking BPF verification for no good reason (as far as users are concerned). Roughly in the following manner (pseudo C code, of course, just to demonstrate):

  struct some_type *t = ...;
  int *field_ptr = &t->some_field; /* this is too early, verifier rejects this */
  if (t != NULL)
      x = *field_ptr

> If certain canonicalization transforms are undesirable for a specific target, this is always addressed by adding backend undo transforms, not by disabling canonicalizations. Use of TTI hooks inside canonicalization passes is generally rejected as a matter policy -- you need a very good case for that (e.g. an undo transform being impossible for some reason).

As I said, not a compiler expert and I don't know LLVM/Clang internals, but this pattern matching + undoing transform seems very error-prone and seems like always leaving some corner cases not covered. So while I understand overall desire to minimize the list of optional transforms, it still seems much cleaner and resulting in much simpler overall LLVM/Clang code to just prevent transform than do + undo it.

This is a very significant and recurring problem. People keep complaining and reporting very confusing (to them) issues, only due to some Clang transformations that just hurt BPF target without adding any benefit.

It's described as "fighting with verifier", commonly, even though it's really mostly "fighting with compiler". I know it's a bit hard to emphatize without having experienced this over and over personally or while helping multitude of users, but this is a really big pain point. Any help from LLVM community in making this go away (or at least be reduced) would be greatly appreciated by entire BPF community. Thanks!

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147078/new/

https://reviews.llvm.org/D147078