[PATCH] D103958: [WIP] Support MustControl conditional control attribute

Marco Elver via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Jun 9 16:45:08 PDT 2021


melver added a comment.

In D103958#2809145 <https://reviews.llvm.org/D103958#2809145>, @efriedma wrote:

> In D103958#2808966 <https://reviews.llvm.org/D103958#2808966>, @melver wrote:
>
>> In D103958#2808861 <https://reviews.llvm.org/D103958#2808861>, @efriedma wrote:
>>
>>> I don't like using metadata like this.  Dropping metadata should generally preserve the semantics of the code.
>>
>> Anything better for this without introducing new instructions? Would an argument be reasonable?
>
> If we really want to make it part of the branch, maybe add an intrinsic that can be used with callbr.  Not something we've done before, but the infrastructure should be mostly there.
>
> That said, I'm not sure this is the best approach.  Alternative proposal:
>
> We could add a regular intrinsic.  Just ignore the control flow at the IR level, and come up with a straight-line blob that just does the right thing.  I think we'd want to actually perform the load as part of the intrinsic, to avoid worrying about the consume dependency.  So we'd have an intrinsic "__builtin_load_with_control_dependency()". It would lower to something along the lines of  `asm("ldr %0, [%1]; cbnz %0, .+4":"=r"(dest):"r"(x):"memory");` on AArch64.  The differences between the intrinsic and just using the asm I wrote:
>
> 1. We weaken the "memory" clobber to something that more accurately matches what we need.
> 2. We add a compiler transform to check if the branch is redundant, late in the optimization pipeline, and remove it if it is.
>
> I think this produces the code you want, and it should be easier to understand and maintain.

Interesting, but probably doesn't quite work if I understood right -- however, perhaps it could solve something related (not part of this work, see below [footnote]).

Not every `READ_ONCE()` the kernel has needs to be a `load_with_control_dependency()`, which if I read it right, would happen if we just do, e.g.:

  #define READ_ONCE __builtin_load_with_control_dependency
  int x;
  void foo(void) { return READ_ONCE(x); }

And they really want to avoid introducing another set of primitives, like `READ_ONCE_CTRL()`, because if we did that, I think it'd be reasonable to upgrade all `READ_ONCE_CTRL()` to acquires and we're done (suggested by Will Deacon in [1]). Yet upgrading all `READ_ONCE()` to acquire is not acceptable in general (do note, it's not just AArch64, also POWER and Armv7).  For now, it'd be good to avoid this -- in particular, existing code like the following would become less clear or less optimal:

  x = READ_ONCE(..);
  y  = READ_ONCE(..);
  ... lots of other code ...
  if (y) { ... do other stuff ... } // <--- no control dependency here
  if (x && y) { WRITE_ONCE(..) } // <-- only want control  dependency here

Which is why the kernel folks probably wouldn't be too happy with more primitives as it likely penalizes more than with just marking the branch itself. Per [1] new load-primitives are probably a last resort assuming the compiler can deliver a nice mechanism to ensure control-dependencies remain (this work here).

Thanks.

----

[1] https://lore.kernel.org/linux-arch/20210607160252.GA7580@willie-the-truck/

[footnote] Re the "memory" clobber, Linus asks for more fine-grained asm clobber: https://lore.kernel.org/linux-arch/CAHk-=wjwXs5+SOZGTaZ0bP9nsoA+PymAcGE4CBDVX3edGUcVRg@mail.gmail.com/
If you see a way to support this, I think it'd help in other places (e.g. kernel's one-directional memory barriers).

Tangentially, per this presentation:
https://github.com/ClangBuiltLinux/plumbers-2020-slides/blob/master/LPC_2020_--_Dependency_ordering.pdf
there is another problem, which are address dependencies aka `memory_order_consume`. In reality the kernel wants every `READ_ONCE()` be something very close to `memory_order_consume`, with the compiler figuring out the optimal thing to do. Unfortunately, this is not the reality today. Paul McKenney et al. has been exploring this in: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0750r1.html -- but since addr-deps are much less likely to be optimized away, I think the kernel will do nothing about it in the near term. ctrl-deps on the other hand are more of a worry to the Linux kernel community right now.

I can't summarize this well enough here, so I would strongly recommend: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0124r6.html#Control%20Dependencies


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103958/new/

https://reviews.llvm.org/D103958



More information about the cfe-commits mailing list