[PATCH] D67399: [ARM] Follow AACPS standard for volatile bitfields

Diogo N. Sampaio via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Sep 11 01:56:01 PDT 2019


dnsampaio marked 4 inline comments as done.
dnsampaio added a comment.

Hi @jfb. In an example such as:

  typedef struct { int a : 1; int b : 16; } S;
  extern int expensive_computation(int v);
  void foo(volatile S* s){
    s->b = expensive_computation(s->b);
  }

There is no guarantee that `s->a` is not modified during the expensive computation, so its value must be obtained just before writing the `s->b` value, as `a` and `b` share the same memory location. This is already done by LLVM. Indeed, the exact output would be

  define void @foo(%struct.S* %S) local_unnamed_addr #0 {
  entry:
    %0 = bitcast %struct.S* %S to i32*
    %bf.load = load volatile i32, i32* %0, align 4
    %bf.shl = shl i32 %bf.load, 15
    %bf.ashr = ashr i32 %bf.shl, 16
    %call = tail call i32 @expensive_computation(i32 %bf.ashr) #2
  %bf.load1 = load volatile i32, i32* %0, align 4 ; <<<== Here it obtains the value of s->a in order to restore it.
    %bf.value = shl i32 %call, 1
    %bf.shl2 = and i32 %bf.value, 131070
    %bf.clear = and i32 %bf.load1, -131071
    %bf.set = or i32 %bf.clear, %bf.shl2
    store volatile i32 %bf.set, i32* %0, align 4
    ret void
  }

The extra loads here are required to keep the number of times a volatile bitfield is read uniform, regardless of whether or not it shares memory with other data.

We could put this behind a flag, such as `-faacps-volatilebitfield`, disabled by default.

The other point not conformant to the AAPCS is the width of the loads/stores used to access bitfields. They should be the width of the container, provided that such an access would not overlap non-bitfield members. Do you have any idea where that could be computed? I imagine it would be when computing the alignment of the elements of the structure, where I could check whether performing a load of the entire container width would overlap with other elements. Could you point me to where that is done?



================
Comment at: clang/test/CodeGen/aapcs-bitfield.c:541
 // BE-NEXT:    [[TMP0:%.*]] = getelementptr inbounds [[STRUCT_ST9:%.*]], %struct.st9* [[M:%.*]], i32 0, i32 0
+// BE-NEXT:    [[BF_LOAD:%.*]] = load volatile i8, i8* [[TMP0]], align 4
 // BE-NEXT:    store volatile i8 1, i8* [[TMP0]], align 4
----------------
jfb wrote:
> These are just extra loads? Why?
Yes, these are just extra loads. As the AAPCS describes, every write requires performing a load as well, even if all bits of the volatile bitfield are going to be replaced.


================
Comment at: clang/test/CodeGen/aapcs-bitfield.c:552
 // LE-NEXT:    [[TMP0:%.*]] = getelementptr inbounds [[STRUCT_ST9:%.*]], %struct.st9* [[M:%.*]], i32 0, i32 0
 // LE-NEXT:    [[BF_LOAD:%.*]] = load volatile i8, i8* [[TMP0]], align 4
 // LE-NEXT:    [[INC:%.*]] = add i8 [[BF_LOAD]], 1
----------------
jfb wrote:
> Why isn't this load sufficient?
Technically speaking, that is the load for reading the bitfield, not the load required for writing it.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67399/new/

https://reviews.llvm.org/D67399
