[PATCH] D34515: [ARM] Materialise some boolean values to avoid a branch

Roger Ferrer Ibanez via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jul 6 09:55:08 PDT 2017


rogfer01 added a comment.

Hi @efriedma,

thanks for pointing me to the `ADDCARRY`/`SUBCARRY` nodes. I tried to use them but I ran into serious problems (maybe due to some inexperience on my side working with SelectionDAG). First if I want to use them I assume I have to make them legal for the ARM back end, this means that I have to lower them somehow. After the short discussion in the list about the semantics of `SUBCARRY` I made `ISD::ADDCARRY` to lower to `ARMISD::ADDC` and `ISD::SUBCARRY` to lower to `ARMISD::SUBC` plus doing an extra `(ISD::ADD 1, c)` to the carry result, to change from borrow semantics (`ISD::SUBCARRY` semantics) to ARM carry semantics.

One side effect of doing this is that many operations that were previously legalized using `ISD::ADD` and `ISD::ADDC` like `i64 add`/`sub` are now legalized using `ISD::ADDCARRY`. And one generic combiner for `ISD::ADDCARRY` when it has a zero carry transforms it into `ISD::UADDO` (similarly `ISD::USUBO` for `ISD::SUBCARRY` without borrow). This, unfortunately, leads to very poor (and probably wrong) code involving `msr` / `mrs` instructions.

I tried to fix some of these issues but still it is very easy that sometimes the carry value is moved to some general purpose register and the backend decides to use `mrs`. For instance a case like this

  define i64 @f3(i32 %al, i32 %bl) {
  entry:
      ; unsigned wide add
      %aw = zext i32 %al to i64
      %bw = zext i32 %bl to i64
      %cw = add i64 %aw, %bw
      ; ch == carry bit
      %ch = lshr i64 %cw, 32
  	%dw = add i64 %ch, %bw
  	ret i64 %dw
  }

leads to something like this before instruction selection

  SelectionDAG has 13 nodes:
    t0: ch = EntryToken
    t4: i32,ch = CopyFromReg t0, Register:i32 %vreg1
    t17: ch,glue = CopyToReg t0, Register:i32 %R0, t28
    t19: ch,glue = CopyToReg t17, Register:i32 %R1, t28:1, t17:1
        t2: i32,ch = CopyFromReg t0, Register:i32 %vreg0
      t29: i32,i32 = ARMISD::ADDC t2, t4
    t28: i32,i32 = ARMISD::ADDE t4, Constant:i32<0>, t29:1
    t20: ch = ARMISD::RET_FLAG t19, Register:i32 %R0, Register:i32 %R1, t19:1

but the `t19: ch,glue = CopyToReg t17, Register:i32 %R1, t28:1, t17:1` ends being emitted as `mrs r1, apsr`, below is the final code generated by `llc -mtriple=armv6t2-eabi` which I think is really wrong as the apsr register has more bits than the carry.

  f3:
          adds    r0, r0, r1
          adcs    r0, r1, #0
          mrs     r1, apsr
          bx      lr

So my understanding is that using these nodes, while something that has to be done in the future, is not an easy addition to this change.

The upside is that it is possible to express the original change of this patch using the `ADDCARRY` and `SUBCARRY` nodes, so this may still be worth the effort.

Do you agree with my analysis or I am missing something very obvious here. I can upload  a phab with the current change as it is if it helps.

Thank you very much,


https://reviews.llvm.org/D34515





More information about the llvm-commits mailing list