[PATCH] D18962: [SystemZ] README: remove an implemented idea, add some new ones.

Mon Apr 11 07:13:02 PDT 2016

koriakin added a comment.

In http://reviews.llvm.org/D18962#396965, @uweigand wrote:

> Agreed on the trap instruction.
>
> As to overflow, there is already this text in README.txt:
>
> > ADD LOGICAL WITH SIGNED IMMEDIATE could be useful when we need to
> > produce a carry.  SUBTRACT LOGICAL IMMEDIATE could be useful when we
> > need to produce a borrow.  (Note that there are no memory forms of
> > ADD LOGICAL WITH CARRY and SUBTRACT LOGICAL WITH BORROW, so the high
> > part of 128-bit memory operations would probably need to be done
> > via a register.)
>
>
> Does this cover what you're referring to?  In any case, this should probably be merged there.

No, that text is about carry, not overflow.  By overflow I mean signed overflow, as used e.g. by -ftrapv:

  int f(int a, int b) {
          return a + b;
  }

With -ftrapv, this compiles to:

  define signext i32 @f(i32 signext %a, i32 signext %b) #0 {
  entry:
    %0 = tail call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
    %1 = extractvalue { i32, i1 } %0, 1
    br i1 %1, label %trap, label %cont

  trap:                                             ; preds = %entry
    tail call void @llvm.trap() #2
    unreachable

  cont:                                             ; preds = %entry
    %2 = extractvalue { i32, i1 } %0, 0
    ret i32 %2
  }

Which compiles to this monstrosity:

  f:                                      # @f
  # BB#0:                                 # %entry
          stmg    %r14, %r15, 112(%r15)
          aghi    %r15, -160
          chi     %r3, 0
          ipm     %r0
          xilf    %r0, 4294967295
          risbg   %r0, %r0, 63, 191, 36
          chi     %r2, 0
          ipm     %r1
          xilf    %r1, 4294967295
          risbg   %r1, %r1, 63, 191, 36
          cr      %r1, %r0
          ipm     %r0
          afi     %r0, -268435456
          ar      %r2, %r3
          chi     %r2, 0
          ipm     %r3
          xilf    %r3, 4294967295
          risbg   %r3, %r3, 63, 191, 36
          cr      %r1, %r3
          ipm     %r1
          afi     %r1, 1879048192
          nr      %r1, %r0
          srl     %r1, 31
          cije    %r1, 1, .LBB0_2
  # BB#1:                                 # %cont
          lgfr    %r2, %r2
          lmg     %r14, %r15, 272(%r15)
          br      %r14
  .LBB0_2:                                # %trap
          brasl   %r14, abort@PLT
  .Lfunc_end0:

While it could be like this:

  f:
      stmg    %r14, %r15, 112(%r15)
      aghi    %r15, -160
      ar      %r2, %r3
      jo      .Lfail
      lgfr    %r2, %r2
      lmg     %r14, %r15, 272(%r15)
      br      %r14
  .Lfail:
      brasl   %r14, abort@PLT

Combined with trap support, we could get that down to 4 instructions (ar, jo .+2, lgfr, br).

> As to SRDL etc., I think you're referring to 128-bit shifts?  These are just one instance of a more general problem: the back-end currently does not handle i128 *at all*, it is marked as illegal type.  At some point, we should probably make i128 legal, and add optimal code gen for all the operations on that type, including shifts, but also the rest of them.  (In particular, on z13 we should also use vector instructions e.g. for 128-bit add, subtract, shift.)  This would also be a prerequisite for implementing the 16-byte atomics that are mentioned a couple of lines earlier in README.txt.

Oops, never mind... I've just noticed that these instructions are actually 32-bit only (the "double" refers to 64 bits), so they are rather useless on a 64-bit target.  I'll remove this entry.

Repository:
  rL LLVM

http://reviews.llvm.org/D18962