[llvm-commits] Bad COPY removal in register coalescer

Mon Oct 29 09:34:12 PDT 2012

Hi [Jakob],

I've found a bug in the Register Coalescer, but am not really sure how
to fix it. Hopefully you can help!

The short version is that the coalescer decides to coalesce (via
adjustCopiesBackFrom()) a COPY that it shouldn't.

The copy is this:

  %vreg31<def> = COPY %vreg24; QPR:%vreg31,%vreg24

The code calls isCoalescable() with this argument:

  %vreg24:dsub_0<def> = COPY %vreg31:dsub_0; QPR:%vreg24,%vreg31

... which returns true.

Unless I'm being daft, it can't coalesce these copies, because the other
subregister (dsub_1) is not defined by the ACopyMI. Indeed, this causes
a codegen fault where a dsub register is clobbered.
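
To make the concern concrete, here is a minimal sketch of the coverage argument. It is a toy model, not LLVM's API: the names LaneMask, kDsub0, kDsub1, coversWholeRegister and staleLanes are all made up for illustration, and each bit simply stands for one 64-bit half of a 128-bit Q register.

```cpp
#include <cassert>
#include <cstdint>

// Toy model (hypothetical names, not LLVM types): one bit per 64-bit half
// of a 128-bit Q register.
using LaneMask = std::uint8_t;

constexpr LaneMask kDsub0 = 0x1;  // low D half
constexpr LaneMask kDsub1 = 0x2;  // high D half
constexpr LaneMask kWholeQ = kDsub0 | kDsub1;

// A copy can only stand in for a full-register copy if it writes every
// lane of the destination register.
constexpr bool coversWholeRegister(LaneMask writtenLanes) {
  return (writtenLanes & kWholeQ) == kWholeQ;
}

// Lanes that keep a stale value if a partial copy is treated as a full one.
constexpr LaneMask staleLanes(LaneMask writtenLanes) {
  return static_cast<LaneMask>(kWholeQ & ~writtenLanes);
}
```

Under this model, a copy that writes only dsub_0 gives coversWholeRegister(kDsub0) == false and staleLanes(kDsub0) == kDsub1, which is the half that ends up clobbered.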

I *think* the code at RegisterCoalescer.cpp:354 should probably handle it:

  } else {
    // DstReg is virtual.
    if (DstReg != Dst)
      return false;
    // Registers match, do the subregisters line up?
    return compose(TRI, SrcIdx, SrcSub) == compose(TRI, DstIdx, DstSub);
  }

But those compositions don't sit quite right with me, and I don't really
know how to read them so that they make sense, even after reading the
documentation for TRI::composeSubRegIndices many times.
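
One possible reading of the comparison, as a toy sketch rather than LLVM's actual code. The assumptions here are mine: the enum values are arbitrary, the small composeSubRegIndices table only models the ARM Q/D/S layout (a Q register holds dsub_0 and dsub_1, and each D half holds two S registers), the compose wrapper treats index 0 as the identity, and SrcIdx and DstIdx are both taken to be 0 because the pair being coalesced is a full-register copy. The contract being modelled is the documented one: if R:a:b is the same register as R:c, composing a and b gives c.

```cpp
#include <cassert>

// Toy sub-register indices for one 128-bit Q register. The values are
// arbitrary; 0 means "no sub-register".
enum SubIdx : unsigned {
  NoSubReg = 0, dsub_0, dsub_1, ssub_0, ssub_1, ssub_2, ssub_3
};

// Toy stand-in for TRI::composeSubRegIndices(a, b): returns c such that
// R:a:b names the same register as R:c. Only "D half, then S half" is
// modelled; anything else returns NoSubReg.
inline unsigned composeSubRegIndices(unsigned a, unsigned b) {
  if (a == dsub_0 && (b == ssub_0 || b == ssub_1)) return b;
  if (a == dsub_1 && b == ssub_0) return ssub_2;
  if (a == dsub_1 && b == ssub_1) return ssub_3;
  return NoSubReg;
}

// Assumed shape of the compose() helper: index 0 composes as the identity.
inline unsigned compose(unsigned a, unsigned b) {
  if (!a) return b;
  if (!b) return a;
  return composeSubRegIndices(a, b);
}

// The virtual-register comparison from the quoted snippet, in isolation.
inline bool subRegistersLineUp(unsigned SrcIdx, unsigned SrcSub,
                               unsigned DstIdx, unsigned DstSub) {
  return compose(SrcIdx, SrcSub) == compose(DstIdx, DstSub);
}
```

With SrcIdx == DstIdx == 0 and SrcSub == DstSub == dsub_0, both sides reduce to dsub_0, so subRegistersLineUp(0, dsub_0, 0, dsub_0) is true. In this model the check only asks whether the two sides name the same part of the register, not whether that part is the whole register, which would be consistent with isCoalescable() returning true for the dsub_0 copy.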

I'm attaching the testcase I'm using, in case it helps. The machine code
goes bad (a copy that shouldn't be removed gets removed) just after the
second call to adjustCopiesBackFrom().

[The attached testcase loads a <4 x float> vector, extracts its four
elements, and returns their sum.]

Cheers,

James
-------------- next part --------------
define float @f(<4 x float>* %pinput) {
  %input = load <4 x float>* %pinput, align 16
  %e0 = extractelement <4 x float> %input, i32 0
  %e1 = extractelement <4 x float> %input, i32 1
  %e2 = extractelement <4 x float> %input, i32 2
  %e3 = extractelement <4 x float> %input, i32 3
  %sum0 = fadd float %e0, %e1
  %sum1 = fadd float %sum0, %e2
  %sum2 = fadd float %sum1, %e3

  ret float %sum2
}