[PATCH] D50310: Improve the legalisation lowering of UMULO

Wed Aug 8 13:42:42 PDT 2018

nagisa added a comment.

In https://reviews.llvm.org/D50310#1192833, @efriedma wrote:

> Please choose a different name for the tests; "muloti2.ll" isn't usefully indicating what the files actually test.

Does something like `umulo-legalisation-lowering.ll` sound good?

In https://reviews.llvm.org/D50310#1192833, @efriedma wrote:

> > Correctness of the algorithm was verified by exhaustively checking the output of this algorithm for overflowing multiplication of 16 bit integers against an obviously correct widening multiplication.
>
> What exactly did you verify?  Looking again, I'm pretty sure your example IR doesn't compute the correct result: you never compute the product %LHS.HI * %RHS.HI.

The computation of `%LHS.HI * %RHS.HI` is only necessary to compute the overflow bit. If multiplication was used, the check would look like this:

  %product = { iNh, i1 } @umul.with.overflow.iNh(iNh %LHS.HI, iNh %RHS.HI)
  %0 = product.0 != 0 || product.1 ; equivalent of the current %0

The `%0 = %LHS.HI != 0 && %RHS.HI != 0` is an optimisation that calculates the same information without doing the multiply. Computing the product may be more efficient for some specific bitwidths on some targets, but I found the `%LHS.HI != 0 && %RHS.HI != 0` variant to be more palatable in the general case.

---

The following Rust code is what I used to check the correctness of current algorithm exhaustively in a reasonable time-frame. I’m willing to port this test to C if there’s a demand for it.

  type Half = u8;
  type Full = u16;
  type Double = u32;

  const HALF_BITS: u32 = 8;
  const FULL_BITS: u32 = 16;

  pub fn obviously_correct(l: Full, r: Full) -> (Full, bool) {
      // Do a widening multiplication and check the high half to see if the multiplication
      // overflowed. Also correctly handles result wrapping in case of overflow.
      let doublewide = (l as Double).wrapping_mul(r as Double);
      (doublewide as Full, (doublewide >> FULL_BITS) != 0)
  }

  pub fn actual_implementation(l: Full, r: Full) -> (Full, bool) {
      let (lhs_lo, rhs_lo) = (l as Half, r as Half);
      let (lhs_hi, rhs_hi) = ((l >> HALF_BITS) as Half, (r >> HALF_BITS) as Half);

      let overflow0 = lhs_hi != 0 && rhs_hi != 0;
      let (r1, overflow1) = lhs_hi.overflowing_mul(rhs_lo);
      let (r2, overflow2) = rhs_hi.overflowing_mul(lhs_lo);
      let r3 = (lhs_lo as Full).wrapping_mul(rhs_lo as Full);
      let r4 = ((r1 as Full) << HALF_BITS).wrapping_add((r2 as Full) << HALF_BITS);
      let (r5, overflow5) = r4.overflowing_add(r3);
      (r5, overflow0 || overflow1 || overflow2 || overflow5)
  }

  pub fn main() {
      for lhs in Full::min_value()..=Full::max_value() {
          for rhs in Full::min_value()..=Full::max_value() {
              assert_eq!(obviously_correct(lhs, rhs), actual_implementation(lhs, rhs),
                         "results did not match for lhs={}, rhs={}", lhs, rhs);
          }
      }
  }

https://reviews.llvm.org/D50310