[PATCH] D55720: [Intrinsic] Signed Fixed Point Saturation Multiplication Intrinsic

Thu Mar 7 17:11:47 PST 2019

leonardchan added inline comments.

================
Comment at: llvm/test/CodeGen/X86/smul_fix_sat.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc < %s -mtriple=x86_64-linux | FileCheck %s --check-prefix=X64
+; RUN: llc < %s -mtriple=i686 -mattr=cmov | FileCheck %s --check-prefix=X86
----------------
bjope wrote:
> leonardchan wrote:
> > bjope wrote:
> > > leonardchan wrote:
> > > > leonardchan wrote:
> > > > > bjope wrote:
> > > > > > The expansion is quite complicated (the splitting in four parts and detecting overflow etc).
> > > > > > Isn't there a risk that X86 is a typical target that will try to find more optimal solutions and maybe also make SMULFIXSAT legal? Then this test case might not really verify the expand code any longer?
> > > > > > 
> > > > > > On the other hand, these test cases are just jibberish to me anyway. I can't tell from looking at the checks that DAGTypeLegalizer::ExpandIntRes_MULFIX is doing the right thing. And it would not really help if using another target. Are there perhaps other ways to test DAGTypeLegalizer, such as unit tests?
> > > > > > 
> > > > > > One thing that probably can be done quite easily is to a bunch of tests using constant operands. Verifying that DAGCombiner will constant fold to the expected result after having expanded into legal operations (somehow making sure that DAGCombiner do not constant fold the SMULFIXSAT before it has been expanded, I guess someone will add such DAGCombines sooner or later). That way you might be able to get coverage for all paths through DAGTypeLegalizer::ExpandIntRes_MULFIX. Maybe this test should be in a separate test file.
> > > > > Yeah I can see how these tests are hard to read. I wasn't aware of other ways this could be tested other than making sure the codegen is the same each time. I have my own scripts with different cases to verify the output is correct, but wasn't sure of any existing widely used method of "taking my IR, running it, and verify the results".
> > > > > 
> > > > > Testing with constant operands seems to produce better looking tests for non-saturating multiplication:
> > > > > 
> > > > > ```
> > > > > define i4 @func() {
> > > > > ; X64-LABEL: func:
> > > > > ; X64:       # %bb.0:
> > > > > ; X64-NEXT:    movb $3, %al 
> > > > > ; X64-NEXT:    retq
> > > > >   %tmp = call i4 @llvm.smul.fix.i4( i4 3, i4 2 , i32 1)
> > > > >   ret i4 %tmp
> > > > > } 
> > > > > ```
> > > > > 
> > > > > where we can immediately tell the result is 3, but there's still branching in the saturating case:
> > > > > 
> > > > > ```
> > > > > define i4 @func2() {
> > > > > ; X64-LABEL: func2:
> > > > > ; X64:       # %bb.0:
> > > > > ; X64-NEXT:    xorl %eax, %eax
> > > > > ; X64-NEXT:    testb %al, %al 
> > > > > ; X64-NEXT:    movb $127, %cl 
> > > > > ; X64-NEXT:    jg .LBB1_2
> > > > > ; X64-NEXT:  # %bb.1:
> > > > > ; X64-NEXT:    movb $3, %cl 
> > > > > ; X64-NEXT:  .LBB1_2:
> > > > > ; X64-NEXT:    movb $-1, %al 
> > > > > ; X64-NEXT:    negb %al 
> > > > > ; X64-NEXT:    movb $-128, %al 
> > > > > ; X64-NEXT:    jl .LBB1_4
> > > > > ; X64-NEXT:  # %bb.3:
> > > > > ; X64-NEXT:    movl %ecx, %eax
> > > > > ; X64-NEXT:  .LBB1_4:
> > > > > ; X64-NEXT:    retq
> > > > >   %tmp = call i4 @llvm.smul.fix.sat.i4( i4 3, i4 2 , i32 1)
> > > > >   ret i4 %tmp
> > > > > }
> > > > > ```
> > > > > 
> > > > > so we can't get something as straightforward as with non-saturating.
> > > > @bjope I added another test file that covers the saturation branches in ExpandIntRes_MULFIX using constant operands, although this doesn't seem to produce anything more readable than with variable operands.
> > > Maybe it doesn't fold due to lack of constant folding for SMUL_LOHI (at least not for x86). What a pity.
> > > 
> > > I tried running the test using -mtriple=x86_64--, that at least produce code that is easier to map to the expansion.
> > > 
> > > I also tried some other targets:
> > >   -mtriple=ppc32  => looks like we get some constant folding here
> > >   -mtriple=ppc64 => asserts in llvm::SelectionDAG::transferDbgValues
> > >   -mtriple=hexagon => asserts in llvm::SelectionDAG::transferDbgValues
> > >   -mtriple=systemz => asserts in llvm::SelectionDAG::transferDbgValues
> > >   -mtriple=sparc => LLVM ERROR: Cannot select: t42: i32,i32 = addcarry t41:1, Constant:i32<0>, t90:1
> > > 
> > > (FWIW, no idea if the asserts and LLVM ERROR actually is related to your patch)
> > Updated the test to use `-mtriple=x86_64-linux`and it looks a lot more readable.
> Have you looked at the problem with asserts in llvm::SelectionDAG::transferDbgValues?
> It happens when expanding smulfixsat, so something seems to be broken regarding the legalization (depending on target used).
> 
- For `addcarry`, the problem seems to be that `ISD::ADDCARRY` is not supported on some 32 bit targets. The fix for this is just a check in `expandMUL_LOHI` to see if this operation is legal (https://reviews.llvm.org/D59119).
- For `llvm::SelectionDAG::transferDbgValues`, this is because `expandFixedPointMul` returns an empty `SDValue()` to indicate this function failed due to some unsupported operation (most likely `ISD::SMULO`). I imagine the simplest solution for this is to just `report_fatal_error` since we do not have other operations we can use to perform saturation multiplication.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55720/new/

https://reviews.llvm.org/D55720