[PATCH] Fix urem instruction with power of two to compute known bit.

Fri May 30 08:08:58 PDT 2014

Sorry, I missed this.

I committed the patch (r209902 ). I just updated the test to check for
assembly (as opposed to just generating it).

Normally in LLVM we test that the result is valid, not exactly the
steps that a transformation is doing. Doing that gives more freedom at
modifying the passes in the future.

We do try to write small tests and test the smallest part of llvm that
we can. It is really unfortunate with codegen that such a large piece
of code needs to be run, but I don't think that checking deubg output
is the correct way to fix that.

On 30 May 2014 01:12, Andrey Kuharev <an.kuharev at gmail.com> wrote:
> Changed patch according to changed trunk
>
> http://reviews.llvm.org/D3823
>
> Files:
>   lib/CodeGen/SelectionDAG/SelectionDAG.cpp
>   test/CodeGen/X86/computeKnownBits_urem.ll
>
> Index: lib/CodeGen/SelectionDAG/SelectionDAG.cpp
> ===================================================================
> --- lib/CodeGen/SelectionDAG/SelectionDAG.cpp
> +++ lib/CodeGen/SelectionDAG/SelectionDAG.cpp
> @@ -2185,8 +2185,11 @@
>        const APInt &RA = Rem->getAPIntValue();
>        if (RA.isPowerOf2()) {
>          APInt LowBits = (RA - 1);
> -        KnownZero |= ~LowBits;
> -        computeKnownBits(Op.getOperand(0), KnownZero, KnownOne,Depth+1);
> +        computeKnownBits(Op.getOperand(0), KnownZero2, KnownOne2, Depth + 1);
> +
> +        // The upper bits are all zero, the lower ones are unchanged.
> +        KnownZero = KnownZero2 | ~LowBits;
> +        KnownOne = KnownOne2 & LowBits;
>          break;
>        }
>      }
> Index: test/CodeGen/X86/computeKnownBits_urem.ll
> ===================================================================
> --- test/CodeGen/X86/computeKnownBits_urem.ll
> +++ test/CodeGen/X86/computeKnownBits_urem.ll
> @@ -0,0 +1,14 @@
> +; RUN: llc -debug -mtriple=i386-pc-win32 %s -o /dev/null 2>&1 | FileCheck %s
> +define i32 @main() #0 {
> +entry:
> +  %a = alloca i32, align 4
> +  store i32 1, i32* %a, align 4
> +  %0 = load i32* %a, align 4
> +  %or = or i32 1, %0
> +  %and = and i32 1, %or
> +  %rem = urem i32 %and, 1
> +  %add = add i32 %rem, 1
> +  ret i32 %add
> +}
> +; CHECK: Replacing.2 0x{{[0-9a-f]+}}: i32 = urem 0x{{[0-9a-f]+}}, 0x{{[0-9a-f]+}} [ORD=7]
> +; CHECK: With: 0x{{[0-9a-f]+}}: i32 = Constant<0>