[LLVMbugs] [Bug 2247] New: llvm needs an 'exact divide' flag/instruction

bugzilla-daemon at cs.uiuc.edu
Mon Apr 21 21:32:30 PDT 2008


http://llvm.org/bugs/show_bug.cgi?id=2247

           Summary: llvm needs an 'exact divide' flag/instruction
           Product: libraries
           Version: 1.0
          Platform: PC
        OS/Version: All
            Status: NEW
          Keywords: code-quality, new-feature
          Severity: enhancement
          Priority: P2
         Component: Core LLVM classes
        AssignedTo: unassignedbugs at nondot.org
        ReportedBy: sabre at nondot.org
                CC: llvmbugs at cs.uiuc.edu


We need to add a flag of some sort to the sdiv instruction, indicating that the
division is exact: the remainder is known to be zero.  This allows us to turn
things like "x /s 4" into "x >>s 2", and it would let us eliminate the
front-end's special-case handling of pointer subtraction (which detects
power-of-two element sizes and strength-reduces the divide itself), moving
that logic into the optimizer where it belongs.
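
For illustration, here is a minimal sketch of what the flag might look like in
the IR; the 'exact' spelling is hypothetical here, not something this report
specifies:

define i32 @div4(i32 %x) nounwind {
        ; the remainder of %x / 4 is known to be zero
        %q = sdiv exact i32 %x, 4
        ret i32 %q
}
; With the flag, instcombine could safely strength-reduce the divide to
;       %q = ashr i32 %x, 2
; Without it, that rewrite is wrong for negative inputs: -2 sdiv 4 is 0
; (rounds toward zero), but -2 ashr 2 is -1 (rounds toward -inf).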

In addition to cleaning up the front-ends, this would also improve
optimization.  For example, consider these two functions with identical
semantics:

int test(double *s, double *e) {
  return (e-s) > 1;
}
int test2(double *s, double *e) {
  long diff = e-s;
  return diff > 1;
}

Pointer subtraction is codegen'd as the integer difference between the two
pointers, signed-divided by the element size.
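
In IR terms, for double (element size 8) and the same 32-bit pointer width
used in the listings below, the front-end conceptually produces something
like this sketch before any strength reduction (today it special-cases the
power-of-two size and emits the ashr seen below itself):

define i32 @ptrdiff(double* %s, double* %e) nounwind {
        %e.i = ptrtoint double* %e to i32
        %s.i = ptrtoint double* %s to i32
        %diff = sub i32 %e.i, %s.i
        ; divide by sizeof(double); guaranteed exact for pointers
        ; into the same object
        %elems = sdiv i32 %diff, 8
        ret i32 %elems
}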

Amusingly, GCC and llvm-gcc generate exactly the same code as each other for
these two functions (instruction for instruction).  We produce:

_test:
        movl    8(%esp), %eax
        subl    4(%esp), %eax
        cmpl    $15, %eax
        setg    %al
        movzbl  %al, %eax
        ret
_test2:
        movl    8(%esp), %eax
        subl    4(%esp), %eax
        sarl    $3, %eax
        cmpl    $1, %eax
        setg    %al
        movzbl  %al, %eax
        ret

Note that the first function has significantly better codegen.  This is the IR we produce:

define i32 @test(double* %s, double* %e) nounwind  {
        %tmp12 = ptrtoint double* %e to i32             ; <i32> [#uses=1]
        %tmp34 = ptrtoint double* %s to i32             ; <i32> [#uses=1]
        %tmp5 = sub i32 %tmp12, %tmp34          ; <i32> [#uses=1]
        %tmp6 = icmp sgt i32 %tmp5, 15          ; <i1> [#uses=1]
        %tmp67 = zext i1 %tmp6 to i32           ; <i32> [#uses=1]
        ret i32 %tmp67
}
define i32 @test2(double* %s, double* %e) nounwind  {
        %tmp12 = ptrtoint double* %e to i32             ; <i32> [#uses=1]
        %tmp34 = ptrtoint double* %s to i32             ; <i32> [#uses=1]
        %tmp5 = sub i32 %tmp12, %tmp34          ; <i32> [#uses=1]
        %tmp6 = ashr i32 %tmp5, 3               ; <i32> [#uses=1]
        %tmp8 = icmp sgt i32 %tmp6, 1           ; <i1> [#uses=1]
        %tmp89 = zext i1 %tmp8 to i32           ; <i32> [#uses=1]
        ret i32 %tmp89
}

The critical difference here is that GCC's fold() routine merges the divide
(a.k.a. the ashr) into the comparison in the first case.  GCC manages this
because fold() operates on per-statement expression trees: in test the divide
and the compare are one expression, but fold() cannot see across statements,
so it misses the same opportunity in test2.

In this case, the transformation is *not safe* to perform for arbitrary
pointers and an arbitrary ashr in LLVM IR.  We have to know that the divide
(in this case by 8) is always exact -- which is guaranteed for pointer
arithmetic -- to know that the xform is safe to perform on LLVM IR.  Knowing
this, we could handle both cases uniformly and remove the reliance on the GCC
front-end to do this xform.
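
Concretely, here is a sketch of the fold the flag would unlock (again using
the hypothetical 'exact' spelling):

define i1 @cmp(i32 %diff) nounwind {
        ; %diff is a pointer difference, an exact multiple of 8
        %elems = sdiv exact i32 %diff, 8
        %res = icmp sgt i32 %elems, 1
        ret i1 %res
}
; Since %diff = 8*k exactly, "k > 1" is the same as "%diff > 15", so the
; divide folds into the compare:
;       %res = icmp sgt i32 %diff, 15
; Without the exactness guarantee this kind of fold can be wrong: for
; example "(x ashr 3) == 1" holds for every x in [8,15], so it cannot be
; rewritten as "x == 8" for arbitrary x.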

This optimization is important because the GCC front-end misses many other
cases as well, such as when the pointers are wrapped in iterators.  Getting
those requires SROA to run before the pointer subtraction is even exposed,
which can only happen in the optimizer.

-Chris

