[LLVMbugs] [Bug 2247] New: llvm needs an 'exact divide' flag/instruction
bugzilla-daemon at cs.uiuc.edu
Mon Apr 21 21:32:30 PDT 2008
http://llvm.org/bugs/show_bug.cgi?id=2247
Summary: llvm needs an 'exact divide' flag/instruction
Product: libraries
Version: 1.0
Platform: PC
OS/Version: All
Status: NEW
Keywords: code-quality, new-feature
Severity: enhancement
Priority: P2
Component: Core LLVM classes
AssignedTo: unassignedbugs at nondot.org
ReportedBy: sabre at nondot.org
CC: llvmbugs at cs.uiuc.edu
We need to add a flag of some sort to the sdiv instruction, indicating that the
division is exact: there is known to be no remainder. This would allow us to
turn things like "x /s 4" into "x >>s 2", for example, and would let us
eliminate the front-ends' special-case handling of pointer subtraction (which
detects power-of-two element sizes and handles them separately) by moving that
logic into the optimizer, where it belongs.
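To sketch what this could look like (the 'exact' keyword below is just an
assumed spelling for the proposed flag, not existing syntax), an exact signed
divide by a power of two could be strength-reduced to an arithmetic shift
unconditionally:

define i32 @div4(i32 %x) nounwind {
        ; hypothetical 'exact' flag: asserts the division leaves no remainder
        %d = sdiv exact i32 %x, 4
        ret i32 %d
}

; ...which the optimizer could then rewrite as:

define i32 @div4_shifted(i32 %x) nounwind {
        %d = ashr i32 %x, 2             ; "x /s 4" becomes "x >>s 2"
        ret i32 %d
}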
In addition to cleaning up the front-ends, this would also improve
optimization. For example, consider these two functions with identical
semantics:
int test(double *s, double *e) {
  return (e-s) > 1;
}

int test2(double *s, double *e) {
  long diff = e-s;
  return diff > 1;
}
Pointer subtraction is codegen'd as the difference between the two pointers
with a signed divide by the element size.
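Concretely, for double* the front-end output looks something like this (a
sketch with made-up value names, before any optimization):

define i32 @ptrdiff(double* %s, double* %e) nounwind {
        %e.i = ptrtoint double* %e to i32
        %s.i = ptrtoint double* %s to i32
        %diff = sub i32 %e.i, %s.i
        ; divide by sizeof(double); always exact for valid pointer subtraction
        %elts = sdiv i32 %diff, 8
        ret i32 %elts
}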
Amusingly, GCC and llvm-gcc generate exactly the same code as each other for
these two functions (instruction for instruction). We produce:
_test:
        movl    8(%esp), %eax
        subl    4(%esp), %eax
        cmpl    $15, %eax
        setg    %al
        movzbl  %al, %eax
        ret

_test2:
        movl    8(%esp), %eax
        subl    4(%esp), %eax
        sarl    $3, %eax
        cmpl    $1, %eax
        setg    %al
        movzbl  %al, %eax
        ret
Note that the first function gets significantly better codegen. The difference comes from this IR:
define i32 @test(double* %s, double* %e) nounwind {
        %tmp12 = ptrtoint double* %e to i32     ; <i32> [#uses=1]
        %tmp34 = ptrtoint double* %s to i32     ; <i32> [#uses=1]
        %tmp5 = sub i32 %tmp12, %tmp34          ; <i32> [#uses=1]
        %tmp6 = icmp sgt i32 %tmp5, 15          ; <i1> [#uses=1]
        %tmp67 = zext i1 %tmp6 to i32           ; <i32> [#uses=1]
        ret i32 %tmp67
}

define i32 @test2(double* %s, double* %e) nounwind {
        %tmp12 = ptrtoint double* %e to i32     ; <i32> [#uses=1]
        %tmp34 = ptrtoint double* %s to i32     ; <i32> [#uses=1]
        %tmp5 = sub i32 %tmp12, %tmp34          ; <i32> [#uses=1]
        %tmp6 = ashr i32 %tmp5, 3               ; <i32> [#uses=1]
        %tmp8 = icmp sgt i32 %tmp6, 1           ; <i1> [#uses=1]
        %tmp89 = zext i1 %tmp8 to i32           ; <i32> [#uses=1]
        ret i32 %tmp89
}
The critical difference here is that GCC's fold() routine merges the divide
(a.k.a. the ashr) into the comparison in the first case. GCC catches this
because fold() operates on expression trees: it sees the divide and the compare
together when they belong to one expression, but it doesn't see across
statements, which is why the second function misses out.
In LLVM IR, this transformation is *not safe* to perform for an arbitrary value
and an arbitrary ashr. We have to know that the divide (in this case, by 8) is
always exact, which is guaranteed for pointer subtraction, to know that the
xform is safe to perform on LLVM IR. With that knowledge, we could handle both
cases uniformly and remove the reliance on the GCC front-end doing this xform.
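To make that concrete, here is roughly how test2 could look with the proposed
flag, and the fold it would enable (again, the 'exact' spelling and the folded
form are illustrations of the proposal, not current behavior):

define i32 @test2_exact(double* %s, double* %e) nounwind {
        %e.i = ptrtoint double* %e to i32
        %s.i = ptrtoint double* %s to i32
        %diff = sub i32 %e.i, %s.i
        ; 'exact' says %diff is a multiple of 8, so the shift loses nothing
        %elts = ashr exact i32 %diff, 3
        %cmp = icmp sgt i32 %elts, 1
        %res = zext i1 %cmp to i32
        ret i32 %res
}

; Because the shift is exact, "(%diff >>s 3) >s 1" is equivalent to
; "%diff >s 15", so the optimizer could fold the shift away and produce
; exactly the IR shown above for @test.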
This optimization is important because the GCC front-end also misses many other
cases, such as when the pointers are wrapped in iterators. Catching those cases
requires SROA to run before the pointer subtraction is exposed.
-Chris