[LLVMdev] SelectionDAGBuilder doing bad things on certain architectures
Villmow, Micah
Micah.Villmow at amd.com
Thu Jul 22 15:05:43 PDT 2010
The SelectionDAG builder has an 'optimization' in its visitBr function that makes assumptions which are not valid on all architectures. The problem is this.
The following function
kernel void cf_test(global int* a, int b, int c, int e)
{
    int d = 0;
    if (!b && c < e) {
        d = a + b;
    }
    *a = d;
}
is transformed into something equivalent to this:
kernel void cf_test(global int* a, int b, int c, int e)
{
    int d;
    if (b) {
        d = 0;
    } else {
        if (c < e) {
            d = a + b;
        } else {
            d = 0;
        }
    }
    *a = d;
}
by the visitBr code found in SelectionDAGBuilder::visitBr():1188.
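To make the difference concrete, here is a rough host-side C++ sketch of the three shapes involved: the source as written, the split two-branch form, and a flattened form that evaluates both comparisons and combines them with a bitwise AND. It is only an illustration, not the code LLVM emits. The function names, the dropped kernel/global qualifiers, and reading `a + b` as `*a + b` are all assumptions made so the example compiles.

```cpp
#include <cassert>

// Host-side stand-ins for the OpenCL kernel. Assumptions: the `kernel` and
// `global` qualifiers are dropped, and `a + b` is read as `*a + b`.

// Form 1: the source as written, with one short-circuit condition.
void cf_test_source(int* a, int b, int c, int e) {
    int d = 0;
    if (!b && c < e) {
        d = *a + b;
    }
    *a = d;
}

// Form 2: the split shape, with one branch per sub-condition.
void cf_test_split(int* a, int b, int c, int e) {
    int d;
    if (b) {
        d = 0;
    } else {
        if (c < e) {
            d = *a + b;
        } else {
            d = 0;
        }
    }
    *a = d;
}

// Form 3: both comparisons are evaluated unconditionally and combined with
// a bitwise AND, leaving a single condition to branch (or select) on.
void cf_test_flat(int* a, int b, int c, int e) {
    int not_b = (b == 0);
    int lt = (c < e);
    int take = not_b & lt;
    int d = take ? *a + b : 0;
    *a = d;
}

// Returns true if all three forms store the same value for these inputs.
bool forms_agree(int a0, int b, int c, int e) {
    int x = a0, y = a0, z = a0;
    cf_test_source(&x, b, c, e);
    cf_test_split(&y, b, c, e);
    cf_test_flat(&z, b, c, e);
    return x == y && y == z;
}

// Checks agreement over a small range of inputs.
void check_small_range() {
    for (int b = -2; b <= 2; ++b)
        for (int c = -2; c <= 2; ++c)
            for (int e = -2; e <= 2; ++e)
                assert(forms_agree(10, b, c, e));
}
```

All three forms compute the same result; they differ only in how many conditional jumps are needed. On a target where a jump costs far more than a bit operation, the third shape is the one worth keeping.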
However, if jumps are expensive, or if jumps are not supported and high-level flow control has to be reconstructed, this is extremely inefficient. For example, on AMD GPUs a single flow control instruction can take 40 cycles to execute, but a bit instruction can be executed every cycle. So the assumptions made by this block of code are clearly inefficient on AMD hardware. Increasing control flow has a direct impact on performance, and removing the extra 'and' or 'or' in order to short-circuit the conditional evaluation does not work for our target.
So, to make this type of transformation rely more on target-specific information, I've added a new Boolean called JumpIsExpensive to the TargetLoweringInfo class, along with accessor functions.
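As a rough picture of the idea (not the attached patch itself), the flag and its accessors might look something like the following standalone C++ sketch. Only the name JumpIsExpensive comes from the description above. The class name TargetLoweringSketch, the accessor names, the default value, and the shouldSplitBranchCondition helper are hypothetical and chosen purely for illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for the target lowering class; not LLVM's real class.
class TargetLoweringSketch {
public:
    // Accessor names are assumptions; the post only says accessors exist.
    bool isJumpExpensive() const { return JumpIsExpensive; }
    void setJumpIsExpensive(bool isExpensive = true) {
        JumpIsExpensive = isExpensive;
    }

private:
    // Assumed default: jumps are cheap, so existing targets are unaffected.
    bool JumpIsExpensive = false;
};

// Hypothetical guard that a visitBr-style routine could consult before
// splitting an 'and'/'or' condition into several conditional branches.
bool shouldSplitBranchCondition(const TargetLoweringSketch& TLI) {
    return !TLI.isJumpExpensive();
}

// Small self-check: a default target splits, a GPU-like target does not.
void checkSketch() {
    TargetLoweringSketch cpuLike;
    assert(shouldSplitBranchCondition(cpuLike));

    TargetLoweringSketch gpuLike;
    gpuLike.setJumpIsExpensive();
    assert(!shouldSplitBranchCondition(gpuLike));
}
```

A target such as ours would set the flag once when its lowering object is constructed, and every other target would keep the current behaviour.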
Please review the patch and apply if acceptable.
Thanks,
Micah
-------------- next part --------------
A non-text attachment was scrubbed...
Name: jump_boolean.patch
Type: application/octet-stream
Size: 3333 bytes
Desc: jump_boolean.patch
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20100722/4c120326/attachment.obj>