[PATCH] D109631: [HardwareLoops] Loop guard intrinsic to recognise zext
Sherwin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Sep 10 14:03:11 PDT 2021
sherwin-dc created this revision.
sherwin-dc added reviewers: samparker, SjoerdMeijer, dmgreen.
Herald added subscribers: javed.absar, hiraditya, kristof.beyls.
sherwin-dc requested review of this revision.
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.
I am working on a downstream custom backend that uses the loop guard intrinsic.
If a loop count is represented by a 32-bit unsigned int in C, the hardware-loop pass can recognise the loop guard and insert the llvm.test.set.loop.iterations intrinsic. However, if the count is instead an unsigned short or unsigned char, clang inserts a `zext` instruction to extend the loop count to an i32.
As a result, the loop guard intrinsic is not used in the example shown here: the loop guard compares `%x` with 0, whereas the hardware-loop pass expects `%0` (the zero-extended count) to be compared.
entry:
  %tobool.not5 = icmp eq i8 %x, 0
  %0 = zext i8 %x to i32
  br i1 %tobool.not5, label %while.end, label %while.body.preheader

while.body.preheader:                             ; preds = %entry
  %1 = zext i8 %x to i32
  %scevgep = getelementptr i32, i32* %arr, i32 %1
  call void @llvm.set.loop.iterations.i32(i32 %0)
  br label %while.body
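
The mismatch above, and the look-through that the patch adds, can be sketched with a toy model. This is only an illustration: `ToyValue`, `ToyICmp`, `isCompareZero` and `guardMatchesCount` are hypothetical stand-ins written for this example, not LLVM classes or the actual code in HardwareLoops.cpp, and the behaviour of the zero-compare helper is assumed from the diff further down.

```cpp
#include <cassert>

// Hypothetical, heavily simplified stand-ins for IR values. These are NOT
// the real llvm::Value / llvm::ZExtInst / llvm::ICmpInst classes.
struct ToyValue {
  bool IsZExt = false;              // true if this value models "zext Source"
  const ToyValue *Source = nullptr; // operand of the modelled zext, if any
  bool IsZeroConst = false;         // true if this value models constant 0
};

struct ToyICmp {
  const ToyValue *Ops[2]; // the two compared operands
};

// Assumed shape of the zero-compare check: operand OpIdx is constant zero
// and the other operand is exactly V.
bool isCompareZero(const ToyICmp &Cmp, const ToyValue *V, unsigned OpIdx) {
  if (!V)
    return false;
  return Cmp.Ops[OpIdx]->IsZeroConst && Cmp.Ops[OpIdx ^ 1] == V;
}

// Accept a guard that tests either the count itself or, if the count is a
// zext, the narrow value that feeds the zext.
bool guardMatchesCount(const ToyICmp &Cmp, const ToyValue *Count) {
  assert(Count && "expected a loop count");
  const ToyValue *BeforeZExt = Count->IsZExt ? Count->Source : nullptr;
  return isCompareZero(Cmp, Count, 0) || isCompareZero(Cmp, Count, 1) ||
         isCompareZero(Cmp, BeforeZExt, 0) || isCompareZero(Cmp, BeforeZExt, 1);
}
```

In this model, a guard of the form `icmp eq %x, 0` does not match a count that is `zext %x` when only the count itself is checked, but it does match once the zext source is also considered.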
This patch would also benefit other backends that make use of the loop guard intrinsic. For example, if we have a loop:
void loop(unsigned *a, unsigned *b, unsigned length)
{
  for(int i = 0; i < length; i++)
  {
    a[i] = b[i];
  }
}
and compile it to IR we get `int_loop.ll`. After changing `unsigned length` to `unsigned short length` and compiling to IR we have `short_loop.ll`.
F18964591: int_loop.ll <https://reviews.llvm.org/F18964591>
F18964592: short_loop.ll <https://reviews.llvm.org/F18964592>
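
For reference, the two source variants described above can be placed side by side. This is a minimal sketch: the names `loop_int`, `loop_short` and `check_variants_agree` are made up for this example (the original function is simply called `loop`), and the self-check only shows that both variants copy the same elements; it says nothing about the generated IR.

```cpp
#include <cassert>

// Variant with a 32-bit trip count, as in the loop shown above.
void loop_int(unsigned *a, unsigned *b, unsigned length)
{
  for (int i = 0; i < length; i++)
  {
    a[i] = b[i];
  }
}

// Same loop with the trip count narrowed to unsigned short. Only the
// parameter type differs from loop_int.
void loop_short(unsigned *a, unsigned *b, unsigned short length)
{
  for (int i = 0; i < length; i++)
  {
    a[i] = b[i];
  }
}

// Hypothetical self-check: both variants copy the first three elements and
// leave the fourth untouched.
void check_variants_agree()
{
  unsigned src[4] = {1, 2, 3, 4};
  unsigned dst_int[4] = {0, 0, 0, 0};
  unsigned dst_short[4] = {0, 0, 0, 0};
  loop_int(dst_int, src, 3);
  loop_short(dst_short, src, 3);
  for (int i = 0; i < 4; i++)
  {
    assert(dst_int[i] == dst_short[i]);
  }
}
```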
Using ARM as an example (since it makes use of the hardware loop guard intrinsic), compiling each file with a command such as `llc -mtriple=thumbv8.1m.main-none-none-eabi -mattr=+lob,+mve.fp int_loop.ll -o int_loop.asm` gives `int_loop.asm` and `short_loop.asm` respectively.
F18964746: int_loop.asm <https://reviews.llvm.org/F18964746>
F18964747: short_loop.asm <https://reviews.llvm.org/F18964747>
The int_loop.ll example compiles to a hardware loop with ARM's WLS instruction, while short_loop.ll does not and includes a loop guard. With this patch the short_loop.ll example also compiles to a WLS loop since the loop guard intrinsic is used.
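
Treating a guard on the narrow value as a guard on the extended count relies on zero-extension preserving the "is zero" property. The following sketch checks that exhaustively for 16-bit values; the function name `checkZExtPreservesZeroTest` is made up for this example, and the unsigned widening conversion is used here as a stand-in for the IR `zext`.

```cpp
#include <cassert>
#include <cstdint>

// Checks, for every 16-bit value, that the narrow value is zero exactly
// when its zero-extended 32-bit form is zero. Returns the number of values
// that were checked.
unsigned checkZExtPreservesZeroTest() {
  unsigned Checked = 0;
  for (std::uint32_t V = 0; V <= 0xFFFFu; ++V) {
    std::uint16_t Narrow = static_cast<std::uint16_t>(V);
    std::uint32_t Wide = Narrow; // unsigned widening, i.e. a zero-extension
    assert((Narrow == 0) == (Wide == 0));
    ++Checked;
  }
  return Checked;
}
```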
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D109631
Files:
llvm/lib/CodeGen/HardwareLoops.cpp
Index: llvm/lib/CodeGen/HardwareLoops.cpp
===================================================================
--- llvm/lib/CodeGen/HardwareLoops.cpp
+++ llvm/lib/CodeGen/HardwareLoops.cpp
@@ -365,7 +365,13 @@
return false;
};
- if (!IsCompareZero(ICmp, Count, 0) && !IsCompareZero(ICmp, Count, 1))
+ // Check if Count is a zext.
+ Value *CountBefZext =
+ isa<ZExtInst>(Count) ? cast<ZExtInst>(Count)->getOperand(0) : nullptr;
+
+ if (!IsCompareZero(ICmp, Count, 0) && !IsCompareZero(ICmp, Count, 1) &&
+ !IsCompareZero(ICmp, CountBefZext, 0) &&
+ !IsCompareZero(ICmp, CountBefZext, 1))
return false;
unsigned SuccIdx = ICmp->getPredicate() == ICmpInst::ICMP_NE ? 0 : 1;
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D109631.371994.patch
Type: text/x-patch
Size: 709 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20210910/6908fc58/attachment.bin>