[PATCH] D109631: [HardwareLoops] Loop guard intrinsic to recognise zext

Sherwin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Sep 10 14:03:11 PDT 2021


sherwin-dc created this revision.
sherwin-dc added reviewers: samparker, SjoerdMeijer, dmgreen.
Herald added subscribers: javed.absar, hiraditya, kristof.beyls.
sherwin-dc requested review of this revision.
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

I am working on a downstream custom backend that uses the loop guard intrinsic.

If the loop count is a 32-bit unsigned int in the C source, the hardware-loop pass recognises the loop guard and inserts the llvm.test.set.loop.iterations intrinsic. However, if the count is instead an unsigned short/char, clang inserts a `zext` instruction to widen the loop count to an i32.

This means the loop guard intrinsic is not used in cases like the one shown here: the loop guard compares `%x` with 0, whereas the hardware loop pass expects `%0` (the zero-extended count) to be compared.

  entry:
    %tobool.not5 = icmp eq i8 %x, 0
    %0 = zext i8 %x to i32
    br i1 %tobool.not5, label %while.end, label %while.body.preheader
  
  while.body.preheader:                             ; preds = %entry
    %1 = zext i8 %x to i32
    %scevgep = getelementptr i32, i32* %arr, i32 %1
    call void @llvm.set.loop.iterations.i32(i32 %0)
    br label %while.body
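
Looking through the `zext` is sound because zero-extension preserves zero-ness: an i8 value is 0 exactly when its zero-extension to i32 is 0, so a guard that compares `%x` with 0 tests the same condition as one that compares `%0` with 0. A minimal C++ sketch that checks this exhaustively for 8-bit values follows; the helper names are made up for illustration and are not part of LLVM.

```cpp
#include <cstdint>

// Models `icmp eq i8 %x, 0`.
bool guardOnNarrow(std::uint8_t x) { return x == 0; }

// Models `%0 = zext i8 %x to i32` followed by `icmp eq i32 %0, 0`.
bool guardOnWide(std::uint8_t x) {
  std::uint32_t wide = static_cast<std::uint32_t>(x); // zero-extension
  return wide == 0;
}

// Returns true if both guards agree for every possible i8 value.
bool guardsAgreeForAllI8() {
  for (unsigned v = 0; v <= 0xFF; ++v) {
    std::uint8_t x = static_cast<std::uint8_t>(v);
    if (guardOnNarrow(x) != guardOnWide(x))
      return false;
  }
  return true;
}
```

The same argument applies to an i16 count, since `zext` only prepends zero bits.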

This patch would also benefit other backends that use the loop guard intrinsic. For example, if we have a loop:

  void loop(unsigned *a, unsigned *b, unsigned length)
  {
      for(int i = 0; i < length; i++)
      {
          a[i] = b[i];
      }
  }

and compile it to IR, we get `int_loop.ll`. After changing `unsigned length` to `unsigned short length` and compiling to IR, we get `short_loop.ll`.

F18964591: int_loop.ll <https://reviews.llvm.org/F18964591>
F18964592: short_loop.ll <https://reviews.llvm.org/F18964592>
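
For reference, the `short_loop.ll` variant corresponds to a source like the sketch below. Only the type of `length` differs from the loop above. The function name `loop_short` is an assumption chosen here to keep the two versions apart, and the code is written so that it also compiles as C++.

```cpp
// Same copy loop as above, but the trip count arrives as a 16-bit value.
// The count is widened to 32 bits with a zext, while the loop guard may
// still test the original narrow value against zero.
void loop_short(unsigned *a, unsigned *b, unsigned short length)
{
    for (int i = 0; i < length; i++)
    {
        a[i] = b[i];
    }
}
```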

Using ARM as an example (since it makes use of the hardware loop guard intrinsic), compiling each file with a command of the form `llc -mtriple=thumbv8.1m.main-none-none-eabi -mattr=+lob,+mve.fp int_loop.ll -o int_loop.asm` gives `int_loop.asm` and `short_loop.asm` respectively.

F18964746: int_loop.asm <https://reviews.llvm.org/F18964746>
F18964747: short_loop.asm <https://reviews.llvm.org/F18964747>

The int_loop.ll example compiles to a hardware loop using ARM's WLS instruction, while short_loop.ll does not and instead includes an explicit loop guard. With this patch, the short_loop.ll example also compiles to a WLS loop because the loop guard intrinsic is used.
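
The matching change itself is small. A rough, self-contained model of the idea is sketched below. `Node`, the `make*` helpers, `isCompareZero` and `guardMatchesCount` are hypothetical stand-ins written for this illustration; they are not LLVM's `Value`/`ICmpInst` classes. The model accepts a guard that compares the count with zero and, additionally, one that compares the pre-`zext` operand of the count with zero.

```cpp
// Toy IR node: a constant, an opaque value, a zext of another node, or an
// icmp of two nodes. Illustrative model only, not LLVM's API.
struct Node {
  enum Kind { Constant, Opaque, ZExt, ICmp };
  Kind kind;
  long value;       // Only meaningful for Constant.
  const Node *lhs;  // ZExt source, or first ICmp operand.
  const Node *rhs;  // Second ICmp operand.
};

Node makeConstant(long v) { return Node{Node::Constant, v, nullptr, nullptr}; }
Node makeOpaque() { return Node{Node::Opaque, 0, nullptr, nullptr}; }
Node makeZExt(const Node *src) { return Node{Node::ZExt, 0, src, nullptr}; }
Node makeICmp(const Node *l, const Node *r) {
  return Node{Node::ICmp, 0, l, r};
}

// True if operand `opIdx` of `icmp` is the constant 0 and the other operand
// is exactly `count`. A null `count` never matches.
bool isCompareZero(const Node &icmp, const Node *count, unsigned opIdx) {
  const Node *ops[2] = {icmp.lhs, icmp.rhs};
  const Node *c = ops[opIdx];
  return c != nullptr && c->kind == Node::Constant && c->value == 0 &&
         count != nullptr && ops[opIdx ^ 1] == count;
}

// Accepts a guard on `count` itself or, if `count` is a zext, on its source.
bool guardMatchesCount(const Node &icmp, const Node *count) {
  const Node *countBeforeZExt =
      (count != nullptr && count->kind == Node::ZExt) ? count->lhs : nullptr;
  return isCompareZero(icmp, count, 0) || isCompareZero(icmp, count, 1) ||
         isCompareZero(icmp, countBeforeZExt, 0) ||
         isCompareZero(icmp, countBeforeZExt, 1);
}
```

In this model, with `wide = makeZExt(&x)`, both `makeICmp(&x, &zero)` and `makeICmp(&wide, &zero)` are accepted as guards for `wide`, while a compare on an unrelated value is rejected.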


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D109631

Files:
  llvm/lib/CodeGen/HardwareLoops.cpp


Index: llvm/lib/CodeGen/HardwareLoops.cpp
===================================================================
--- llvm/lib/CodeGen/HardwareLoops.cpp
+++ llvm/lib/CodeGen/HardwareLoops.cpp
@@ -365,7 +365,13 @@
     return false;
   };
 
-  if (!IsCompareZero(ICmp, Count, 0) && !IsCompareZero(ICmp, Count, 1))
+  // Check if Count is a zext.
+  Value *CountBefZext =
+      isa<ZExtInst>(Count) ? cast<ZExtInst>(Count)->getOperand(0) : nullptr;
+
+  if (!IsCompareZero(ICmp, Count, 0) && !IsCompareZero(ICmp, Count, 1) &&
+      !IsCompareZero(ICmp, CountBefZext, 0) &&
+      !IsCompareZero(ICmp, CountBefZext, 1))
     return false;
 
   unsigned SuccIdx = ICmp->getPredicate() == ICmpInst::ICMP_NE ? 0 : 1;



