[PATCH] D78206: [Target][ARM] Make Low Overhead Loops coexist with VPT blocks

Pierre van Houtryve via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Apr 15 07:05:21 PDT 2020


Pierre-vh created this revision.
Pierre-vh added reviewers: dmgreen, samparker, SjoerdMeijer.
Herald added subscribers: llvm-commits, danielkiss, hiraditya, kristof.beyls.
Herald added a project: LLVM.
Pierre-vh added a parent revision: D78201: [Target][ARM] Replace outdated getARMVPTBlockMask function.

Previously, the LowOverheadLoops pass couldn't handle VPT blocks that used the `vpt` instruction, or loops containing multiple identical VCTPs.
This patch improves the LowOverheadLoops pass so that it can handle both cases.

I'm still unsure about the changes in this patch, so comments/suggestions are welcome.

This patch also needs a follow-up ARMTargetTransformInfo change to take effect. In its current state, the TTI won't allow the vectorizer to tail-predicate loops that span more than one basic block or that contain compare instructions. Since VPT blocks are generated from comparisons (which create the predicate), such loops currently never reach this pass.

However, with the right changes to the TTI and the right compiler options, this patch makes it possible to generate this kind of code:

  // C++
  void test(int* A, int n, int x) {
    for (int i = 0; i < n; i++)
      if (A[i] < x && A[i] > -x)
        A[i] = 0;
  }
  // assembly
  	dlstp.32	lr, r1
  .LBB0_1:                                @ %vector.body
                                          @ =>This Inner Loop Header: Depth=1
  	vldrw.u32	q1, [r0]
  	vptt.s32	lt, q1, r2
  	vcmpt.s32	gt, q1, r3
  	vstrwt.32	q0, [r0], #16
  	letp	lr, .LBB0_1
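
To make the predication in the loop above easier to follow, here is a rough lane-by-lane model of it in plain C++. This is only a sketch and not part of the patch: it assumes four 32-bit lanes per vector, that r3 holds -x, that q0 holds zeros, and that each predicated compare in the VPT block ANDs its result into the current predicate. The names test_predicated_model, test_scalar and models_agree are made up for illustration.

```cpp
#include <array>
#include <cassert>
#include <vector>

// Scalar reference: the C++ loop from the example above.
void test_scalar(int *A, int n, int x) {
  for (int i = 0; i < n; i++)
    if (A[i] < x && A[i] > -x)
      A[i] = 0;
}

// Hypothetical lane-by-lane model of the tail-predicated loop.
// Assumptions (not taken from the patch): 4 x 32-bit lanes, r3 == -x,
// q0 == 0, and predicated compares AND into the current predicate.
void test_predicated_model(int *A, int n, int x) {
  assert(n >= 0 && "element count must be non-negative");
  constexpr int Lanes = 4;
  const int negX = -x; // stands in for r3

  for (int base = 0; base < n; base += Lanes) {
    const int remaining = n - base;
    std::array<bool, Lanes> pred{};
    std::array<int, Lanes> q1{};

    // Implicit tail predicate from dlstp/letp, plus the predicated load:
    // lanes past the end of the array are inactive and never read.
    for (int l = 0; l < Lanes; l++) {
      pred[l] = l < remaining;
      if (pred[l])
        q1[l] = A[base + l];
    }

    // vptt.s32 lt, q1, r2: keep only active lanes where q1 < x.
    for (int l = 0; l < Lanes; l++)
      pred[l] = pred[l] && q1[l] < x;

    // vcmpt.s32 gt, q1, r3: keep only active lanes where q1 > -x.
    for (int l = 0; l < Lanes; l++)
      pred[l] = pred[l] && q1[l] > negX;

    // vstrwt.32 q0, [r0], #16: store zero to the lanes still active.
    for (int l = 0; l < Lanes; l++)
      if (pred[l])
        A[base + l] = 0;
  }
}

// Runs both versions on copies of the same data and compares the results.
bool models_agree(std::vector<int> data, int x) {
  std::vector<int> expected = data;
  test_scalar(expected.data(), static_cast<int>(expected.size()), x);
  test_predicated_model(data.data(), static_cast<int>(data.size()), x);
  return data == expected;
}
```

Calling models_agree with an element count that is not a multiple of four exercises the tail iteration, where the loop predicate alone disables the trailing lanes.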


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D78206

Files:
  llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp
  llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp
  llvm/test/CodeGen/Thumb2/LowOverheadLoops/cond-vector-reduce-mve-codegen.ll
  llvm/test/CodeGen/Thumb2/LowOverheadLoops/vctp-in-vpt-2.mir
  llvm/test/CodeGen/Thumb2/LowOverheadLoops/vpt-blocks.mir

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D78206.257701.patch
Type: text/x-patch
Size: 23246 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20200415/1878d9fa/attachment.bin>
