[PATCH] D78994: [Target][ARM] Add a fix for an LSR Pattern that can't be tail-predicated
Pierre van Houtryve via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Apr 28 04:47:21 PDT 2020
Pierre-vh created this revision.
Pierre-vh added reviewers: samparker, dmgreen, SjoerdMeijer.
Herald added subscribers: llvm-commits, danielkiss, hiraditya, kristof.beyls.
Herald added a project: LLVM.
The LSR pass can generate a pattern that prevents tail predication in some cases.
However, fixing LSR directly isn't easy and could break other targets, so we decided to handle it in the MVETailPredication pass instead.
This patch extends the MVETailPredication pass so that it detects the undesirable pattern and rewrites it in a tail-predication-friendly form.
Here is an example of the IR that LSR can generate:
loopbody:
  %lsr.iv = phi i32 [ %lsr.iv.next, %loopbody ], [ %42, %pred ]
  %44 = add i32 %lsr.iv, -4
  %45 = call <4 x i1> @llvm.arm.mve.vctp32(i32 %44) #5
  ; ... etc
  %lsr.iv.next = add nsw i32 %lsr.iv, -4
This loop can't be tail-predicated because the VCTP's operand (%44) is defined inside the loop, so this patch rewrites it as follows:
pred:
  %44 = add i32 %42, -4
loopbody:
  %lsr.iv = phi i32 [ %lsr.iv.next, %loopbody ], [ %42, %pred ]
  %lsr.fixed = phi i32 [ %lsr.iv.next, %loopbody ], [ %44, %pred ]
  %45 = call <4 x i1> @llvm.arm.mve.vctp32(i32 %lsr.fixed)
  ; ... etc
  %lsr.iv.next = add nsw i32 %lsr.iv, -4
That IR is functionally the same as before, but causes no issue for tail-predication.
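A quick way to sanity-check a rewrite like this is to trace the value passed to VCTP over a few iterations. The sketch below is only a scalar model of the two IR snippets as shown in this description, not code from the patch. The function names, the start value standing in for %42, and the backEdgeAdjust parameter are all assumptions made for illustration. With the back edge of %lsr.fixed taken from %lsr.iv.next exactly as shown (backEdgeAdjust = 0), the model agrees with the original pattern on the first iteration only. A hypothetical adjustment of -4 makes the two traces match on every iteration, so the snippet above may be abbreviated relative to what the pass actually emits.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Operand passed to vctp32 on each iteration of the original pattern.
std::vector<int32_t> traceOriginal(int32_t start, int iterations) {
  assert(iterations >= 0);
  std::vector<int32_t> operands;
  int32_t iv = start;            // %lsr.iv, incoming value %42 from %pred
  for (int i = 0; i < iterations; ++i) {
    operands.push_back(iv - 4);  // %44 = add i32 %lsr.iv, -4
    iv -= 4;                     // %lsr.iv.next = add nsw i32 %lsr.iv, -4
  }
  return operands;
}

// Operand passed to vctp32 on each iteration of the rewritten pattern.
// backEdgeAdjust is a hypothetical knob: 0 models the back edge of
// %lsr.fixed exactly as shown (%lsr.iv.next); any other value is an
// assumption and is not taken from the patch.
std::vector<int32_t> traceRewritten(int32_t start, int iterations,
                                    int32_t backEdgeAdjust) {
  assert(iterations >= 0);
  std::vector<int32_t> operands;
  int32_t iv = start;            // %lsr.iv, incoming value %42 from %pred
  int32_t fixed = start - 4;     // %lsr.fixed, incoming value %44 from %pred
  for (int i = 0; i < iterations; ++i) {
    operands.push_back(fixed);        // vctp32(%lsr.fixed)
    int32_t ivNext = iv - 4;          // %lsr.iv.next
    fixed = ivNext + backEdgeAdjust;  // back-edge value of %lsr.fixed
    iv = ivNext;                      // back-edge value of %lsr.iv
  }
  return operands;
}
```

For a start value of 20 and four iterations, traceOriginal yields 16, 12, 8, 4. traceRewritten yields 16, 16, 12, 8 with backEdgeAdjust = 0 and 16, 12, 8, 4 with backEdgeAdjust = -4.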
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D78994
Files:
llvm/lib/Target/ARM/MVETailPredication.cpp
llvm/test/CodeGen/Thumb2/LowOverheadLoops/mve-tp-lsr-patterns.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D78994.260589.patch
Type: text/x-patch
Size: 11640 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20200428/0ed69e19/attachment.bin>