[llvm-bugs] [Bug 49347] New: Memory access versioning adds bad(?) runtime predicate to vectorized loop
via llvm-bugs
llvm-bugs at lists.llvm.org
Thu Feb 25 03:30:12 PST 2021
https://bugs.llvm.org/show_bug.cgi?id=49347
Bug ID: 49347
Summary: Memory access versioning adds bad(?) runtime predicate
to vectorized loop
Product: libraries
Version: trunk
Hardware: PC
OS: Linux
Status: NEW
Severity: enhancement
Priority: P
Component: Loop Optimizer
Assignee: unassignedbugs at nondot.org
Reporter: mattias.v.eriksson at ericsson.com
CC: llvm-bugs at lists.llvm.org
Created attachment 24571
--> https://bugs.llvm.org/attachment.cgi?id=24571&action=edit
LV input
With the attached file, loop vectorization adds a runtime check so that the
vectorized loop only runs when numOutputs == 1:
opt -S -o - lv-mav.ll -loop-vectorize -force-vector-width=4
[...]
%ident.check = icmp ne i32 %numOutputs, 1
%10 = or i1 %9, %ident.check
[...]
%17 = or i1 %10, %16
br i1 %17, label %scalar.ph, label %vector.ph
Running the vectorizer without memory access versioning, I get a partially
vectorized loop without the check on numOutputs:
opt -S -o - lv-mav.ll -loop-vectorize -force-vector-width=4 -enable-mem-access-versioning=0
In a performance issue I am looking at in my out-of-tree target, the partially
vectorized loop is faster than the scalar loop, but the check on numOutputs
makes the code always run the scalar loop. The vector code looks better when
numOutputs == 1, but it is worse in practice since the predicate is rarely
fulfilled.
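To illustrate the effect of the guard, here is a minimal C sketch. It is not the attached lv-mav.ll: the loop body, the function names scale_scalar and scale_versioned, and the use of numOutputs as a store stride are all assumptions made for illustration, and the other runtime checks that are or'ed into the branch condition are left out.

```c
/* Hypothetical loop shape (NOT the attached test case): the store index
 * depends on a stride that is only known at run time. */
void scale_scalar(float *out, const float *in, int n, int numOutputs) {
    for (int i = 0; i < n; ++i)
        out[i * numOutputs] = in[i] * 2.0f;
}

/* Rough model of the versioned loop: the unit-stride fast path is only
 * taken when numOutputs == 1; every other value falls back to the
 * scalar loop. */
void scale_versioned(float *out, const float *in, int n, int numOutputs) {
    if (numOutputs != 1) {
        /* Corresponds to %ident.check being true: branch to scalar.ph. */
        scale_scalar(out, in, n, numOutputs);
        return;
    }

    /* Corresponds to vector.ph: stride is known to be 1, so four
     * consecutive elements are processed per iteration (written as an
     * inner loop here in place of real vector instructions). */
    int i = 0;
    for (; i + 4 <= n; i += 4)
        for (int j = 0; j < 4; ++j)
            out[i + j] = in[i + j] * 2.0f;

    /* Scalar remainder for the last n % 4 elements. */
    for (; i < n; ++i)
        out[i] = in[i] * 2.0f;
}
```

Both functions compute the same result for any numOutputs. The point of the sketch is only that callers passing numOutputs != 1 never reach the fast path, which matches the behaviour described above.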
I wonder if what LV does here makes sense in general? Is it a good idea to add
predicates like this and have the more general case only run the scalar version
of the loop?