[llvm-bugs] [Bug 44655] New: vector load and store instructions (LD4, ST4) slow execution performance
via llvm-bugs
llvm-bugs at lists.llvm.org
Fri Jan 24 16:33:50 PST 2020
https://bugs.llvm.org/show_bug.cgi?id=44655
Bug ID: 44655
Summary: vector load and store instructions (LD4, ST4) slow
execution performance
Product: libraries
Version: 9.0
Hardware: PC
OS: Linux
Status: NEW
Keywords: performance
Severity: enhancement
Priority: P
Component: Backend: AArch64
Assignee: unassignedbugs at nondot.org
Reporter: sbiersdorff at nvidia.com
CC: arnaud.degrandmaison at arm.com,
llvm-bugs at lists.llvm.org, peter.smith at linaro.org,
Ties.Stuij at arm.com
Created attachment 23061
--> https://bugs.llvm.org/attachment.cgi?id=23061&action=edit
LL file snippet
The following generated assembly takes twice as long to execute as a
version that only loads registers in pairs (or one by one):
1303 │220: ld4 {v2.2d-v5.2d}, [x13], #64
4888 │ ld4 {v16.2d-v19.2d}, [x14]
20143 │ fmla v16.2d, v2.2d, v1.2d
68 │ fmla v17.2d, v3.2d, v1.2d
1071 │ fmla v18.2d, v4.2d, v1.2d
293 │ fmla v19.2d, v5.2d, v1.2d
4524 │ st4 {v16.2d-v19.2d}, [x14], #64
15579 │ subs x15, x15, #0x2
11 │ ↑ b.ne 220
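For readers unfamiliar with LD4/ST4, the following is a minimal C sketch of what
the .2d forms do to memory layout. It is a model only, not the compiler's or the
hardware's implementation, and the function names are hypothetical. LD4 reads 8
consecutive doubles and de-interleaves them across 4 registers; ST4 performs the
inverse, so an element-wise operation between the two yields the same memory
contents as plain contiguous loads and stores would.

```c
#include <stddef.h>

/* Model of LD4 {v0.2d-v3.2d}, [mem]: 8 consecutive doubles are treated as
 * two 4-element structures. Register j receives element j of each
 * structure, i.e. regs[j][lane] = mem[4 * lane + j].
 * (Hypothetical helper name; illustration only.) */
void ld4_2d_model(const double mem[8], double regs[4][2])
{
    for (size_t lane = 0; lane < 2; ++lane)
        for (size_t j = 0; j < 4; ++j)
            regs[j][lane] = mem[4 * lane + j];
}

/* Model of ST4 {v0.2d-v3.2d}, [mem]: the inverse permutation, which
 * re-interleaves the four registers back into memory order. */
void st4_2d_model(double regs[4][2], double mem[8])
{
    for (size_t lane = 0; lane < 2; ++lane)
        for (size_t j = 0; j < 4; ++j)
            mem[4 * lane + j] = regs[j][lane];
}
```

Because the same multiplier (v1 in the listing) is applied to every lane, the
permutation has no effect on the result here.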
It is much better to load the registers in pairs (even though that results in
more instructions being executed):
487 │234: ldp q2, q3, [x12, #32]
1106 │ ldp q4, q5, [x12], #64
2694 │ ldp q6, q7, [x13, #32]
2898 │ ldp q16, q17, [x13]
3847 │ subs x14, x14, #0x2
5440 │ fmla v6.2d, v2.2d, v1.2d
1689 │ fmla v16.2d, v4.2d, v1.2d
3530 │ fmla v17.2d, v5.2d, v1.2d
1315 │ fmla v7.2d, v3.2d, v1.2d
135 │ stp q6, q7, [x13, #32]
865 │ stp q16, q17, [x13], #64
2649 │ ↑ b.ne 234
This assembly was generated from a simple DAXPY loop unrolled by a factor
of 4. A snippet of the .ll file is attached.
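The original source is not included in this report. As a rough illustration
only, a DAXPY loop (y[i] += a * x[i]) manually unrolled by 4 might look like the
C sketch below. The function name, the signature, and the requirement that n be
a multiple of 4 are all assumptions made for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reconstruction of a DAXPY kernel unrolled by a factor of 4.
 * Assumes n is a multiple of 4 and that x and y do not overlap. */
void daxpy_unrolled4(size_t n, double a, const double *x, double *y)
{
    assert(n % 4 == 0); /* assumption: no remainder loop in this sketch */
    for (size_t i = 0; i < n; i += 4) {
        y[i]     += a * x[i];
        y[i + 1] += a * x[i + 1];
        y[i + 2] += a * x[i + 2];
        y[i + 3] += a * x[i + 3];
    }
}
```

Each iteration touches four consecutive elements of x and y, which is the kind
of access pattern a compiler may recognise as a group of four and lower to
LD4/ST4.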
Two questions. First, the slow code is only generated when opt is passed '-O2';
which pass could be responsible for vectorizing these loads and stores? Second,
what is the rationale for generating LD4/ST4 instructions if they execute so
much slower than their scalar equivalents?