[PATCH] D9804: Optimize scattered vector insert/extract pattern
Ana Pazos via llvm-commits
llvm-commits at lists.llvm.org
Fri Sep 25 16:13:04 PDT 2015
apazos added a subscriber: apazos.
apazos added a comment.
Hi folks,
Just want to clarify where this issue comes from:
1. SROA will replace large allocas with vector SSA values.
E.g., alloca "short a [32]" is rewritten as 4 vectors of type <8 x i16> to avoid the load/stores to the stack-allocated variable.
This results in insert/extract instructions being generated in the IR code.
2. The AArch64 backend is not able to combine scattered loads and stores with the insert/extract instructions to generate scalar/lane-based loads/stores in the presence of extension instructions.
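To make point 1 concrete, here is a minimal C sketch of the kind of source that could lead to this situation. The function name, the `stride` parameter and the loop shape are illustrative assumptions, not taken from the patch, and whether SROA actually rewrites the array into vectors depends on the surrounding passes and on how the array is accessed.

```c
#include <stdint.h>

/* Hypothetical kernel (illustrative only, not from D9804).
 * The local array is the kind of alloca that SROA may split into
 * four <8 x i16> values. Each element comes from a scattered byte
 * load that is zero-extended before being stored, i.e. the
 * load + ext + insert pattern discussed in this comment. */
int32_t sum_widened(const uint8_t *src, int stride) {
    int16_t a[32];                        /* "short a[32]" from point 1 */
    for (int i = 0; i < 32; ++i)
        a[i] = (int16_t)src[i * stride];  /* scattered i8 load, then zext */

    int32_t sum = 0;
    for (int i = 0; i < 32; ++i)
        sum += a[i];
    return sum;
}
```

The caller has to provide at least 31 * stride + 1 readable bytes in `src`.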
Example 1: When there is no extension/truncation of the loaded values, we are fine: the backend generates optimized code.
x = ld
y = insert x v1, 1
Generates:
ld1 { v0.b }[1], [x0]
Example 2: But when extension instructions are present:
x = ld
y = ext x
z = insert y v1, 1
Generates:
ldrb w8, [x0]
ins v0.h[1], w8
However, this is better code:
ld1 { v0.b }[1], [x0]
ushll v0.8h, v0.8b, #0
The benefit becomes clear when there is more than one insert instruction:
ldrb w8, [x0]
ldrb w9, [x1]
ins v0.h[1], w8
ins v0.h[5], w9
Better code would be:
ld1 { v0.b }[1], [x0]
ld1 { v0.b }[5], [x1]
ushll v0.8h, v0.8b, #0
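As a sanity check on the rewrite above, here is a small scalar model in C of the two orderings. It is only a sketch: the struct types and function names are made up for illustration, and it assumes the 16-bit vector being inserted into is itself the zero-extension of an 8-bit vector (otherwise a single widening at the end would not reproduce the untouched lanes).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scalar stand-ins for the NEON registers (hypothetical types). */
typedef struct { uint8_t  lane[8]; } v8u8;
typedef struct { uint16_t lane[8]; } v8u16;

/* Models "ushll vD.8h, vS.8b, #0": zero-extend every byte lane. */
static v8u16 widen(v8u8 v) {
    v8u16 r;
    for (int i = 0; i < 8; ++i)
        r.lane[i] = v.lane[i];
    return r;
}

/* Current code: "ldrb w8, [x]" followed by "ins v0.h[lane], w8". */
static v8u16 insert_after_extend(v8u16 v, unsigned lane, const uint8_t *p) {
    assert(lane < 8);
    v.lane[lane] = (uint16_t)*p;
    return v;
}

/* Proposed code: "ld1 { v0.b }[lane], [x]" on the narrow vector. */
static v8u8 insert_byte(v8u8 v, unsigned lane, const uint8_t *p) {
    assert(lane < 8);
    v.lane[lane] = *p;
    return v;
}

/* Returns 1 when both orderings produce the same 8 x 16-bit result
 * for inserts into lanes 1 and 5. */
int patterns_agree(v8u8 base, const uint8_t *p1, const uint8_t *p5) {
    /* Extend first, then insert each extended scalar. */
    v8u16 a = widen(base);
    a = insert_after_extend(a, 1, p1);
    a = insert_after_extend(a, 5, p5);

    /* Insert the bytes first, then extend once. */
    v8u8 b = insert_byte(base, 1, p1);
    b = insert_byte(b, 5, p5);
    v8u16 bw = widen(b);

    return memcmp(&a, &bw, sizeof a) == 0;
}
```

Under that assumption the single widening replaces one extension per inserted element, which is where the saving comes from.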
The same is true for extract instructions:
umov w8, v0.b[1]
umov w9, v0.b[5]
strh w8, [x0]
strh w9, [x1]
Better code would be:
ushll v0.8h, v0.8b, #0
st1 { v0.h }[1], [x0]
st1 { v0.h }[5], [x1]
Therefore, after SROA we need to detect these patterns in the IR and fix the IR code so the backend can generate the optimized instructions.
This should be done in a target-independent way. Maybe it can be done in InstCombine, or in the SLP vectorizer (as in this patch).
Even though it is SROA that generates the insert/extract instructions, I do not think we should fix it there.
This is the problem Lawrence is trying to solve. Any other suggestion?
Repository:
rL LLVM
http://reviews.llvm.org/D9804