[llvm] [AArch64][GlobalISel] Legalize Insert vector element (PR #81453)
David Green via llvm-commits
llvm-commits at lists.llvm.org
Fri Mar 8 07:03:44 PST 2024
================
@@ -111,25 +85,51 @@ entry:
}
define <3 x double> @insert_v3f64_c(<3 x double> %a, double %b, i32 %c) {
-; CHECK-LABEL: insert_v3f64_c:
-; CHECK: // %bb.0: // %entry
-; CHECK-NEXT: // kill: def $d0 killed $d0 def $q0
-; CHECK-NEXT: // kill: def $d1 killed $d1 def $q1
-; CHECK-NEXT: // kill: def $w0 killed $w0 def $x0
-; CHECK-NEXT: // kill: def $d2 killed $d2 def $q2
-; CHECK-NEXT: mov v0.d[1], v1.d[0]
-; CHECK-NEXT: stp q0, q2, [sp, #-32]!
-; CHECK-NEXT: .cfi_def_cfa_offset 32
-; CHECK-NEXT: mov x8, sp
-; CHECK-NEXT: and x9, x0, #0x3
-; CHECK-NEXT: str d3, [x8, x9, lsl #3]
-; CHECK-NEXT: ldr q0, [sp]
-; CHECK-NEXT: ldr d2, [sp, #16]
-; CHECK-NEXT: ext v1.16b, v0.16b, v0.16b, #8
-; CHECK-NEXT: // kill: def $d0 killed $d0 killed $q0
-; CHECK-NEXT: // kill: def $d1 killed $d1 killed $q1
-; CHECK-NEXT: add sp, sp, #32
-; CHECK-NEXT: ret
+; CHECK-SD-LABEL: insert_v3f64_c:
+; CHECK-SD: // %bb.0: // %entry
+; CHECK-SD-NEXT: // kill: def $d0 killed $d0 def $q0
+; CHECK-SD-NEXT: // kill: def $d1 killed $d1 def $q1
+; CHECK-SD-NEXT: // kill: def $w0 killed $w0 def $x0
+; CHECK-SD-NEXT: // kill: def $d2 killed $d2 def $q2
+; CHECK-SD-NEXT: mov v0.d[1], v1.d[0]
+; CHECK-SD-NEXT: stp q0, q2, [sp, #-32]!
+; CHECK-SD-NEXT: .cfi_def_cfa_offset 32
+; CHECK-SD-NEXT: mov x8, sp
+; CHECK-SD-NEXT: and x9, x0, #0x3
+; CHECK-SD-NEXT: str d3, [x8, x9, lsl #3]
+; CHECK-SD-NEXT: ldr q0, [sp]
+; CHECK-SD-NEXT: ldr d2, [sp, #16]
+; CHECK-SD-NEXT: ext v1.16b, v0.16b, v0.16b, #8
+; CHECK-SD-NEXT: // kill: def $d0 killed $d0 killed $q0
+; CHECK-SD-NEXT: // kill: def $d1 killed $d1 killed $q1
+; CHECK-SD-NEXT: add sp, sp, #32
+; CHECK-SD-NEXT: ret
+;
+; CHECK-GI-LABEL: insert_v3f64_c:
+; CHECK-GI: // %bb.0: // %entry
+; CHECK-GI-NEXT: stp x29, x30, [sp, #-16]! // 16-byte Folded Spill
+; CHECK-GI-NEXT: sub x9, sp, #48
+; CHECK-GI-NEXT: mov x29, sp
+; CHECK-GI-NEXT: and sp, x9, #0xffffffffffffffe0
+; CHECK-GI-NEXT: .cfi_def_cfa w29, 16
+; CHECK-GI-NEXT: .cfi_offset w30, -8
+; CHECK-GI-NEXT: .cfi_offset w29, -16
+; CHECK-GI-NEXT: // kill: def $d0 killed $d0 def $q0
+; CHECK-GI-NEXT: // kill: def $d1 killed $d1 def $q1
+; CHECK-GI-NEXT: mov w8, w0
+; CHECK-GI-NEXT: mov x9, sp
+; CHECK-GI-NEXT: // kill: def $d2 killed $d2 def $q2
+; CHECK-GI-NEXT: mov v0.d[1], v1.d[0]
+; CHECK-GI-NEXT: and x8, x8, #0x3
+; CHECK-GI-NEXT: stp q0, q2, [sp]
+; CHECK-GI-NEXT: str d3, [x9, x8, lsl #3]
+; CHECK-GI-NEXT: ldp q0, q2, [sp]
+; CHECK-GI-NEXT: // kill: def $d2 killed $d2 killed $q2
+; CHECK-GI-NEXT: mov d1, v0.d[1]
+; CHECK-GI-NEXT: // kill: def $d0 killed $d0 killed $q0
+; CHECK-GI-NEXT: mov sp, x29
+; CHECK-GI-NEXT: ldp x29, x30, [sp], #16 // 16-byte Folded Reload
----------------
davemgreen wrote:
There are often a lot of little differences between SDAG and what is currently implemented in GISel. In this case I think the alignment of the stack temporary is forced to be higher than it needs to be, so the stack gets realigned, because of this code:
```
Align LegalizerHelper::getStackTemporaryAlignment(LLT Ty,
                                                  Align MinAlign) const {
  // FIXME: We're missing a way to go back from LLT to llvm::Type to query the
  // datalayout for the preferred alignment. Also there should be a target hook
  // for this to allow targets to reduce the alignment and ignore the
  // datalayout. e.g. AMDGPU should always use a 4-byte alignment, regardless of
  // the type.
  return std::max(Align(PowerOf2Ceil(Ty.getSizeInBytes())), MinAlign);
}
```
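To make the arithmetic concrete, here is a minimal standalone sketch of that formula using plain byte counts instead of `LLT` and `Align`. The names `powerOf2Ceil`, `stackTemporaryAlignment`, `needsStackRealignment` and `kDefaultStackAlign` are hypothetical stand-ins written for this illustration, not LLVM APIs, and the `MinAlign` of 8 is only an assumed value.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-in for llvm::PowerOf2Ceil: the smallest power of two
// that is >= value (0 maps to 0). Assumes value <= 2^63.
constexpr std::uint64_t powerOf2Ceil(std::uint64_t value) {
  if (value == 0)
    return 0;
  std::uint64_t result = 1;
  while (result < value)
    result <<= 1;
  return result;
}

// Same shape as the quoted return statement, but on raw byte counts.
constexpr std::uint64_t stackTemporaryAlignment(std::uint64_t sizeInBytes,
                                                std::uint64_t minAlign) {
  return std::max(powerOf2Ceil(sizeInBytes), minAlign);
}

// AArch64 keeps SP 16-byte aligned; anything stricter needs realignment.
constexpr std::uint64_t kDefaultStackAlign = 16;

constexpr bool needsStackRealignment(std::uint64_t sizeInBytes,
                                     std::uint64_t minAlign) {
  return stackTemporaryAlignment(sizeInBytes, minAlign) > kDefaultStackAlign;
}

// <3 x double> is 3 * 8 = 24 bytes, which rounds up to a 32-byte alignment.
static_assert(stackTemporaryAlignment(24, 8) == 32,
              "24-byte vector rounds up to 32-byte alignment");
```

Under these assumptions the 24-byte `<3 x double>` temporary asks for 32-byte alignment, which is more than the 16 bytes SP already guarantees. That matches the `and sp, x9, #0xffffffffffffffe0` in the CHECK-GI output, whereas the SDAG output only uses a 32-byte slot at the default alignment.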
https://github.com/llvm/llvm-project/pull/81453