[llvm] [X86] shouldReduceLoadWidth - don't split loads if ANY uses are a extract+store or a full width legal binop (PR #129695)

Simon Pilgrim via llvm-commits llvm-commits at lists.llvm.org
Thu Mar 6 01:01:40 PST 2025


================
@@ -4228,7 +4228,7 @@ define <4 x float> @uitofp_load_4i64_to_4f32(ptr%a) {
 ; AVX1:       # %bb.0:
 ; AVX1-NEXT:    vmovdqa (%rdi), %ymm0
 ; AVX1-NEXT:    vpsrlq $1, %xmm0, %xmm1
-; AVX1-NEXT:    vmovdqa 16(%rdi), %xmm2
+; AVX1-NEXT:    vextractf128 $1, %ymm0, %xmm2
----------------
RKSimon wrote:

Unless we resort to a tuning flag I'm not sure how we can make that kind of decision in the DAG.

What makes this more difficult is that shouldReduceLoadWidth doesn't tell us WHERE in the original load the sub-component would be extracted from. We don't want to bother loading a duplicate xmm/ymm/zmm from the same ptr, but it might be worth it for an offset load (especially if it's likely to fold into another instruction). I'll investigate altering the shouldReduceLoadWidth callback to see if this helps.
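
To make the distinction concrete, here is a rough standalone sketch of the kind of decision an offset-aware callback could make. It is not the LLVM hook or its signature: `NarrowLoadRequest`, its fields and `shouldNarrowLoadSketch` are hypothetical names, and the "no other uses" shortcut is a simplifying assumption of the sketch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of a narrowing request. These names are
// illustrative only and do not exist in LLVM.
struct NarrowLoadRequest {
  unsigned WideBits;         // width of the original load, e.g. 256 (ymm)
  unsigned NarrowBits;       // width of the requested piece, e.g. 128 (xmm)
  uint64_t ByteOffset;       // where the piece starts inside the wide load
  bool WideLoadHasOtherUses; // the wide load stays live whatever we decide
};

// Returns true if emitting a separate, narrower load looks worthwhile.
inline bool shouldNarrowLoadSketch(const NarrowLoadRequest &R) {
  assert(R.NarrowBits < R.WideBits && "piece must be smaller than the load");
  assert(R.ByteOffset * 8 + R.NarrowBits <= R.WideBits && "piece out of range");

  // Assumption: if nothing else needs the wide load, narrowing just
  // shrinks it, so there is nothing to duplicate.
  if (!R.WideLoadHasOtherUses)
    return true;

  // Offset 0: the piece is the low part of a register we already have,
  // so a second load from the same pointer would be a duplicate.
  if (R.ByteOffset == 0)
    return false;

  // Non-zero offset: a separate load may fold into its user as a memory
  // operand, which can be preferable to an explicit subvector extract.
  return true;
}
```

With these assumptions, the `16(%rdi)` case in the hunk above corresponds to `{256, 128, 16, true}`, which the sketch keeps as a separate load, while `{256, 128, 0, true}` is rejected as a duplicate of the low half. Whether the offset case really wins would still depend on the target and on the fold actually happening.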

https://github.com/llvm/llvm-project/pull/129695

