[llvm] [X86][SelectionDAG] Fix the Gather's base and index by modifying the Scale value (PR #134979)

Mon Apr 21 21:52:25 PDT 2025

================
@@ -4930,49 +4874,22 @@ define <16 x float> @test_gather_structpt_16f32_mask_index_offset(ptr %x, ptr %a
 ; X86-KNL-NEXT:    vptestmd %zmm0, %zmm0, %k1
 ; X86-KNL-NEXT:    movl {{[0-9]+}}(%esp), %eax
 ; X86-KNL-NEXT:    movl {{[0-9]+}}(%esp), %ecx
-; X86-KNL-NEXT:    vpslld $4, (%ecx), %zmm0
-; X86-KNL-NEXT:    vgatherdps 4(%eax,%zmm0), %zmm1 {%k1}
+; X86-KNL-NEXT:    vmovdqu64 (%ecx), %zmm0
----------------
rohitaggarwal007 wrote:

No, the TODO was just my placeholder.
Yeah, that is sound. Currently, I am not sure how we can simplify this further, but (b) should be there, since changing the scale value increases the arithmetic complexity of the address calculation. One mov and one add are introduced in place of the vpslld $1; the vector-legalized selection DAG is doing this optimization.
I will check the instruction cycle costs in the KHL SWOG and think about how to handle this case.
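
To make the scale trade-off concrete, here is a rough sketch of the per-lane address arithmetic. It assumes the original shift by 4 is split into a shift by 1 (emitted as index + index) plus a gather scale of 8. That split is my reading of the diff, not something confirmed in the patch. The function names are hypothetical, and lane-width wraparound and sign extension of the index are ignored.

```python
def addr_shift_only(base, index, disp=4):
    # Original form: vpslld $4 on the index, then gather with scale 1.
    return base + (index << 4) * 1 + disp


def addr_scaled(base, index, disp=4):
    # Assumed rewritten form: index + index (a shift by 1), then gather with scale 8.
    return base + (index + index) * 8 + disp


# Both forms compute the same per-lane address for small non-negative indices.
lanes = list(range(16))
same = all(addr_shift_only(0x1000, i) == addr_scaled(0x1000, i) for i in lanes)
print(same)
```

The two forms agree here only because nothing overflows; whether they still agree once the index is truncated to the lane width is a separate question this sketch does not cover.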

https://github.com/llvm/llvm-project/pull/134979