[llvm] [CostModel][X86] Improve cost estimation of insert_subvector shuffle patterns of legalized types (PR #119363)
Simon Pilgrim via llvm-commits
llvm-commits at lists.llvm.org
Tue Dec 10 06:17:44 PST 2024
================
@@ -18,24 +18,24 @@
define void @test_vXf64(<2 x double> %a128, <4 x double> %a256, <8 x double> %a512, <2 x double> %b128, <4 x double> %b256, <8 x double> %b512) {
; SSE-LABEL: 'test_vXf64'
-; SSE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %V256_128 = shufflevector <2 x double> %a128, <2 x double> %b128, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+; SSE-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %V256_128 = shufflevector <2 x double> %a128, <2 x double> %b128, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
; SSE-NEXT: Cost Model: Found an estimated cost of 28 for instruction: %V512_128 = shufflevector <2 x double> %a128, <2 x double> %b128, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3>
----------------
RKSimon wrote:
Yes, eventually processShuffleMasks will be able to handle it. If you look at the shuffle mask, it concatenates both v2f64 inputs TWICE into a v8f64, so for now it gets treated as a general SK_PermuteTwoSrc. Once I've updated the processShuffleMasks support, it will split the v8f64 into smaller legal types and see that the sub-shuffles are free on SSE (and cheap on AVX).
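
To illustrate the splitting idea described above, here is a rough Python sketch. It is not the processShuffleMasks implementation; the function name `classify_sub_shuffles`, its parameters, and the "copy"/"shuffle" labels are hypothetical, and it assumes the source width is a multiple of the legal width. It cuts a two-source shuffle mask into legal-width pieces and checks whether each piece is just a whole-register copy of one input (which would need no shuffle instruction).

```python
def classify_sub_shuffles(mask, src_elts, legal_elts):
    """Classify each legal-width chunk of a two-source shuffle mask.

    Hypothetical illustration only, not LLVM code.
    mask       -- shuffle indices; 0..src_elts-1 select from the first
                  input, src_elts..2*src_elts-1 select from the second
    src_elts   -- number of elements in each input vector
    legal_elts -- number of elements in one legal register
                  (assumed to divide src_elts)
    """
    kinds = []
    for start in range(0, len(mask), legal_elts):
        chunk = mask[start:start + legal_elts]
        base = chunk[0]
        # A chunk is a plain register copy when it is a consecutive run of
        # defined indices that starts on a legal-register boundary.
        is_run = min(chunk) >= 0 and all(
            idx == base + i for i, idx in enumerate(chunk)
        )
        if is_run and base % legal_elts == 0:
            src = "a" if base < src_elts else "b"
            reg = (base % src_elts) // legal_elts  # which register of that input
            kinds.append(("copy", src, reg))
        else:
            # Anything else (including undef indices, written as -1 here)
            # is conservatively treated as needing a real shuffle.
            kinds.append(("shuffle", None, None))
    return kinds


# The v8f64 mask from the test above, with v2f64 inputs and v2f64 legal type:
print(classify_sub_shuffles([0, 1, 2, 3, 0, 1, 2, 3], src_elts=2, legal_elts=2))
```

For the %V512_128 mask, every v2f64-sized piece comes out as a plain copy of %a128 or %b128, which is why the sub-shuffles would be considered free on SSE once the mask is split this way.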
https://github.com/llvm/llvm-project/pull/119363