[PATCH] D139871: [WebAssembly] Replace LOAD_SPLAT with SPLAT_VECTOR

Tue Dec 13 06:42:42 PST 2022

luke marked an inline comment as done.
luke added a comment.

================
Comment at: llvm/test/CodeGen/WebAssembly/fpclamptosat_vec.ll:302
 ; CHECK-NEXT:    i32x4.min_s
-; CHECK-NEXT:    v128.const -32768, -32768, 0, 0
+; CHECK-NEXT:    v128.const -32768, -32768, -32768, -32768
 ; CHECK-NEXT:    i32x4.max_s
----------------
This is as close as I could get to making this change NFC: I added a pattern that selects `v128.const` for splats of immediates.
The undef lanes are no longer 0. I initially tried to preserve that by selecting `splat_vector` only conditionally, but that meant changing DAGCombine and adding a target lowering info hook, and it caused some other cases to fail.
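
To illustrate why filling the undef lanes with -32768 instead of 0 should be harmless, here is a minimal Python model of the lane-wise `i32x4.max_s` step. It is only a sketch: it assumes that only the low two lanes of the result are live (inferred from the undef lanes mentioned above), and the input values and helper name are hypothetical, not taken from the test.

```python
def i32x4_max_s(a, b):
    """Lane-wise signed max over two 4-lane vectors (models i32x4.max_s)."""
    return [max(x, y) for x, y in zip(a, b)]

# Constant operand before and after the patch.
old_const = [-32768, -32768, 0, 0]
new_const = [-32768, -32768, -32768, -32768]

# Hypothetical input; lanes 2 and 3 stand in for undef data.
values = [-100000, 1234, 7, -9]

old_result = i32x4_max_s(values, old_const)
new_result = i32x4_max_s(values, new_const)

# The live lanes (0 and 1) agree; only the undef lanes may differ.
print(old_result[:2], new_result[:2])
```

Under that assumption the two constants are interchangeable for the lanes that matter, which is why the CHECK line changes without the change being strictly NFC.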

================
Comment at: llvm/test/CodeGen/WebAssembly/simd-load-splat.ll:8-14
 ; CHECK-LABEL: load_splat:
 ; CHECK-NEXT: .functype load_splat (i32, i32) -> (i32)
-; CHECK-NEXT: i32.load8_u $[[E:[0-9]+]]=, 0($0){{$}}
-; CHECK-NEXT: v128.load8_splat $push[[V:[0-9]+]]=, 0($0){{$}}
+; CHECK-NEXT: i32.load8_u $push[[E:[0-9]+]]=, 0($0){{$}}
+; CHECK-NEXT: local.tee $push[[T:[0-9]+]]=, $[[R:[0-9]+]]=, $pop[[E]]{{$}}
+; CHECK-NEXT: i8x16.splat $push[[V:[0-9]+]]=, $pop[[T]]{{$}}
 ; CHECK-NEXT: v128.store 0($1), $pop[[V]]{{$}}
+; CHECK-NEXT: return $[[R]]{{$}}
----------------
luke wrote:
> This results in one more instruction, but one less load. The resulting binary code size is probably comparable since tee is usually only two bytes.
This seems to be triggered by some reordering LLVM does with `splat_vector`.
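
As a rough check on the "two bytes" estimate quoted above, the sketch below encodes `local.tee` the way the WebAssembly binary format does: the opcode byte 0x22 followed by the local index as an unsigned LEB128 integer. The helper names are hypothetical, and this only covers the `local.tee` instruction itself, not the whole sequence in the test.

```python
def uleb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("unsigned LEB128 requires a non-negative value")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)

LOCAL_TEE = 0x22  # opcode for local.tee in the WebAssembly binary format

def encode_local_tee(local_index: int) -> bytes:
    """Opcode byte followed by the LEB128-encoded local index."""
    return bytes([LOCAL_TEE]) + uleb128(local_index)

print(len(encode_local_tee(2)))    # index below 128: one index byte
print(len(encode_local_tee(128)))  # index of 128 or more: longer encoding
```

So the instruction is two bytes whenever the local index is below 128, which matches "usually only two bytes".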

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D139871/new/

https://reviews.llvm.org/D139871