[PATCH] D108496: [WebAssembly] Lower v2f32 to v2f64 extending loads with promote_low

Sat Aug 21 21:40:16 PDT 2021

aheejin added a comment.

> Previously extra wide v4f32 to v4f64 extending loads would be legalized to v2f32
> to v2f64 extending loads, which would then be scalarized by legalization. (v2f32
> to v2f64 extending loads not produced by legalization were already being emitted
> correctly.)

Why are v2f32 to v2f64 extending loads that are not produced by legalization currently handled fine, but not the ones that are produced by legalization? I guess I lack knowledge of the order of transformations within isel.

> This regresses the addressing modes supported for
> the extloads not produced by legalization, but that's a fine trade off for now.

What would be an example of this case?

================
Comment at: llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td:1282
+// Lower extending loads to load64_zero + promote_low
+def extloadv2f32 : PatFrag<(ops node:$ptr), (unindexedload node:$ptr)> {
+  let IsLoad = true;
----------------
Any reason for using `unindexedload` instead of `load`?

================
Comment at: llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td:1283
+def extloadv2f32 : PatFrag<(ops node:$ptr), (unindexedload node:$ptr)> {
+  let IsLoad = true;
+  let IsAnyExtLoad = true;
----------------
Is `let IsLoad = true;` necessary? `unindexedload` already sets `IsLoad` to `true`.

================
Comment at: llvm/test/CodeGen/WebAssembly/simd-load-promote-wide.ll:9-11
+define <4 x double> @load_promote_v2f62(<4 x float>* %p) {
+; CHECK-LABEL: load_promote_v2f62:
+; CHECK:         .functype load_promote_v2f62 (i32, i32) -> ()
----------------

================
Comment at: llvm/test/CodeGen/WebAssembly/simd-load-promote-wide.ll:29
+  ret <4 x double> %v
+}
+
----------------
Can we do a similar optimization for integer vectors, such as extending `<4 x i32>` to `<4 x i64>`, using `v128.load64_zero` followed by `i64x2.extend_low_i32x4_s/u` (and likewise for other integer types)? Or is it possible to use `v128.load` to load everything at once and then use `i64x2.extend_low_...` and `i64x2.extend_high_...` to extend each half?

I'm just asking whether this is possible, and of course I don't mean we should do it in this CL ;)
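
To make the integer question concrete, the kind of input I have in mind would look roughly like the sketch below. This is a hypothetical test function, not part of this patch: the name `load_sext_v4i64` is made up, and I haven't checked what the backend currently emits for it.

```llvm
; Hypothetical example (not in this patch): a wide sign-extending integer load,
; the integer analogue of the v4f32 to v4f64 case tested above.
define <4 x i64> @load_sext_v4i64(<4 x i32>* %p) {
  ; Load four 32-bit lanes (one full 128-bit vector).
  %e = load <4 x i32>, <4 x i32>* %p
  ; Sign-extend each lane to 64 bits; the result needs two v128 values.
  %v = sext <4 x i32> %e to <4 x i64>
  ret <4 x i64> %v
}
```

Replacing `sext` with `zext` would give the unsigned variant of the same question.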

================
Comment at: llvm/test/CodeGen/WebAssembly/simd-offset.ll:2960-2962
+define <2 x double> @load_promote_v2f62(<2 x float>* %p) {
+; CHECK-LABEL: load_promote_v2f62:
+; CHECK:         .functype load_promote_v2f62 (i32) -> (v128)
----------------

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108496/new/

https://reviews.llvm.org/D108496