[PATCH] D107502: [WebAssembly] Legalize vector types by widening

Thu Aug 5 03:16:08 PDT 2021

aheejin added a comment.

Looks very nice! It looks it will have a significant impact on the performance. Are there possibly code patterns that will suffer from this change? If not, I wonder why was the default option in `TargetLowering` set up that way.. Maybe other architectures have more benefits when doing integer promotion? Anyway I feel I don't fully understand things yet so I added some questions.

And why were the two tests deleted?

================
Comment at: llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.h:121-123
+    // Boolean values should be promoted to full lanes.
+    if (VT.getVectorElementType() == MVT::i1)
+      return TypePromoteInteger;
----------------
Why is the exception?

================
Comment at: llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td:199-205
+def load_splat_scalar :
+  PatFrag<(ops node:$addr), (scalar_to_vector (i64 (load $addr)))>;
+defm : LoadPatNoOffset<v2i64, load_splat_scalar, "LOAD64_SPLAT">;
+defm : LoadPatImmOff<v2i64, load_splat_scalar, regPlusImm, "LOAD64_SPLAT">;
+defm : LoadPatImmOff<v2i64, load_splat_scalar, or_is_add, "LOAD64_SPLAT">;
+defm : LoadPatOffsetOnly<v2i64, load_splat_scalar, "LOAD64_SPLAT">;
+defm : LoadPatGlobalAddrOffOnly<v2i64, load_splat_scalar, "LOAD64_SPLAT">;
----------------
- If what we want is to insert the element in the index 0 of the new vector, why is the splat necessary?
- Why is only `load64_splat` necessary? What happens when we widen i8x4 into i32x4? In this case do we need `load32_splat`? Is widening only happens by a factor of 2?

================
Comment at: llvm/test/CodeGen/WebAssembly/simd-load-store-alignment.ll:297
 ; CHECK-NEXT:    local.get 0
-; CHECK-NEXT:    i16x8.load8x8_u 0:p2align=0
+; CHECK-NEXT:    v128.load64_splat 0:p2align=0
 ; CHECK-NEXT:    # fallthrough-return
----------------
It looks this actually changes the result being returned compared to the previously generate code. Is that fine?

================
Comment at: llvm/test/CodeGen/WebAssembly/simd-offset.ll:926
 ; CHECK-NEXT:    local.get 0
-; CHECK-NEXT:    v128.and
-; CHECK-NEXT:    local.get 0
-; CHECK-NEXT:    i8x16.narrow_i16x8_u
-; CHECK-NEXT:    i64x2.extract_lane 0
-; CHECK-NEXT:    i64.store 0
+; CHECK-NEXT:    v128.store64_lane 0, 0
 ; CHECK-NEXT:    # fallthrough-return
----------------
Now that we are doing this, are the narrowing store patterns added in D84377 necessary?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107502/new/

https://reviews.llvm.org/D107502