[PATCH] D147235: [AArch64] Remove redundant `mov 0` instruction for high 64-bits

Fri Mar 31 02:15:53 PDT 2023

jaykang10 added inline comments.

================
Comment at: llvm/lib/Target/AArch64/AArch64MIPeepholeOpt.cpp:618-619
+    break;
+  case AArch64::FCVTNv4i16:
+  case AArch64::SHRNv8i8_shift:
+    isSetZeroHigh64bits = true;
----------------
dmgreen wrote:
> Can we extend this to all the instructions that are like FCVTNv4i16/SHRNv8i8_shift? For example maybe these, which I think produce 64bit results and are similar to the instructions you already have:
> ```
> RSHRNv2i32_shift
> RSHRNv4i16_shift
> RSHRNv8i8_shift 
> SHRNv2i32_shift
> SHRNv4i16_shift
> SHRNv8i8_shift 
> FCVTNv2i32
> FCVTNv4i16
> ```
> 
> We might be able to get away with "Any instruction that defs a FPR64", but that might need more careful checking and there are quite a few of them. We should probably try and get these classes of instruction though, not just the exact sizes.
>Can we extend this to all the instructions that are like FCVTNv4i16/SHRNv8i8_shift? For example maybe these, which I think produce 64bit results and are similar to the instructions you already have:
Yep, they write lower 64-bits and clear high 64-bits so I think we can add them. Let me add them.

>We might be able to get away with "Any instruction that defs a FPR64", but that might need more careful checking and there are quite a few of them. We should probably try and get these classes of instruction though, not just the exact sizes.
Yep, I agree with you. It is worth to try. After committing this patch, let's check it.

================
Comment at: llvm/lib/Target/AArch64/AArch64MIPeepholeOpt.cpp:620
+  case AArch64::SHRNv8i8_shift:
+    isSetZeroHigh64bits = true;
+    break;
----------------
dmgreen wrote:
> If this return's true directly, then isSetZeroHigh64bits won't be needed and more.
Yep, let me remove it.

================
Comment at: llvm/test/CodeGen/AArch64/implicitly-set-zero-high-64-bits.ll:7
+
+define nofpclass(nan inf) <8 x half> @test1(<4 x float> noundef nofpclass(nan inf) %a) {
+; CHECK-LABEL: test1:
----------------
dmgreen wrote:
> We can probably remove all the `nofpclass(nan inf)` stuff
Yep, let me remove it.

================
Comment at: llvm/test/CodeGen/AArch64/implicitly-set-zero-high-64-bits.ll:19
+
+define <8 x i8> @test2(ptr nocapture noundef readonly %in, ptr nocapture noundef readnone %dst, <8 x i8> noundef %idx) {
+; CHECK-LABEL: test2:
----------------
dmgreen wrote:
> dst doesn't seem to be used.
You are right!
Let me remove it.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147235/new/

https://reviews.llvm.org/D147235