[PATCH] D147235: [AArch64] Remove redundant `mov 0` instruction for high 64-bits

Thu Mar 30 08:44:08 PDT 2023

dmgreen added inline comments.

================
Comment at: llvm/lib/Target/AArch64/AArch64MIPeepholeOpt.cpp:618-619
+    break;
+  case AArch64::FCVTNv4i16:
+  case AArch64::SHRNv8i8_shift:
+    isSetZeroHigh64bits = true;
----------------
Can we extend this to all the instructions that are like FCVTNv4i16/SHRNv8i8_shift? For example, maybe these, which I think produce 64-bit results and are similar to the instructions you already have:
```
RSHRNv2i32_shift
RSHRNv4i16_shift
RSHRNv8i8_shift 
SHRNv2i32_shift
SHRNv4i16_shift
SHRNv8i8_shift 
FCVTNv2i32
FCVTNv4i16
```

We might be able to get away with "any instruction that defs an FPR64", but that would need more careful checking, and there are quite a few of them. We should probably try to cover these classes of instruction, though, not just the exact sizes.

================
Comment at: llvm/lib/Target/AArch64/AArch64MIPeepholeOpt.cpp:620
+  case AArch64::SHRNv8i8_shift:
+    isSetZeroHigh64bits = true;
+    break;
----------------
If this returns true directly, then isSetZeroHigh64bits won't be needed any more.
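
As a rough illustration of the shape this could take, here is a minimal standalone sketch. It assumes a stand-in `Opc` enum in place of the generated AArch64 opcode enum so that it compiles without LLVM, and the helper name `definesZeroedHigh64Bits` is hypothetical. Only the opcode names come from the list in the earlier comment. Whether each of those opcodes really is safe to treat this way would still need checking.

```cpp
// Stand-in for the generated AArch64 opcode enum (assumption: the real
// code would switch on MI.getOpcode() and use AArch64::<name>).
namespace Opc {
enum Opcode : unsigned {
  OtherOpcode, // placeholder for any opcode not handled below
  RSHRNv2i32_shift,
  RSHRNv4i16_shift,
  RSHRNv8i8_shift,
  SHRNv2i32_shift,
  SHRNv4i16_shift,
  SHRNv8i8_shift,
  FCVTNv2i32,
  FCVTNv4i16,
};
} // namespace Opc

// Hypothetical helper: returns true for the narrowing opcodes named in
// the review. Returning straight from the switch removes the need for a
// separate isSetZeroHigh64bits flag.
static bool definesZeroedHigh64Bits(unsigned Opcode) {
  switch (Opcode) {
  case Opc::RSHRNv2i32_shift:
  case Opc::RSHRNv4i16_shift:
  case Opc::RSHRNv8i8_shift:
  case Opc::SHRNv2i32_shift:
  case Opc::SHRNv4i16_shift:
  case Opc::SHRNv8i8_shift:
  case Opc::FCVTNv2i32:
  case Opc::FCVTNv4i16:
    return true;
  default:
    return false;
  }
}
```

With this shape, each class of instruction is covered at all of its element sizes, and adding a further class is just a matter of adding more case labels.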

================
Comment at: llvm/test/CodeGen/AArch64/implicitly-set-zero-high-64-bits.ll:7
+
+define nofpclass(nan inf) <8 x half> @test1(<4 x float> noundef nofpclass(nan inf) %a) {
+; CHECK-LABEL: test1:
----------------
We can probably remove all the `nofpclass(nan inf)` stuff.

================
Comment at: llvm/test/CodeGen/AArch64/implicitly-set-zero-high-64-bits.ll:19
+
+define <8 x i8> @test2(ptr nocapture noundef readonly %in, ptr nocapture noundef readnone %dst, <8 x i8> noundef %idx) {
+; CHECK-LABEL: test2:
----------------
`dst` doesn't seem to be used.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147235/new/

https://reviews.llvm.org/D147235