[PATCH] D147235: [AArch64] Remove redundant `mov 0` instruction for high 64-bits
JinGu Kang via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Mar 31 02:15:53 PDT 2023
jaykang10 added inline comments.
================
Comment at: llvm/lib/Target/AArch64/AArch64MIPeepholeOpt.cpp:618-619
+ break;
+ case AArch64::FCVTNv4i16:
+ case AArch64::SHRNv8i8_shift:
+ isSetZeroHigh64bits = true;
----------------
dmgreen wrote:
> Can we extend this to all the instructions that are like FCVTNv4i16/SHRNv8i8_shift? For example maybe these, which I think produce 64bit results and are similar to the instructions you already have:
> ```
> RSHRNv2i32_shift
> RSHRNv4i16_shift
> RSHRNv8i8_shift
> SHRNv2i32_shift
> SHRNv4i16_shift
> SHRNv8i8_shift
> FCVTNv2i32
> FCVTNv4i16
> ```
>
> We might be able to get away with "Any instruction that defs a FPR64", but that might need more careful checking and there are quite a few of them. We should probably try and get these classes of instruction though, not just the exact sizes.
>Can we extend this to all the instructions that are like FCVTNv4i16/SHRNv8i8_shift? For example maybe these, which I think produce 64bit results and are similar to the instructions you already have:
Yep, they write lower 64-bits and clear high 64-bits so I think we can add them. Let me add them.
>We might be able to get away with "Any instruction that defs a FPR64", but that might need more careful checking and there are quite a few of them. We should probably try and get these classes of instruction though, not just the exact sizes.
Yep, I agree with you. It is worth to try. After committing this patch, let's check it.
================
Comment at: llvm/lib/Target/AArch64/AArch64MIPeepholeOpt.cpp:620
+ case AArch64::SHRNv8i8_shift:
+ isSetZeroHigh64bits = true;
+ break;
----------------
dmgreen wrote:
> If this return's true directly, then isSetZeroHigh64bits won't be needed and more.
Yep, let me remove it.
================
Comment at: llvm/test/CodeGen/AArch64/implicitly-set-zero-high-64-bits.ll:7
+
+define nofpclass(nan inf) <8 x half> @test1(<4 x float> noundef nofpclass(nan inf) %a) {
+; CHECK-LABEL: test1:
----------------
dmgreen wrote:
> We can probably remove all the `nofpclass(nan inf)` stuff
Yep, let me remove it.
================
Comment at: llvm/test/CodeGen/AArch64/implicitly-set-zero-high-64-bits.ll:19
+
+define <8 x i8> @test2(ptr nocapture noundef readonly %in, ptr nocapture noundef readnone %dst, <8 x i8> noundef %idx) {
+; CHECK-LABEL: test2:
----------------
dmgreen wrote:
> dst doesn't seem to be used.
You are right!
Let me remove it.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D147235/new/
https://reviews.llvm.org/D147235
More information about the llvm-commits
mailing list