[PATCH] D87939: [PeepholeOptimizer] Enhance the redundant COPY elimination.

Fri Sep 18 14:02:15 PDT 2020

dmgreen added inline comments.

================
Comment at: llvm/test/CodeGen/Thumb2/mve-vcvt16.ll:21-29
+; CHECK-NEXT:    vcvtt.f32.f16 s11, s1
+; CHECK-NEXT:    vcvtt.f32.f16 s7, s3
+; CHECK-NEXT:    vcvtb.f32.f16 s10, s1
+; CHECK-NEXT:    vcvtb.f32.f16 s6, s3
+; CHECK-NEXT:    vcvtt.f32.f16 s9, s0
+; CHECK-NEXT:    vcvtt.f32.f16 s5, s2
+; CHECK-NEXT:    vcvtb.f32.f16 s8, s0
----------------
hliao wrote:
> The code sequence is totally different, but based on my understanding of the ARM ISA, the two are equivalent. The previous one copies q0 to q2 and converts s8~s11 (aliased to q2) into s0~s7 (aliased to q0 and q1) as the return value. The new one first converts s0~s3 (aliased to q0, the input) into s4~s11 (aliased to q1 and q2), then moves q2 to q0 to form the return pair of q0 and q1. Please let me know whether they are really equivalent.
Yeah, sounds fine.
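
For anyone who wants to double-check the aliasing argument, here is a rough Python model of the two sequences. It is only a sketch under stated assumptions, not the backend's actual behaviour. It assumes q<n> aliases s<4n>..s<4n+3>, that each input S register holds two f16 lanes with the even lane in the bottom half, and that the new sequence ends with `vcvtb.f32.f16 s4, s2` and a q2-to-q0 move, neither of which appears in the quoted CHECK lines. The helper names are made up for illustration.

```python
# Toy register file: a list of S registers; q<n> aliases s[4n] .. s[4n+3].
# An unconverted S register holds a (bottom, top) pair of f16 lanes.

def q_regs(q):
    """Indices of the S registers aliased by Q register q."""
    return range(4 * q, 4 * q + 4)

def vmov_q(s, dst, src):
    """Model a whole-Q-register move (dst <- src)."""
    for d, r in zip(q_regs(dst), q_regs(src)):
        s[d] = s[r]

def vcvtb(s, dst, src):
    """Model vcvtb.f32.f16: widen the bottom f16 lane of s[src]."""
    s[dst] = float(s[src][0])

def vcvtt(s, dst, src):
    """Model vcvtt.f32.f16: widen the top f16 lane of s[src]."""
    s[dst] = float(s[src][1])

def initial_state(halves):
    """Place an <8 x half> argument in q0 (s0..s3); q1 and q2 start empty."""
    s = [None] * 12
    for i in range(4):
        s[i] = (halves[2 * i], halves[2 * i + 1])
    return s

def old_sequence(halves):
    """Copy q0 to q2, then convert s8..s11 into s0..s7."""
    s = initial_state(halves)
    vmov_q(s, 2, 0)
    for i in range(4):
        vcvtb(s, 2 * i, 8 + i)
        vcvtt(s, 2 * i + 1, 8 + i)
    return s[0:8]  # q0:q1 is the return value

def new_sequence(halves):
    """Convert s0..s3 into s4..s11, then move q2 to q0."""
    s = initial_state(halves)
    vcvtt(s, 11, 1)
    vcvtt(s, 7, 3)
    vcvtb(s, 10, 1)
    vcvtb(s, 6, 3)
    vcvtt(s, 9, 0)
    vcvtt(s, 5, 2)
    vcvtb(s, 8, 0)
    vcvtb(s, 4, 2)   # assumed: not shown in the quoted CHECK lines
    vmov_q(s, 0, 2)  # assumed from the description: move q2 to q0
    return s[0:8]  # q0:q1 is the return value

lanes = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
same = old_sequence(lanes) == new_sequence(lanes)
```

Under those assumptions, both orderings leave the same eight widened lanes in q0:q1. The new sequence is safe to reorder because its destinations (s4..s11) never overlap its sources (s0..s3) until the final move.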

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D87939/new/

https://reviews.llvm.org/D87939