[PATCH] D87939: [PeepholeOptimizer] Enhance the redundant COPY elimination.

Michael Liao via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Sep 18 13:34:59 PDT 2020


hliao added inline comments.


================
Comment at: llvm/test/CodeGen/Thumb2/mve-vcvt16.ll:21-29
+; CHECK-NEXT:    vcvtt.f32.f16 s11, s1
+; CHECK-NEXT:    vcvtt.f32.f16 s7, s3
+; CHECK-NEXT:    vcvtb.f32.f16 s10, s1
+; CHECK-NEXT:    vcvtb.f32.f16 s6, s3
+; CHECK-NEXT:    vcvtt.f32.f16 s9, s0
+; CHECK-NEXT:    vcvtt.f32.f16 s5, s2
+; CHECK-NEXT:    vcvtb.f32.f16 s8, s0
----------------
The code sequence is totally different, but based on my understanding of the ARM ISA, the two are equivalent. The previous sequence copies q0 to q2 and then converts s8~s11 (aliases of q2) into s0~s7 (aliases of q0 and q1) as the return value. The new sequence first converts s0~s3 (aliases of q0, the input) into s4~s11 (aliases of q1 and q2), then moves q2 to q0 to form the return pair of q0 and q1. Please let me know whether they are really equivalent.
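To make the aliasing argument easier to check, here is a minimal symbolic sketch in Python. It is only a model, not a statement about real hardware: it assumes each q register aliases four consecutive s registers (q0 = s0~s3, q1 = s4~s7, q2 = s8~s11), treats vcvtb/vcvtt as "convert the bottom/top f16 half of the source", and uses a hypothetical `vmovq` pseudo-op for the q-to-q copy. The last convert in `NEW` (s4 from s2) and the final move are not in the quoted CHECK lines and are inferred from the description above; the exact lane order in `OLD` is likewise an assumption.

```python
def run(program, num_s=12):
    """Symbolically execute a program over s0..s11; return the final s0..s7."""
    # s0..s3 hold the raw input words; the other registers start undefined.
    s = [("in", i) if i < 4 else None for i in range(num_s)]
    for op, dst, src in program:
        if op == "vcvtb":        # convert bottom f16 half of s[src] to f32
            s[dst] = ("b", s[src])
        elif op == "vcvtt":      # convert top f16 half of s[src] to f32
            s[dst] = ("t", s[src])
        elif op == "vmovq":      # hypothetical pseudo-op: copy q[src] to q[dst]
            s[4 * dst:4 * dst + 4] = s[4 * src:4 * src + 4]
        else:
            raise ValueError(f"unknown op: {op}")
    return s[0:8]                # q0 and q1 form the return value


# New sequence: convert s0~s3 into s4~s11, then move q2 to q0.
# The ("vcvtb", 4, 2) step and the final move are inferred, not quoted.
NEW = [
    ("vcvtt", 11, 1), ("vcvtt", 7, 3),
    ("vcvtb", 10, 1), ("vcvtb", 6, 3),
    ("vcvtt", 9, 0), ("vcvtt", 5, 2),
    ("vcvtb", 8, 0), ("vcvtb", 4, 2),
    ("vmovq", 0, 2),
]

# Previous sequence: copy q0 to q2, then convert s8~s11 into s0~s7.
# The lane order here is assumed for illustration.
OLD = [
    ("vmovq", 2, 0),
    ("vcvtb", 0, 8), ("vcvtt", 1, 8),
    ("vcvtb", 2, 9), ("vcvtt", 3, 9),
    ("vcvtb", 4, 10), ("vcvtt", 5, 10),
    ("vcvtb", 6, 11), ("vcvtt", 7, 11),
]

same = run(OLD) == run(NEW)
```

Under these assumptions both programs leave the same symbolic values in s0~s7, because no destination in either sequence overwrites a source that is still needed. This checks only the register bookkeeping; it does not replace confirmation from someone familiar with the MVE backend.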


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D87939/new/

https://reviews.llvm.org/D87939


