[PATCH] D123512: [MachineCombiner]: Avoid including transient instructions in latency calculation

Mon Apr 11 06:59:22 PDT 2022

malharJ created this revision.
Herald added a subscriber: hiraditya.
Herald added a project: All.
malharJ requested review of this revision.
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

The MachineCombiner pattern-matches machine instruction sequences and generates
a new instruction sequence that is hopefully more efficient.

Currently, when the MachineCombiner computes the depth of the root of the
new (transformed) instruction sequence, its latency calculation includes the
latency of transient instructions (i.e. machine instructions such as COPY
that will be removed later during register allocation).

This seems incorrect: in some cases (such as those in the affected test files)
it makes the depth of the new sequence higher than that of the old sequence.
The new sequence then appears to have a longer critical path, and the
MachineCombiner rejects the transform for efficiency reasons.

In addition, MachineTraceMetrics::Ensemble::updateDepth(), which is used to
calculate the latency/depth of the old instruction sequence, already excludes
the latency of transient instructions from its calculation.
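
The effect described above can be illustrated with a small standalone model.
This is only a sketch and is not the code from the patch: ToyInstr,
computeDepths, exampleSequence and all latency values are hypothetical names
and numbers made up for illustration. It assumes that the depth of an
instruction is the maximum, over its defining instructions, of the def's depth
plus the def's latency, and that a transient def contributes no latency when
SkipTransient is set.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy model of a machine instruction; not an LLVM type.
struct ToyInstr {
  unsigned Latency;              // cycles until the result is available
  bool Transient;                // e.g. a COPY expected to disappear later
  std::vector<std::size_t> Deps; // indices of earlier defining instructions
};

// Returns the depth (earliest issue cycle) of every instruction in Seq.
// When SkipTransient is true, a transient def adds no latency to its users.
std::vector<unsigned> computeDepths(const std::vector<ToyInstr> &Seq,
                                    bool SkipTransient) {
  std::vector<unsigned> Depth(Seq.size(), 0);
  for (std::size_t I = 0; I < Seq.size(); ++I) {
    for (std::size_t D : Seq[I].Deps) {
      assert(D < I && "dependencies must point at earlier instructions");
      unsigned DepCycle = Depth[D];
      if (!(SkipTransient && Seq[D].Transient))
        DepCycle += Seq[D].Latency;
      Depth[I] = std::max(Depth[I], DepCycle);
    }
  }
  return Depth;
}

// Made-up sequence: multiply -> COPY -> root. Latencies are illustrative.
std::vector<ToyInstr> exampleSequence() {
  return {
      {4, false, {}},  // 0: multiply
      {1, true, {0}},  // 1: COPY of the multiply result (transient)
      {2, false, {1}}, // 2: root of the sequence, consumes the COPY
  };
}
```

In this toy sequence, counting the COPY gives the root a depth of 5 cycles,
while skipping it gives 4. A difference of this kind is what can tip the
comparison against the new sequence when only one side of the comparison
counts transient instructions.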

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D123512

Files:
  llvm/lib/CodeGen/MachineCombiner.cpp
  llvm/test/CodeGen/AArch64/aarch64-combine-fmul-fsub.mir
  llvm/test/CodeGen/AArch64/neon-mla-mls.ll
