[PATCH] D136319: [GISel] Rework trunc/shl combine in a generic trunc/shift combine

Pierre van Houtryve via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Nov 17 02:50:55 PST 2022


Pierre-vh added a comment.

In D136319#3931063 <https://reviews.llvm.org/D136319#3931063>, @arsenm wrote:

> Not sure why you deleted the test

Test cases are in the new file, not sure why it shows as deleted



================
Comment at: llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp:2313-2316
+    // TODO: Fix truncstore combine to handle (trunc(lshr (trunc x), k)).
+    for (auto &User : MRI.use_instructions(DstReg))
+      if (User.getOpcode() == TargetOpcode::G_STORE)
+        return false;
----------------
arsenm wrote:
> arsenm wrote:
> > I don't understand special casing this, I'd rather just ignore it
> Do this first then?
I spent a while trying to get the truncstore combine to work properly in such cases but it's really not easy. The matching logic is a bit complex.

There's also a comment in that combine's matcher indicating it may be moved to a separate pass in the future to improve its matching logic, so I think it's better to just leave this special case in for now since:
  - Fixing the truncstore combine seems to be hard
  - This seems to be a niche case - this special case doesn't look like it causes any harm, on the contrary, it allows the combine to not interfere with truncstore.

The detailed reason is that the truncstore combine calculates the number of stores it should be merging together based on the type of the source it matches.
 - If we always ignore the trunc, it breaks the combine on AArch64 for 16 bit sources, as the source is always going to be a trunc of a `w` register. (See `trunc_i16_to_i8` testcase).
   - It would always take the 32 bit value as the source, but in those cases the 16 bit source is the correct one.
 - Conditionally ignoring the trunc (e.g. only when the src was matched from a shift) can cause premature matching of the combine.
   - e.g. if this (right-shift trunc) combine kicks in on `trunc_i64_to_i16`, it causes the shift's source to be 32 bits. We then see that we need 2 or 4 stores (depending on which source we use). As the combiner works top-down within a block, we'll find 2 stores before finding 4 of them, and the matcher has to conservatively assume it won't find more. It'll end up merging 2 stores instead of 4.

TL;DR: I would rather leave the special case in for now as removing isn't worth the effort IMO (and it doesn't seem to improve codegen quality significantly either)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136319/new/

https://reviews.llvm.org/D136319



More information about the llvm-commits mailing list