[PATCH] D120571: [CGP, AArch64] Replace zexts with shuffle that can be lowered using tbl.

Florian Hahn via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri May 12 14:10:09 PDT 2023


fhahn added a comment.



In D120571#4332504 <https://reviews.llvm.org/D120571#4332504>, @efriedma wrote:

> Regression reported at https://github.com/llvm/llvm-project/issues/62620.  I think the issue is that this transform isn't aware of widening instructions; if the <16 x i32> output is used in a way that can be optimized to use 16-bit inputs, the six ushll/ushll2 instructions are actually lowered to just two ushll/ushll2 instructions, so transforming that into tbl isn't profitable.



In D120571#4335168 <https://reviews.llvm.org/D120571#4335168>, @dmgreen wrote:

> There are quite a few places where this transform is not profitable because it blocks folds in SelectionDAG. They would be more obvious, but we don't have many tests with loops in the backend. I have been looking lately into whether it makes sense to replace it with something that happens either during or after ISel. I think after ISel should work as a (larger) peephole optimization, which should then work with both SDAG and GISel and save us from having to handle it in both places. We just need to recognize the patterns of USHLLs, and that way we only optimize if it turns out to really be useful.

Yeah, I saw the report, thanks! The current logic only tries to introduce `tbl` if at least 2 casting steps are required, but it doesn't know about widening instructions, so it misses cases where only one step will be needed.  I think we should be able to catch (hopefully) most cases using the existing logic in TTI: D150482 <https://reviews.llvm.org/D150482>
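
For illustration, the arithmetic behind the "six vs. two" observation and the "at least 2 casting steps" check can be sketched roughly as below. This is only a back-of-the-envelope model, not the code in CodeGenPrepare or TTI: the helper names (`countUshllInstrs`, `castingSteps`, `tblLooksProfitable`) are made up for this example, 128-bit NEON registers are assumed, and the `UserWidens` flag is a stand-in for "the user can be selected as a widening instruction that absorbs the last extend".

```cpp
// Hypothetical helper, not an LLVM API: number of ushll/ushll2 instructions
// needed to zero-extend NumElts elements from SrcBits to DstBits, assuming
// 128-bit registers and one doubling of the element width per step.
unsigned countUshllInstrs(unsigned NumElts, unsigned SrcBits,
                          unsigned DstBits) {
  unsigned Count = 0;
  for (unsigned Bits = SrcBits * 2; Bits <= DstBits; Bits *= 2)
    Count += (NumElts * Bits + 127) / 128; // one instruction per result register
  return Count;
}

// Hypothetical helper: number of extend steps that remain as separate
// instructions. If the user is a widening instruction (assumption modelled
// by UserWidens), the final step is folded into that user.
unsigned castingSteps(unsigned SrcBits, unsigned DstBits, bool UserWidens) {
  unsigned Steps = 0;
  for (unsigned Bits = SrcBits; Bits < DstBits; Bits *= 2)
    ++Steps;
  return (UserWidens && Steps > 0) ? Steps - 1 : Steps;
}

// Sketch of the check described above: only consider tbl when at least
// two casting steps would otherwise be required.
bool tblLooksProfitable(unsigned SrcBits, unsigned DstBits, bool UserWidens) {
  return castingSteps(SrcBits, DstBits, UserWidens) >= 2;
}
```

With these assumptions, `countUshllInstrs(16, 8, 32)` gives the six instructions for a full <16 x i8> to <16 x i32> extend, `countUshllInstrs(16, 8, 16)` gives the two that remain when the user consumes 16-bit inputs, and `tblLooksProfitable(8, 32, /*UserWidens=*/true)` is false, which is the case the current logic misses.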


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120571/new/

https://reviews.llvm.org/D120571