[llvm] [clang-tools-extra] [clang] [InstCombine] Convert or concat to fshl if opposite or concat exists (PR #68502)
via cfe-commits
cfe-commits at lists.llvm.org
Wed Nov 15 16:22:03 PST 2023
HaohaiWen wrote:
> > Thanks for the updated example!
> >
> > To explain what I meant in my first comment using this example: We would perform the transform https://alive2.llvm.org/ce/z/nllcB_, which does not depend at all on how `%yx` is constructed, or on whether there is any way to form the `fshl` separately. If `%yx` is appropriately constructed, the `fshl` can be removed (https://alive2.llvm.org/ce/z/B_KOwv, another missing transform).
> >
> > Is this not a viable approach? Is there a concern here that generating both fshl and bitreverse may be non-profitable for targets without bitreverse? Or maybe supporting this makes the matching too expensive?
>
> It's absolutely a feasible solution.
>
> --------------------------------------
>
> Solution 1:
> First optimize bitreverse then eliminate redundant fshl: https://alive2.llvm.org/ce/z/g_gWf3
> This requires:
> a) Teaching collectBitParts not only to search until it hits an unknown opcode, but also to try using itself as the root.
> b) Teaching recognizeBSwapOrBitReverseIdiom to recognize the bit pattern [n/2-1, ..., 1, 0, n-1, n-2, ..., n/2], then insert bitreverse and fshl.
> c) Teaching InstCombine to remove a redundant fshl if the opposite concat exists. This requires scanning def-use chains.
>
> Advantage:
> 1). Even if we can't eliminate the fshl, we can still optimize a bunch of IR to fshl+bitreverse. I don't know whether it's profitable for most targets.
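
A small Python model can make the bit pattern in step b) of Solution 1 concrete. This is only a sketch: `bitreverse`, `fshl`, `from_provenance` and `half_reversed_pattern` are hypothetical helpers written for this note, not LLVM APIs, and it assumes the pattern lists the source bit for each result bit starting from bit 0 (the LSB). Under that reading, the pattern matches a bitreverse followed by a half-width fshl of the value with itself.

```python
def bitreverse(v, n):
    """Reverse the low n bits of v (model of llvm.bitreverse on an n-bit value)."""
    out = 0
    for i in range(n):
        if (v >> i) & 1:
            out |= 1 << (n - 1 - i)
    return out


def fshl(a, b, s, n):
    """Model of llvm.fshl on n-bit values: concatenate a:b (a is the high half),
    shift left by s mod n, and keep the high n bits."""
    s %= n
    mask = (1 << n) - 1
    return ((((a << n) | b) << s) >> n) & mask


def from_provenance(v, prov):
    """Build a value whose bit i (bit 0 = LSB) copies bit prov[i] of v.
    The LSB-first ordering is an assumption about the notation in the thread."""
    out = 0
    for i, src in enumerate(prov):
        out |= ((v >> src) & 1) << i
    return out


def half_reversed_pattern(n):
    """The pattern [n/2-1, ..., 1, 0, n-1, n-2, ..., n/2] for an even width n."""
    h = n // 2
    return list(range(h - 1, -1, -1)) + list(range(n - 1, h - 1, -1))


def pattern_is_bitreverse_plus_fshl(n):
    """Exhaustively check the claimed equivalence for every n-bit value."""
    prov = half_reversed_pattern(n)
    for v in range(1 << n):
        r = bitreverse(v, n)
        if from_provenance(v, prov) != fshl(r, r, n // 2, n):
            return False
    return True
```

For example, `pattern_is_bitreverse_plus_fshl(8)` checks all 256 inputs of an 8-bit value. The check says nothing about profitability on a given target, which is the open question above.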
>
> --------------------------------------
> Solution 2:
> First optimize `or` to fshl, then optimize bitreverse: https://alive2.llvm.org/ce/z/WbzJVo
> This requires:
> a) What we did in this PR. This requires scanning def-use chains.
> b) Same as step a) in Solution 1.
>
> Advantage:
> 1). Can optimize more opposite-concat patterns to fshl. This is beneficial for targets with a rotate instruction (e.g. rol on x86).
> 2). Easier to implement: it does not require step b) of Solution 1.
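
The identity behind step a) of Solution 2 can be sketched in a few lines of Python. The helpers `concat` and `fshl` below are hypothetical models written for this note (they are not LLVM APIs), and the names `xy` and `yx` only mirror the `%yx` naming used in the thread. The point is that the opposite concat is a half-width rotate of the existing concat, so it can be expressed as an fshl of `xy` with itself.

```python
def concat(hi, lo, half):
    """Model of the 'or concat': (zext(hi) << half) | zext(lo),
    where hi and lo are half-bit values."""
    mask = (1 << half) - 1
    return ((hi & mask) << half) | (lo & mask)


def fshl(a, b, s, n):
    """Model of llvm.fshl on n-bit values: concatenate a:b (a is the high half),
    shift left by s mod n, and keep the high n bits."""
    s %= n
    mask = (1 << n) - 1
    return ((((a << n) | b) << s) >> n) & mask


def opposite_concat_is_rotate(half):
    """Exhaustively check concat(y, x) == fshl(xy, xy, half) for all half-bit x, y."""
    n = 2 * half
    for x in range(1 << half):
        for y in range(1 << half):
            xy = concat(x, y, half)
            yx = concat(y, x, half)
            if yx != fshl(xy, xy, half, n):
                return False
    return True


# With 8-bit halves: x = 0xAB, y = 0xCD
xy = concat(0xAB, 0xCD, 8)   # 0xABCD
yx = fshl(xy, xy, 8, 16)     # 0xCDAB, the same value as concat(0xCD, 0xAB, 8)
```

A rotate by half the width is its own inverse, so either concat can be rebuilt from the other. This is why the transform only makes sense when the opposite concat already exists.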
>
> --------------------------------------
>
> Both solutions require scanning def-use chains. I don't think this is an issue.
> Both solutions can handle my cases. Solution 2 is easier to implement. Any concerns about this PR?
> I think b) in Solution 1 can be implemented in the future if we want the advantages of both Solution 1 and Solution 2. InstCombine will always try to match fshl first, then bitreverse. Therefore, with Solution 2 plus b) of Solution 1, we don't need to implement c) of Solution 1 at all.
>
> Ref for bitreverse optimization: https://github.com/llvm/llvm-project/blob/38b34c61e028751b6778493d6185d07a8af1a3b5/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp#L2686
Comments?
https://github.com/llvm/llvm-project/pull/68502