[llvm] [AMDGPU] Optimize rotate instruction selection patterns (PR #143551)

Aleksandar Spasojevic via llvm-commits llvm-commits at lists.llvm.org
Mon Jul 21 08:05:41 PDT 2025


aleksandar-amd wrote:

Analysis:

    **First Approach:**
        Make ROTR and FSHR legal. ROTL and FSHL are not legal and are legalized into ROTR and FSHR, respectively. ROTR and FSHR are then expanded into the (SHR, SHL, OR) pattern during instruction selection, based on divergence information. The problem is that this expansion happens after the combining passes, so optimizations such as constant folding never run on the expanded code.
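
For reference, here is a minimal scalar sketch of what the (SHR, SHL, OR) expansion computes for 32-bit values. The function names `rotr32`, `rotl32`, and `fshr32` are illustrative only and are not taken from the PR or from LLVM; the code models the arithmetic, not the AMDGPU lowering itself.

```cpp
#include <cstdint>

// Rotate right, written in the SHR/SHL/OR form.
// Both shift amounts are masked to [0, 31], so n == 0 is well defined.
constexpr uint32_t rotr32(uint32_t x, uint32_t n) {
  return (x >> (n & 31u)) | (x << ((32u - n) & 31u));
}

// Rotate left expressed through rotate right: rotl(x, n) == rotr(x, -n).
constexpr uint32_t rotl32(uint32_t x, uint32_t n) {
  return rotr32(x, 0u - n);
}

// Funnel shift right: take the 64-bit value hi:lo, shift it right by
// n mod 32, and keep the low 32 bits.
constexpr uint32_t fshr32(uint32_t hi, uint32_t lo, uint32_t n) {
  const uint32_t s = n & 31u;
  // A direct hi << (32 - s) would shift by 32 when s == 0, which is
  // undefined, so the shift is split into (hi << 1) << (31 - s).
  return (lo >> s) | ((hi << 1) << (31u - s));
}

// A rotate is a funnel shift whose two inputs are the same value.
static_assert(rotr32(0x12345678u, 8) == fshr32(0x12345678u, 0x12345678u, 8),
              "rotr(x, n) == fshr(x, x, n)");
```

With a constant shift amount, each of these folds down to two shifts and an OR, which is the kind of simplification that is lost when the expansion happens after the combiners have run.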

    **Second Approach:**
        Ideally, these operations would be expanded selectively in the Legalizer based on divergence information, but this is not possible because GlobalISel has no divergence information in the Legalizer. Making FSHR and ROTR not legal therefore expands them unconditionally. This causes a problem in the SDAG combiner: in visitOR, the MatchRotate method combines (SHL, SHR, OR) patterns into ROTR/FSHR only when some of these operations are legal. Since none are legal in this case, the patterns cannot be recombined in the regular way without deviating from LLVM coding paradigms.

    **Third Approach:**
        Let ROTR be illegal and FSHR be legal. In this scenario, ROTR is expanded in the legalizer pass and can later be selectively recombined in the SDAG combiner, using a hook function that performs the divergence check for the AMDGPU target. However, this leaves FSHR unexpanded in the legalizer. One option was custom legalization for FSHR, making it selectively legal based on divergence information in SDAG, but this is impractical: an operation should be either legal or illegal, in keeping with LLVM coding paradigms. Another attempt was to expand FSHR into the (SHL, SHR, OR) pattern in the combiner pass, but this contradicts the combiner's purpose, which is to combine, not expand.

    **Fourth Approach:**
        If the SDAG solution involves expanding in the combiner pass, the corresponding step in GlobalISel was to keep ROTR and FSHR legal and perform the expansion in the new-regbank-select pass, where divergence information becomes available, allowing selective expansion.

    **Fifth Approach:**
        Make ROTR and FSHR illegal and let the legalizer expand them. Write a comprehensive pattern in the instruction selection pass that catches the (SHR, SHL, OR) patterns produced by expanding ROTR, ROTL, FSHR, and FSHL. The pattern is too complex for TableGen, so it is written in .cpp code for both SDAG and GlobalISel. Since FSHR and ROTR are illegal, the MIR tests that include these operations are deleted. The pattern is extensive because it covers both the FSHR and the ROTR expansions, as well as the cases where the combiner has optimized the pattern with constant literals. A separate PR was opened for this approach. Currently, the pattern does not work for the ROTL and FSHL expansions. I need suggestions on whether to proceed with this approach, complete the pattern, and refactor the code.
       
       Link to the PR for the fifth approach:
       https://github.com/llvm/llvm-project/pull/149817
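
To make the shape of such a hand-written matcher concrete, here is a small sketch over a toy expression tree. Everything in it (`Op`, `Node`, `FshrMatch`, `matchFshr`) is hypothetical and invented for illustration; it does not use LLVM's SDNode or MachineInstr APIs and is not the code from either PR. It only handles the constant-amount form `or (srl lo, c), (shl hi, 32 - c)`, and reports a rotate when both shifts read the same value.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical toy IR, standing in for a real selection DAG or MIR.
enum class Op { Value, Const, Shl, Srl, Or };

struct Node {
  Op op;
  const Node *lhs = nullptr;  // first operand (unused for Value/Const)
  const Node *rhs = nullptr;  // second operand (unused for Value/Const)
  uint32_t imm = 0;           // payload for Const nodes
};

struct FshrMatch {
  const Node *hi;   // value shifted left
  const Node *lo;   // value shifted right
  uint32_t amount;  // right-shift amount in [1, 31]
  bool isRotate() const { return hi == lo; }
};

// Matches `shiftOp x, c` where c is a constant node.
static bool matchShiftByConst(const Node *n, Op shiftOp, const Node *&x,
                              uint32_t &c) {
  if (!n || n->op != shiftOp || !n->rhs || n->rhs->op != Op::Const)
    return false;
  x = n->lhs;
  c = n->rhs->imm;
  return true;
}

// Recognizes or (srl lo, c), (shl hi, 32 - c) in either operand order.
std::optional<FshrMatch> matchFshr(const Node *n) {
  if (!n || n->op != Op::Or)
    return std::nullopt;
  for (int swap = 0; swap < 2; ++swap) {  // OR is commutative
    const Node *a = swap ? n->rhs : n->lhs;
    const Node *b = swap ? n->lhs : n->rhs;
    const Node *lo = nullptr, *hi = nullptr;
    uint32_t cr = 0, cl = 0;
    if (matchShiftByConst(a, Op::Srl, lo, cr) &&
        matchShiftByConst(b, Op::Shl, hi, cl) &&
        cr >= 1 && cr <= 31 && cr + cl == 32)
      return FshrMatch{hi, lo, cr};
  }
  return std::nullopt;
}
```

A real matcher would also have to cover variable shift amounts, the forms left behind after constant folding, and the ROTL/FSHL expansions, which is where the complexity described above comes from.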


https://github.com/llvm/llvm-project/pull/143551

