[llvm] [AArch64] use `isTRNMask` to calculate shuffle costs (PR #171524)

via llvm-commits llvm-commits at lists.llvm.org
Fri Dec 12 09:26:17 PST 2025


github-actions[bot] wrote:

# Windows x64 Test Results

* 128608 tests passed
* 2810 tests skipped
* 1 test failed

## Failed Tests

### LLVM
<details>
<summary>LLVM.CodeGen/AMDGPU/GlobalISel/buffer-load-byte-short.ll</summary>

```
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 2
c:\_work\llvm-project\llvm-project\build\bin\llc.exe -mtriple=amdgcn-amd-amdhsa -global-isel -new-reg-bank-select -mcpu=gfx1200 < C:\_work\llvm-project\llvm-project\llvm\test\CodeGen\AMDGPU\GlobalISel\buffer-load-byte-short.ll | c:\_work\llvm-project\llvm-project\build\bin\filecheck.exe --check-prefix=GFX12 C:\_work\llvm-project\llvm-project\llvm\test\CodeGen\AMDGPU\GlobalISel\buffer-load-byte-short.ll
# executed command: 'c:\_work\llvm-project\llvm-project\build\bin\llc.exe' -mtriple=amdgcn-amd-amdhsa -global-isel -new-reg-bank-select -mcpu=gfx1200
# note: command had no output on stdout or stderr
# executed command: 'c:\_work\llvm-project\llvm-project\build\bin\filecheck.exe' --check-prefix=GFX12 'C:\_work\llvm-project\llvm-project\llvm\test\CodeGen\AMDGPU\GlobalISel\buffer-load-byte-short.ll'
# .---command stderr------------
# | C:\_work\llvm-project\llvm-project\llvm\test\CodeGen\AMDGPU\GlobalISel\buffer-load-byte-short.ll:160:15: error: GFX12-NEXT: expected string not found in input
# | ; GFX12-NEXT: s_wait_alu 0xf1ff
# |               ^
# | <stdin>:757:28: note: scanning from here
# |  v_readfirstlane_b32 s7, v3
# |                            ^
# | <stdin>:758:2: note: possible intended match here
# |  s_wait_alu depctr_va_sdst(0)
# |  ^
# | C:\_work\llvm-project\llvm-project\llvm\test\CodeGen\AMDGPU\GlobalISel\buffer-load-byte-short.ll:192:15: error: GFX12-NEXT: expected string not found in input
# | ; GFX12-NEXT: s_wait_alu 0xf1ff
# |               ^
# | <stdin>:864:23: note: scanning from here
# |  s_mov_b32 s7, exec_lo
# |                       ^
# | <stdin>:865:2: note: possible intended match here
# |  s_wait_alu depctr_va_sdst(0)
# |  ^
# | C:\_work\llvm-project\llvm-project\llvm\test\CodeGen\AMDGPU\GlobalISel\buffer-load-byte-short.ll:223:15: error: GFX12-NEXT: expected string not found in input
# | ; GFX12-NEXT: s_wait_alu 0xf1ff
# |               ^
# | <stdin>:970:28: note: scanning from here
# |  v_readfirstlane_b32 s9, v5
# |                            ^
# | <stdin>:971:2: note: possible intended match here
# |  s_wait_alu depctr_va_sdst(0)
# |  ^
# | 
# | Input file: <stdin>
# | Check file: C:\_work\llvm-project\llvm-project\llvm\test\CodeGen\AMDGPU\GlobalISel\buffer-load-byte-short.ll
# | 
# | -dump-input=help explains the following input dump.
# | 
# | Input was:
# | <<<<<<
# |             .
# |             .
# |             .
# |           752: .LBB8_1: ; =>This Inner Loop Header: Depth=1 
# |           753:  v_readfirstlane_b32 s4, v0 
# |           754:  s_wait_loadcnt 0x0 
# |           755:  v_readfirstlane_b32 s5, v1 
# |           756:  v_readfirstlane_b32 s6, v2 
# |           757:  v_readfirstlane_b32 s7, v3 
# | next:160'0                                X error: no match found
# |           758:  s_wait_alu depctr_va_sdst(0) 
# | next:160'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# | next:160'1      ?                             possible intended match
# |           759:  s_delay_alu instid0(VALU_DEP_3) | instskip(NEXT) | instid1(VALU_DEP_2) 
# | next:160'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |           760:  v_cmp_eq_u64_e32 vcc_lo, s[4:5], v[0:1] 
# | next:160'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |           761:  v_cmp_eq_u64_e64 s1, s[6:7], v[2:3] 
# | next:160'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |           762:  s_and_b32 s1, vcc_lo, s1 
# | next:160'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~
# |           763:  s_delay_alu instid0(SALU_CYCLE_1) 
# | next:160'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |             .
# |             .
# |             .
# |           859: test_buffer_load_i8_waterfall_soffset: ; @test_buffer_load_i8_waterfall_soffset 
# | next:160'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |           860: ; %bb.0: 
# |           861:  s_mov_b32 s6, exec_lo 
# |           862: .LBB9_1: ; =>This Inner Loop Header: Depth=1 
# |           863:  v_readfirstlane_b32 s8, v1 
# |           864:  s_mov_b32 s7, exec_lo 
# | next:192'0                           X error: no match found
# |           865:  s_wait_alu depctr_va_sdst(0) 
# | next:192'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# | next:192'1      ?                             possible intended match
# |           866:  v_cmpx_eq_u32_e64 s8, v1 
# | next:192'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~
# |           867:  s_wait_loadcnt 0x0 
# | next:192'0     ~~~~~~~~~~~~~~~~~~~~
# |           868:  buffer_load_i8 v2, v0, s[0:3], s8 offen 
# | next:192'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |           869:  ; implicit-def: $vgpr1 
# | next:192'0     ~~~~~~~~~~~~~~~~~~~~~~~~
# |           870:  ; implicit-def: $vgpr0 
# | next:192'0     ~~~~~~~~~~~~~~~~~~~~~~~~
# |             .
# |             .
# |             .
# |           965:  v_readfirstlane_b32 s4, v0 
# |           966:  s_wait_loadcnt 0x0 
# |           967:  v_readfirstlane_b32 s5, v1 
# |           968:  v_readfirstlane_b32 s6, v2 
# |           969:  v_readfirstlane_b32 s7, v3 
# |           970:  v_readfirstlane_b32 s9, v5 
# | next:223'0                                X error: no match found
# |           971:  s_wait_alu depctr_va_sdst(0) 
# | next:223'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# | next:223'1      ?                             possible intended match
# |           972:  v_cmp_eq_u64_e32 vcc_lo, s[4:5], v[0:1] 
# | next:223'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |           973:  s_delay_alu instid0(VALU_DEP_3) | instskip(NEXT) | instid1(VALU_DEP_3) 
# | next:223'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |           974:  v_cmp_eq_u64_e64 s2, s[6:7], v[2:3] 
# | next:223'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |           975:  v_cmp_eq_u32_e64 s3, s9, v5 
# | next:223'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |           976:  s_and_b32 s2, vcc_lo, s2 
# | next:223'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~
# |             .
# |             .
# |             .
# | >>>>>>
# `-----------------------------
# error: command failed with exit status: 1

--

```
</details>

If these failures are unrelated to your changes (for example, the tests are broken or flaky at HEAD), please open an issue at https://github.com/llvm/llvm-project/issues and add the `infrastructure` label.

https://github.com/llvm/llvm-project/pull/171524

