[all-commits] [llvm/llvm-project] ae6dbe: [AMDGPU] Use correct DWord for v_dot4 S0 operand ...

Jeffrey Byrnes via All-commits all-commits at lists.llvm.org
Wed Nov 6 20:48:42 PST 2024


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: ae6dbed5943d76c61fe95107c15a46f915180772
      https://github.com/llvm/llvm-project/commit/ae6dbed5943d76c61fe95107c15a46f915180772
  Author: Jeffrey Byrnes <jeffrey.byrnes at amd.com>
  Date:   2024-11-06 (Wed, 06 Nov 2024)

  Changed paths:
    M llvm/lib/Target/AMDGPU/SIISelLowering.cpp
    M llvm/test/CodeGen/AMDGPU/idot4s.ll

  Log Message:
  -----------
  [AMDGPU] Use correct DWord for v_dot4 S0 operand (#115224)

Fixes a copy-paste typo.

The typo produced incorrect v_perm-based operands for the v_dot4
combine. When adding a byte pair to the v_dot byte pair chains, we must
record the byte position of each byte in its corresponding source node.
These byte positions are used to extract the correct DWord from the
ultimate source and to formulate a correct perm_mask from the extracted
DWord.

With the typo, the S0 byte would use the DWord offset for the
corresponding S1 byte. If this offset differed from the true DWord
offset for the S0 byte, we would extract and use the wrong byte for S0
in the v_dot.
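
The failure mode described above can be illustrated with a small
standalone sketch. This is not the code in SIISelLowering.cpp: the
helper names (extractDWord, selectByte, operandByte), the 64-bit source
value, and the byte positions are all hypothetical and chosen only to
show how taking the DWord offset from the S1 byte position selects the
wrong S0 byte.

```cpp
#include <cstdint>

// Hypothetical model: a 64-bit source viewed as two 32-bit DWords.
// BytePos is a byte index into the source and is assumed to be < 8.

// Extract the DWord that contains byte BytePos.
constexpr uint32_t extractDWord(uint64_t Src, unsigned BytePos) {
  return static_cast<uint32_t>(Src >> (32 * (BytePos / 4)));
}

// Select the byte at BytePos within an already extracted DWord.
constexpr uint8_t selectByte(uint32_t DWord, unsigned BytePos) {
  return static_cast<uint8_t>(DWord >> (8 * (BytePos % 4)));
}

// DWordPos picks the DWord, BytePos picks the byte inside it.
// Passing the same position for both is the correct usage.
constexpr uint8_t operandByte(uint64_t Src, unsigned DWordPos,
                              unsigned BytePos) {
  return selectByte(extractDWord(Src, DWordPos), BytePos);
}

// Illustrative values only: bytes 0..7 of S0Src are 0x11..0x88.
constexpr uint64_t S0Src = 0x8877665544332211ULL;
constexpr unsigned S0BytePos = 5; // lives in DWord 1
constexpr unsigned S1BytePos = 1; // lives in DWord 0

// Correct: the DWord offset comes from the S0 byte position.
static_assert(operandByte(S0Src, S0BytePos, S0BytePos) == 0x66,
              "S0 byte taken from the right DWord");

// Mimics the typo: the DWord offset comes from the S1 byte position,
// so the byte is read from DWord 0 instead of DWord 1.
static_assert(operandByte(S0Src, S1BytePos, S0BytePos) == 0x22,
              "wrong DWord yields the wrong S0 byte");
```

When both byte positions fall in the same DWord, the two calls return
the same byte, which is consistent with the note above that the bug
only shows up when the offsets differ.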

Fixes https://github.com/llvm/llvm-project/issues/112941


