[all-commits] [llvm/llvm-project] 4fa8a5: [AMDGPU] Add sanity check that fixes bad shift ope...

Fri Aug 11 12:26:49 PDT 2023

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 4fa8a5487e3b1a4b2ce743b0008a912026aa3524
      https://github.com/llvm/llvm-project/commit/4fa8a5487e3b1a4b2ce743b0008a912026aa3524
  Author: Konrad Kusiak <konrad.kusiak at codeplay>
  Date:   2023-08-11 (Fri, 11 Aug 2023)

  Changed paths:
    M llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp
    M llvm/test/CodeGen/AMDGPU/merge-image-load.mir
    M llvm/test/CodeGen/AMDGPU/merge-image-sample.mir

  Log Message:
  -----------
  [AMDGPU] Add sanity check that fixes bad shift operation in AMD backend

There is a problem with the
SILoadStoreOptimizer::dmasksCanBeCombined() function that can lead to
UB.

This boolean function decides if two masks can be combined into 1. The
idea here is that the bits which are "on" in one mask, don't overlap
with the "on" bits of the other. Consider an example (10 bits for
simplicity):

Mask 1: 0101101000
Mask 2: 0000000110

Those can be combined into a single mask: 0101101110.

To check if such an operation is possible, the code takes the mask
which is greater and counts how many 0s there are, starting from the
LSB and stopping at the first 1. Then, it shifts 1u by this number and
compares it with the smaller mask. The problem is that when both masks
are 0, the counter will find 32 zeroes in the first mask and will try
to do a shift by 32 positions which leads to UB.

The fix is a simple sanity check, if the bigger mask is 0 or not.

https://reviews.llvm.org/D155051