[all-commits] [llvm/llvm-project] 4fa8a5: [AMDGPU] Add sanity check that fixes bad shift ope...
Matt Arsenault via All-commits
all-commits at lists.llvm.org
Fri Aug 11 12:26:49 PDT 2023
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 4fa8a5487e3b1a4b2ce743b0008a912026aa3524
https://github.com/llvm/llvm-project/commit/4fa8a5487e3b1a4b2ce743b0008a912026aa3524
Author: Konrad Kusiak <konrad.kusiak at codeplay>
Date: 2023-08-11 (Fri, 11 Aug 2023)
Changed paths:
M llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp
M llvm/test/CodeGen/AMDGPU/merge-image-load.mir
M llvm/test/CodeGen/AMDGPU/merge-image-sample.mir
Log Message:
-----------
[AMDGPU] Add sanity check that fixes bad shift operation in AMD backend
There is a problem with the
SILoadStoreOptimizer::dmasksCanBeCombined() function that can lead to
undefined behavior (UB).
This boolean function decides whether two masks can be combined into
one. The idea is that the bits which are "on" in one mask must not
overlap with the "on" bits of the other. Consider an example (10 bits
for simplicity):
Mask 1: 0101101000
Mask 2: 0000000110
Those can be combined into a single mask: 0101101110.
To check whether such an operation is possible, the code takes the
greater mask and counts its trailing zeros, starting from the LSB and
stopping at the first 1. It then shifts 1u left by this count and
compares the result with the smaller mask, which must be strictly less
than the shifted value for the masks to be combinable. The problem is
that when both masks are 0, the counter finds 32 zeros in the greater
mask and the code shifts by 32 positions, which is UB for a 32-bit
operand.
The fix is a simple sanity check that tests whether the greater mask
is 0.
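As a rough illustration of the check described above, here is a minimal,
self-contained C++ sketch. It is not the code from
SILoadStoreOptimizer.cpp: the names countTrailingZeros32 and
masksCanBeCombined are hypothetical, the trailing-zero counter is a
hand-written stand-in for whatever helper the backend uses, and the
choice to return false when the greater mask is 0 is an assumption
about how the guard rejects that case.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper: counts zero bits starting from the LSB and
// stopping at the first 1. Returns 32 when Value is 0, which is the
// case that triggers the bad shift described in the log message.
static unsigned countTrailingZeros32(std::uint32_t Value) {
  unsigned Count = 0;
  while (Count < 32 && ((Value >> Count) & 1u) == 0)
    ++Count;
  return Count;
}

// Sketch of the overlap check with the sanity check in place.
// Assumption: a zero greater mask means "cannot be combined".
static bool masksCanBeCombined(std::uint32_t MaskA, std::uint32_t MaskB) {
  std::uint32_t MaxMask = std::max(MaskA, MaskB);
  std::uint32_t MinMask = std::min(MaskA, MaskB);

  // Guard: if the greater mask is 0, both masks are 0 and the count
  // below would be 32, making the shift undefined.
  if (MaxMask == 0)
    return false;

  // The smaller mask must fit entirely in the trailing zero bits of
  // the greater mask.
  unsigned AllowedBitsForMin = countTrailingZeros32(MaxMask);
  return (std::uint32_t{1} << AllowedBitsForMin) > MinMask;
}
```

With the masks from the example (0x168 and 0x6 in hexadecimal), the
greater mask has 3 trailing zeros, 1u << 3 is 8, and 6 is less than 8,
so the sketch reports that the masks can be combined. With both masks
equal to 0, the guard returns before any shift is evaluated.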
https://reviews.llvm.org/D155051