[PATCH] D135869: [AMDGPU][DAG] Only apply trunc/shift combine to 16 bit types
Pierre van Houtryve via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Oct 14 00:49:12 PDT 2022
Pierre-vh added a comment.
This is trying to solve a miscompilation, but it might be the wrong fix. I know for sure this combine is responsible for a miscompilation in a sample, as disabling it/removing it/applying this fix resolves the issue.
This is the DAG being wrongly combined:
t26: i64 = srl t23, Constant:i32<26>
t28: i64 = mul nuw nsw t22, Constant:i64<12345678>
t29: i64 = add nuw nsw t26, t28
t32: i64 = srl t29, Constant:i32<25>
t33: i26 = truncate t32
Combining: t33: i26 = truncate t32
Creating new node: t82: i32 = truncate t29
Creating new node: t83: i32 = srl t82, Constant:i32<25>
Creating new node: t84: i26 = truncate t83
... into: t84: i26 = truncate t83
srl is a shift-right, so it doesn't seem right to me that we truncate the source to 32 bits as it changes the output.
Maybe the error is in the VT of the first trunc? Shouldn't we have at least 2*K for shift-right?
Maybe this is a better fix? This also works to fix the miscompilation I'm seeing.
EVT MidScalarTy = MVT::i32;
// For right shifts, ensure the VT of the shift source is wide
// enough that we don't lose bits in the result.
if(Src.getOpcode() == ISD::SRL || Src.getOpcode() == ISD::SRA) {
// Don't risk losing info if we don't know the shift amount.
if(!Known.isConstant())
return SDValue();
const uint64_t ScalarWidth = Known.getConstant().getZExtValue() * 2;
if(ScalarWidth >= 64)
return SDValue();
MidScalarTy = EVT::getIntegerVT(*DAG.getContext(), ScalarWidth);
}
EVT MidVT = VT.isVector() ?
EVT::getVectorVT(*DAG.getContext(), MidScalarTy,
VT.getVectorNumElements()) : MidScalarTy;
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D135869/new/
https://reviews.llvm.org/D135869
More information about the llvm-commits
mailing list