[llvm] [AMDGPU] expand-fp: unify scalarization (NFC) (PR #158588)
Frederik Harwath via llvm-commits
llvm-commits at lists.llvm.org
Mon Sep 15 09:02:35 PDT 2025
================
@@ -1004,55 +1002,37 @@ static bool runImpl(Function &F, const TargetLowering &TLI,
return false;
for (auto &I : instructions(F)) {
- switch (I.getOpcode()) {
- case Instruction::FRem: {
- Type *Ty = I.getType();
- // TODO: This pass doesn't handle scalable vectors.
- if (Ty->isScalableTy())
- continue;
-
- if (targetSupportsFrem(TLI, Ty) ||
- !FRemExpander::canExpandType(Ty->getScalarType()))
- continue;
-
- Replace.push_back(&I);
- Modified = true;
+ Type *Ty = I.getType();
+ // TODO: This pass doesn't handle scalable vectors.
+ if (Ty->isScalableTy())
+ continue;
+ switch (I.getOpcode()) {
+ case Instruction::FRem:
+ if (!targetSupportsFrem(TLI, Ty) &&
+ FRemExpander::canExpandType(Ty->getScalarType())) {
+ enqueueInstruction(I, Replace, ReplaceVector);
----------------
frederik-h wrote:
> > That's something that I also didn't really understand. I can try to get rid of it and see what breaks, if anything.
>
> It's just because of the way the pass iterates the instructions, i.e. the IR changes invalidate the iterator. Both queues can be removed easily. I'd prefer to do this in a follow-up PR immediately after this one.
Sorry, I changed my mind. I have now removed the `ReplaceVector` queue as part of this PR, but I think we should keep the `Replace` queue, because it is the easiest way to deal with the IR changes made by the expansion.
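
To illustrate why a queue such as `Replace` is convenient, here is a minimal, self-contained sketch of the collect-then-rewrite pattern. It is only an analogy and does not use LLVM: `Inst`, `expandQueued`, and the `expanded.a`/`expanded.b` placeholder opcodes are hypothetical names, and a `std::list` stands in for the function body. The first loop only records what needs expanding; the second loop mutates the container once the scan is finished, so the scanning iterator never has to survive a rewrite.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <vector>

// Hypothetical stand-in for an IR instruction; not an LLVM type.
struct Inst {
  std::string Opcode;
};

// Phase 1 scans and queues, phase 2 rewrites. Returns true if the body changed.
static bool expandQueued(std::list<Inst> &Body) {
  // Plays the role of the `Replace` queue: remember what to expand while
  // scanning, without touching the container.
  std::vector<std::list<Inst>::iterator> Replace;
  for (auto It = Body.begin(); It != Body.end(); ++It)
    if (It->Opcode == "frem")
      Replace.push_back(It);

  // The scan is over, so inserting and erasing here cannot disturb it.
  for (auto It : Replace) {
    assert(It->Opcode == "frem" && "queue should only hold frem entries");
    // Placeholder expansion: two made-up instructions replace the original.
    Body.insert(It, Inst{"expanded.a"});
    Body.insert(It, Inst{"expanded.b"});
    Body.erase(It);
  }
  return !Replace.empty();
}
```

In the real pass the expansion changes the IR much more substantially than this toy does, which is why deferring the rewrite until after the scan is the simplest option there.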
https://github.com/llvm/llvm-project/pull/158588