[llvm] [AMDGPU] expand-fp: unify scalarization (NFC) (PR #158588)

Frederik Harwath via llvm-commits llvm-commits at lists.llvm.org
Mon Sep 15 09:02:35 PDT 2025


================
@@ -1004,55 +1002,37 @@ static bool runImpl(Function &F, const TargetLowering &TLI,
     return false;
 
   for (auto &I : instructions(F)) {
-    switch (I.getOpcode()) {
-    case Instruction::FRem: {
-      Type *Ty = I.getType();
-      // TODO: This pass doesn't handle scalable vectors.
-      if (Ty->isScalableTy())
-        continue;
-
-      if (targetSupportsFrem(TLI, Ty) ||
-          !FRemExpander::canExpandType(Ty->getScalarType()))
-        continue;
-
-      Replace.push_back(&I);
-      Modified = true;
+    Type *Ty = I.getType();
+    // TODO: This pass doesn't handle scalable vectors.
+    if (Ty->isScalableTy())
+      continue;
 
+    switch (I.getOpcode()) {
+    case Instruction::FRem:
+      if (!targetSupportsFrem(TLI, Ty) &&
+          FRemExpander::canExpandType(Ty->getScalarType())) {
+        enqueueInstruction(I, Replace, ReplaceVector);
----------------
frederik-h wrote:

> > That's something that I also didn't really understand. I can try to get rid of it and see what breaks, if anything.
> 
> It's just because of the way the pass iterates the instructions, i.e. the IR changes invalidate the iterator. Both queues can be removed easily. I'd prefer to do this in a follow-up PR immediately after this one.

Sorry, I changed my mind. I have now removed the `ReplaceVector` queue as part of this PR, but I think we should keep the `Replace` queue: collecting the instructions first and expanding them afterwards is the easiest way to deal with the IR changes that the expansion makes while the pass iterates over the function.
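
To illustrate why a single queue is still useful, here is a minimal, standalone sketch of the collect-then-expand pattern. It is a toy model and not the code from the PR: `ToyInst`, `ToyBlock`, `expandFRem`, and `runToyPass` are hypothetical names, a `std::list` stands in for the instruction list, and the three replacement opcodes are placeholders rather than the actual frem expansion.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <vector>

// Toy stand-in for an IR instruction (hypothetical, not an LLVM type).
struct ToyInst {
  std::string Opcode;
};

using ToyBlock = std::list<ToyInst>;

// Hypothetical expansion: replace one "frem" with three placeholder
// instructions, inserted in front of it, then erase the original.
void expandFRem(ToyBlock &Block, ToyBlock::iterator It) {
  assert(It->Opcode == "frem" && "queue must only hold frem instructions");
  const char *Expansion[] = {"fdiv", "trunc", "fmsub"};
  for (const char *Op : Expansion)
    Block.insert(It, ToyInst{Op});
  Block.erase(It);
}

// Phase 1 only reads the block and fills the queue; phase 2 mutates it.
// The scan in phase 1 therefore never walks a list that is changing.
bool runToyPass(ToyBlock &Block) {
  std::vector<ToyBlock::iterator> Replace;
  for (auto It = Block.begin(); It != Block.end(); ++It)
    if (It->Opcode == "frem")
      Replace.push_back(It);

  for (ToyBlock::iterator It : Replace)
    expandFRem(Block, It);

  return !Replace.empty();
}
```

The queued iterators stay valid here because `std::list` only invalidates iterators to erased elements, and each queued element is erased exactly once, by its own expansion. The real pass has to cope with more invasive changes than this model shows, which is the reason for keeping the queue rather than expanding during the scan.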

https://github.com/llvm/llvm-project/pull/158588