[PATCH] D125485: [ArgPromotion] Unify byval promotion with non-byval

Fri Jun 24 04:00:43 PDT 2022

nikic added inline comments.

================
Comment at: llvm/lib/Transforms/IPO/ArgumentPromotion.cpp:608
+    const Use *U = Worklist.pop_back_val();
+    if (isa<BitCastInst>(U->getUser())) {
+      AppendUses(U->getUser());
----------------
Put `U->getUser()` into a variable, as it's used so much?

================
Comment at: llvm/lib/Transforms/IPO/ArgumentPromotion.cpp:629
+    auto *SI = dyn_cast<StoreInst>(U->getUser());
+    if (AreStoresAllowed && SI && SI->getValueOperand() != *U) {
+      if (Optional<bool> Res =
----------------
I think this works, but the robust way to check this is `U->getOperandNo() == SI->getPointerOperandIndex()`. (We will separately visit the use in the pointer and value operands, and only consider the pointer operand a known use, while the value operand will fall through to the bailout below.)

================
Comment at: llvm/lib/Transforms/IPO/ArgumentPromotion.cpp:630
+    if (AreStoresAllowed && SI && SI->getValueOperand() != *U) {
+      if (Optional<bool> Res =
+              HandleEndUser(SI, SI->getValueOperand()->getType(),
----------------
And with the previous change, I believe you can change this to `if (!*HandleEndUser` like for LoadInst, as we show now be guaranteed that this is store based-on the argument.

================
Comment at: llvm/lib/Transforms/IPO/ArgumentPromotion.cpp:443-444
+    // succeed.
+    DominatorTree DT(*NF);
+    PromoteMemToReg(Allocas, DT, &AC);
   }
----------------
psamolysov wrote:
> psamolysov wrote:
> > aeubanks wrote:
> > > creating the DominatorTree should go through something like
> > > `function_ref<DominatorTree &(Function &F)> DTGetter`, see `AARGetter`.
> > > 
> > > but we run SROA (including mem2reg) right after argpromotion, is there a reason to run mem2reg here rather than leave it for the function simplification pipeline? then we wouldn't need to worry about AssumptionCache/DominatorTree here since they're purely used for mem2reg
> > About running `mem2reg` in the pass. I tried to remove the `mem2reg` and run a few tests with `opt -O3` (and adding the `noinline` attribute for the callee before), and see that the almost of the code of the callees is optimized out, so the idea just to remove the `mem2reg` call from the pass looks suitable. 
> > 
> > @nikic what is your opinion? Should we call the `mem2reg` inside the argument promotion pass or let the compilation pipeline call it?
> > 
> > P.S. I'm also planning to add the `Dead Argument Elimination` pass into the pipeline after landing this patch because `mem2reg` after new argument promotion leaves unused arguments.
> @aeubanks Thank you for the suggestion about `DTGetter`. 
>  Unfortunately, I cannot get why the `DTGetter` should be used. As I get, `AARGetter` is a wrapper over the `FAM.getResult<AAManager>(F)` for the new PM and is just an instance of the `LegacyAARGetter` class for the legacy PM. It's goal is to give the correct AAR depending on the used pass manager.
> 
> In our case, the `DominatorTree` is just built again for a new function after the argument promotion and doesn't depend on the used pass manager.
I'm okay with not doing the promotion here. We should probably add `mem2reg` to the tests though, to make sure we generate trivially promotable code.

================
Comment at: llvm/lib/Transforms/IPO/ArgumentPromotion.cpp:635
+              HandleEndUser(SI, SI->getValueOperand()->getType(),
+                            /* GuaranteedToExecute */ true)) {
+        if (!*Res)
----------------
psamolysov wrote:
> nikic wrote:
> > psamolysov wrote:
> > > I'm not aware about speculative stores. Should we handle the stores in the exactly same way as loads here, so should the parameter be set to `false`?
> > I think the parameter should be false here, because the store is not guaranteed to executed, and it would be confusing otherwise.
> > 
> > The relevant part here is that we're not going to speculate any stores, so we don't care about the alignment at all -- we can explicitly skip the alignment code in HandleEndUser for stores to make this clear.
>  Thank you for the suggestion. I've changed the argument's value to `true` and fix the alignment check in the `HandleEndUser` in order to let it work for loads only. Also, I've edited the comment before the check to make it clear that only loads are checked.
> 
> Two questions, please. Currently `Part` can be either a load or a store previously saved in the `ArgParts` map by the same offset. Should the condition that the `MustExecInstr` in the seen before `ArgPart` is a load (or at least no a store) also be added? Also, when the `Part.Alignment` field is recalculated - `Part.Alignment = std::max(Part.Alignment, I->getAlign());`, should we do this whenever `I` is a load only or in any case?
> 
> Thank you.
Hm, so I guess it's not quite true that we don't speculate stores, in the sense that we do insert a store to store the argument into the alloca. Of course, as this store gets promoted later, the actual alignment doesn't really matter. It's probably still best to produce correct alignment for it.

So I think the safest thing to do here is just not special case the store case at all -- or would that cause any regressions in tests?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125485/new/

https://reviews.llvm.org/D125485