[PATCH] D125009: [ArgPromotion] Use common alignment for loads in caller

Fri May 6 04:01:53 PDT 2022

nikic added inline comments.

================
Comment at: llvm/lib/Transforms/IPO/ArgumentPromotion.cpp:521
+    Align ArgAlign = Arg->getParamAlign().valueOrOne();
+    Align PartAlign = std::max(LI->getAlign(), ArgAlign);
     auto Pair = ArgParts.try_emplace(
----------------
psamolysov wrote:
> nikic wrote:
> > Isn't this incorrect for non-zero offsets?
> > 
> > Overall, I'm not sure I really understand what you're trying to achieve here. What we're interested here is whether loads are speculatable, which is the case when they are defereferenceable and aligned. This can be either because the load is guaranteed to execute, or because we have known dereferenceability/alignment knowledge. Byval deref/align will be taken into account for the latter check (allCallersPassValidPointerForArgument).
> @nikic Yes, you are absolutely right, this is incorrect for non-zero offsets. I'm going to fix it ASAP.
> 
> Initially, my intent was to copy the byval promotion behavior where the alignment of the byval argument is taken into account because if the argument is aligned by 32, for example, we should not use less alignment for the part with zero offset loading. And because the byval argument is just a copy of the original value from the caller, it's very easy to get it's alignment: just use the value from the corresponding argument definition in the callee.
> 
> But after rethinking, I see that for non-byval pointers, we should take into account not the argument alignment from the callee definition (as I did here) but actual alignment value from the caller in which we generate new load instructions. It can be considered as an error in the frontend of course, but if we have loads in the callee with less alignment for the part with zero offset than the whole structure was allocated in the caller with, and we use this less alignment in the new generated load instructions, it might be a source of errors in the compiled application. What do you think? 
I think you're mixing up two things here: Legality and optimization. I believe the current code fully handles the legality aspect for both byval and non-byval arguments. If I understand right, your proposed change here is related to optimization: If you have an `align 32` argument that is used in an `align 4` load, you want to make the promoted load use `align 32` instead.

Doing that would be fine, but I also think that this is not really the place to do it: We already have transforms that will increase load/store alignment based on known alignment (e.g. from parameter align attributes). For example, InstCombine does this (see getOrEnforceKnownAlignment). I think the alignment handling code here is already tricky enough without also trying to optimize the alignments at the same time.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125009/new/

https://reviews.llvm.org/D125009