[PATCH] D154083: [AMDGPU] Rematerialize scalar loads

Wed Jul 26 23:59:11 PDT 2023

piotr added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/SIInstrInfo.cpp:111
+  if (isVOP1(MI) || isVOP2(MI) || isVOP3(MI) || isSDWA(MI) || isSALU(MI) ||
+      (isSMRD(MI) && MI.isDereferenceableInvariantLoad())) {
     // Normally VALU use of exec would block the rematerialization, but that
----------------
arsenm wrote:
> piotr wrote:
> > piotr wrote:
> > > piotr wrote:
> > > > arsenm wrote:
> > > > > arsenm wrote:
> > > > > > piotr wrote:
> > > > > > > arsenm wrote:
> > > > > > > > piotr wrote:
> > > > > > > > > Ironically, the addition of the check `isDereferenceableInvariantLoad` makes the examples that drive this work not handled, as the S_LOADs are not marked as `dereferenceable` at the moment (but that is another issue).
> > > > > > > > Don't really need the isSMRD with the isDereferenceableInvariantLoad. We should also be able to handle any load
> > > > > > > That was my intuition as well, but I added this extra check for these reasons:
> > > > > > > - the negative tests in remat-smrd.mir contain cases where the loads would be rematted otherwise e.g. test_no_remat_s_load_dword_not_dereferenceable
> > > > > > > - the generic hook that we bypass has a similar condition for loads (https://github.com/llvm/llvm-project/blob/c304be7cfdd2261811671feb252e31222365b475/llvm/lib/CodeGen/TargetInstrInfo.cpp#L1103)
> > > > > > > 
> > > > > > > Maybe checking for MMO->isInvariant() is enough? And that would mean the negative test would turn to a positive one.
> > > > > > > 
> > > > > > Would rematerialization ever happen in a context where the use didn't have the same dominating conditions as at the def? My intuition is that it couldn't, and thus the dereferenceable check (including the generic one) would be too strong 
> > > > > As a pre-commit can you post a patch to weaken the generic check? Best to be on the same page as all targets for this
> > > > Ok, will take a look.
> > > I looked closer at that. Checked that if I modify `TargetInstrInfo::isReallyTriviallyReMaterializableGeneric` with a relaxed check (copy of MI.isDereferenceableInvariantLoad with  `&& MMO->isDereferenceable()` removed) - then that makes no difference in the existing tests.
> > > X86 relies on further checks in that function "A load from a constant PseudoSourceValue is invariant".
> > > So while that boosted my confidence that the generic code does not rely on "dereferenceable", I am not sure if any changes to the generic check are practical. That would be pretty ugly, as I'd have to pretty much copy `isDereferenceableInvariantLoad` only to have `isDereferenceable()` check removed.
> > (And AMDGPU would not benefit from any of this as we rely on our own logic anyway.)
> I don't follow how it's difficult. It's straightforward relaxing of a restricition? I don't understand how the x86 case complicates matters (also, x86 shouldn't need to special case that instance either)
It is not difficult, but I am unsure if we really want to extend MachineInstr with isInvariantLoad() that would be a copy of isDereferenceableInvariantLoad() but with the relaxed condition. I can add a private common function to avoid duplicating the code, but I do not see how we can avoid adding a new function. We cannot just edit isDereferenceableInvariantLoad because it has more uses than just the one inside isReallyTriviallyReMaterializableGeneric(), and some of them really need isDereferenceable check.

As an alternative, instead of modifying MachineInstr I could just add a new private function in TargetInstrInfo, just for the use in isReallyTriviallyReMaterializableGeneric().

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154083/new/

https://reviews.llvm.org/D154083