[PATCH] D66666: [AMDGPU] Remove unnecessary movs for v_fmac operands

Ryan Taylor via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Aug 28 12:01:58 PDT 2019


rtaylor marked an inline comment as done.
rtaylor added inline comments.


================
Comment at: lib/Target/AMDGPU/SIInstrInfo.cpp:2630-2633
+  if (!Src0Mods && !Src1Mods && !Clamp && !Omod &&
+      (ST.getConstantBusLimit(Opc) > 1 ||
+       !Src0->isReg() ||
+       !RI.isSGPRReg(MBB->getParent()->getRegInfo(), Src0->getReg()))) {
----------------
rtaylor wrote:
> nhaehnle wrote:
> > rtaylor wrote:
> > > arsenm wrote:
> > > > rtaylor wrote:
> > > > > arsenm wrote:
> > > > > > These are the exact conditions as checked above
> > > > > Yes, it is. I could create a local function that does this and replace both with that, it would be just as ugly since there are so many conditions (params) to pass, or I could pass MI and re-get all those operands, which would be the exact code that is already in the function.
> > > > Why can't you just set a bool flag inside the first condition?
> > > > 
> > > > Anyway, I would prefer if the wasn't done here at all, and SIFoldOperands took care of this
> > > Sure, no problem, if you think that looks better.
> > > 
> > > This test case doesn't seem to be handled by SIFoldOperands (V_FMAC before and V_FMAC after) SIFoldOperands, so doing this change there wouldn't do anything.
> > `ConstantBus` really shouldn't be assigned inside the if condition itself. I assume @arsenm meant to set that variable in the body.
> > 
> > That said, I also agree that this should happen in SIFoldOperands, and I don't follow your argument here. SIFoldOperands can already do this, e.g. consider
> > ```
> > define amdgpu_cs float @test2(float inreg %a, float %b, float %y, float %z) {
> > entry:
> >   %fma1 = call float @llvm.fma.f32(float %y, float %z, float %a) #3
> >   ret float %fma1
> > }
> > ```
> > The FMAC becomes an FMA in SIFoldOperands for this example. It's possible that the reason why it doesn't work in your case is that that code doesn't handle sub-registers.
> I took the "inside the first condition" literally.
> 
> Yes, you are correct, it seems it's because of subregs use in this test case.
Looks like this was done on purpose, not folding a sub register to a fully defined tied register. 


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66666/new/

https://reviews.llvm.org/D66666





More information about the llvm-commits mailing list