[PATCH] D69737: [PGO][PGSO] TargetInstrInfo part.

Tue Nov 5 13:43:05 PST 2019

yamauchi marked an inline comment as done.
yamauchi added inline comments.

================
Comment at: llvm/include/llvm/CodeGen/TargetInstrInfo.h:136-138
+  virtual MachineInstr *commuteInstructionImpl(MachineInstr &MI,
+                                               ProfileSummaryInfo *PSI,
+                                               const MachineBlockFrequencyInfo *MBFI,
----------------
davidxl wrote:
> arsenm wrote:
> > yamauchi wrote:
> > > arsenm wrote:
> > > > yamauchi wrote:
> > > > > arsenm wrote:
> > > > > > Why is this needed for commuting?
> > > > > There isn't anything particular about commuting, but wherever we make a size-vs-speed code gen choice (one in X86InstrInfo.cpp), and if we want to do it in a profile guided manner, we'd need to propagate profile (PSI/MBFI) down.
> > > > I don’t think it’s appropriate to pass this in to commuteInstruction. It should just perform the commute and not concern itself with profitability. It would be the caller’s responsibility to determine if commuting is a good idea 
> > > Around X86InstrInfo.cpp:1576,
> > > 
> > >   case X86::BLENDPDrri:
> > >   case X86::BLENDPDrri:
> > >   case X86::BLENDPSrri:
> > >   case X86::VBLENDPDrri:
> > >   case X86::VBLENDPSrri: {
> > >     // If we're optimizing for size, try to use MOVSD/MOVSS.
> > >     if (MI.getParent()->getParent()->getFunction().hasOptSize()) {
> > >        ....
> > >     }
> > > 
> > > this code tries to replace BLEND with MOVSD, if possible, under optsize.
> > > 
> > > This is unfortunately internal to Target/X86InstrInfo (TII) and isn't up to the caller, except X86InstrInfo internally checks for optsize on the function.
> > > 
> > > This patch would like to make it profile guided (eg. do this for cold code, even without optsize) and it needs access to the profile (PSI/MBFI) there **in some way**.
> > > 
> > > It doesn't seem like a good idea to pass the profile to TII per pass or function because it is TII is created once and stateless.
> > > 
> > > There doesn't seem a good way to access the profile directly from TII because the profile isn't attached to the function or the IR.
> > > 
> > > Do you see some other way to accomplish this?
> > > 
> > > We could consider instead passing "bool OptSize" to commuteInstruction. What do you think?
> > > 
> > This usage of OptSize seems broken in the first place. commuteInstruction is used, for example, in MachineCSE. This would mean optsize can potentially disable CSE for some operations, which will increase instruction count (and therefore code size) by blocking the commute.
> > 
> > AMDGPU also has situations for example where.a smaller encoding is possible, but we have a pass that uses this based on when it makes sense to switch encodings. commuteInstruction isn't making this decision.
> I agree with Matt.   Can we remove this change for commute and see if it matters for size?   The existing code is broken and we don't want this badness to be exposed at interface level.
Yeah, it looks like some CSE between blend and movsd/movss could indeed be missed depending on whether optsize.

This code came from https://reviews.llvm.org/rL336731. It may be alternatively fine to do a blend -> movsd/movss replacement in a separate pass if optsize.

Thoughts?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69737/new/

https://reviews.llvm.org/D69737