[PATCH] D54882: [AMDGPU] Add sdwa support for ADD|SUB U64 decomposed Pseudos

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Dec 2 18:18:10 PST 2018


rampitec added inline comments.


================
Comment at: lib/Target/AMDGPU/SIPeepholeSDWA.cpp:914
+  // Check if VCC is referenced in range of (MI,MISucc].
+  for (MachineBasicBlock::const_iterator I = &*std::next(MI.getIterator());
+       I != MISucc; ++I) {
----------------
ronlieb wrote:
> AlexVlx wrote:
> > ronlieb wrote:
> > > AlexVlx wrote:
> > > > ronlieb wrote:
> > > > > ronlieb wrote:
> > > > > > rampitec wrote:
> > > > > > > Ugh.. Why do you need an interator from iterator?! This may even fail if next is end(). std::next() already returns you an iterator.
> > > > > > The short answer is i have not found another mechanism to increment I to the next MI, that both compiles and does not abort.
> > > > > > 
> > > > > > In this particular context, MI and MISucc are both references to valid instructions in the basic block, and further that MISucc depends on MI,so its after MI in the instruction iteration order. 
> > > > > > 
> > > > > > This loops is not executed unless we have found both MI and MISucc, so that means we will not see MBB.end() as we will give up upon encountering MISucc.
> > > > > > 
> > > > > > Further , in practice, MI and MISucc are very close to each other, if not actually next o each other.
> > > > > > 
> > > > > > All that said, the savings from calling modifiesRegister one extra time is hopefully not too expensive in practice.
> > > > > > 
> > > > > > My preference at this point is to use one of the two following:
> > > > > > personally, i think #1 is the simplest.
> > > > > > 
> > > > > > 1) 
> > > > > > +  // Check if VCC is referenced in range of MI and MISucc.
> > > > > > +  for (MachineBasicBlock::const_iterator I = MI; I != MISucc;
> > > > > > +       I = std::next(I)) {
> > > > > > +    if (I->modifiesRegister(AMDGPU::VCC, TRI))
> > > > > > 
> > > > > > 2)
> > > > > > +  // Check if VCC is referenced in range of (MI,MISucc].
> > > > > > +  MachineBasicBlock::const_iterator I = MI;
> > > > > > +  for (++I; I != MISucc; ++I) {
> > > > > > +    if (I->modifiesRegister(AMDGPU::VCC, TRI))
> > > > > > 
> > > > > > 
> > > > > i need to use iterator instead of const_iterator
> > > > > 
> > > > >   // Check if VCC is referenced in range of (MI,MISucc].
> > > > >   for (MachineBasicBlock::iterator I = std::next(MI.getIterator());
> > > > >        I != MISucc; ++I) {
> > > > > 
> > > > Hello Ron. It is probably more hygienic (and may have spared you the pain here) to use auto in this case, since it's one of the instances where it shines. Your loop would become `for (auto I = std::next(MI.getIterator()); I != MISucc; ++I) {...}`, which also makes it robust against `MI.getIterator()` changing its return type (as unlikely as that'd be).
> > > a good suggestion, thanks. It needs to be this to compile 
> > > (MISucc needs to be an iterator, and not recompute end condition on each iteration.)
> > > 
> > >   for (auto I = std::next(MI.getIterator()), E = MISucc.getIterator(); 
> > >        I != E; ++I) {
> > >  
> > Right, I ignored that. To be fair, at this point it may be worth considering just doing the following:
> > 
> > ```
> > const auto It = std::find_if(MI.getIterator(), MISucc.getIterator(), [](const MachineInstr &I) {
> >     return I.modifiesRegister(AMDGPU::VCC, TRI);
> > });
> > 
> > if (It != MISucc.getIterator()) return;
> > 
> > // OR
> > 
> > for (auto &&I : iterator_range{MI.getIterator(), MISucc.getIterator()}) {
> >     if (I.modifiesRegister(AMDGPU::VCC, TRI);
> >         return;
> > }
> > ```
> > 
> > But it's completely your call and mostly cosmetic at this point. Apologies for the side-trip:)
> I tried both suggestions, (side note: both need to access std::next)
> the first approach runs into issues with class pointer access TRI.
> The second one gave me syntactic fits.
> 
> Just now ,i see that Stats gave it the  highly coveted LGTM.
> so i am going run a PSDB on this patch, and then merge it i the morning.
In fact find_if looks cool, but does not simplify source, reduce number of lines or simplifies syntax analysis/optimization of the source.

Meanwhile, it is somewhat strange we have to replicate this code across many places in the llvm. After all it is pretty standard we need to check a phys reg can be used in between of two iterators, yet it is only available as a standard utility function during RA as checkInterference. What might be worth is to generalize and factor it into a common utility.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D54882/new/

https://reviews.llvm.org/D54882





More information about the llvm-commits mailing list