[PATCH] D114800: [PowerPC] Replace MFVSRLD with MFVSRD when the vector is symmetrical

Stefan Pintilie via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jan 21 07:39:44 PST 2022


stefanp added a comment.

In D114800#3254289 <https://reviews.llvm.org/D114800#3254289>, @stefanp wrote:

> In D114800#3253393 <https://reviews.llvm.org/D114800#3253393>, @qiucf wrote:
>
>> In D114800#3197765 <https://reviews.llvm.org/D114800#3197765>, @stefanp wrote:
>>
>>> In D114800#3163072 <https://reviews.llvm.org/D114800#3163072>, @shchenz wrote:
>>>
>>>> Can we fix this at the place where `MFVSRLD` is generated? (in DAG-ISEL?)
>>>
>>> I see what you mean. This can be done in DAG-ISEL. 
>>> However, there are a couple of reasons why I chose to do this here.
>>>
>>> 1. I think it is easier to do here and the code is simpler to understand. We are looking to replace `MFVSRLD` so we just look for that instruction and then replace it. In DAG ISEL we would have to first figure out which extract element node eventually gets turned into this instruction and then see if it can be replaced.
>>> 2. There are situations where the choice of preceding instructions matters to whether or not we can use the other doubleword. Again, it would be a situation where we would have to figure out what a node is turned into before deciding if the doubleword is symmetrical.
>>
>> Will putting code in post-isel peephole help? See D97658 <https://reviews.llvm.org/D97658>.
>
> I think I see what you mean based on that patch. I could try to move it there and re-use `isVSXSwap`. I will try to move the code there and see what I get.

So, I ended up also fully implementing this in post-isel peephole and then I compared the implementations. The conclusion I came to is that I would prefer to leave the implementation here in the MIPeephole.

Now, here is the long answer.
I took Nemanja's comment into consideration:

> Something to consider when it comes to deciding where to do something like this is that any work we do on the SDAG:
>
> 1. Will be basic-block local obviously
> 2. Will not be useful in the future when we switch to GISel

For point number 1. the basic-block local aspect of the ISel actually causes the ISel implementation to miss opportunities. To demonstrate this I added this test: `ppc64-mfvsrld-removal.ll`.

If you look at that test in ISel the Basic Block that has the `MFVSRLD` instruction looks like this:

  SelectionDAG has 8 nodes:
    t0: ch = EntryToken
          t2: v2i64,ch = CopyFromReg t0, Register:v2i64 %0
        t4: i64 = MFVSRLD t2
      t6: ch = CopyToReg t0, Register:i64 %1, t4
    t8: ch = B BasicBlock:ch<cleanup 0x10024b22b10>, t6

There isn't enough history in that BB to allow us to transform the instruction.
On the other hand, in MIPeephole we can see where the values are coming from:

  # Machine code for function getVecSplit: IsSSA, TracksLiveness
  Function Live Ins: $x3 in %4, $v2 in %5, $v3 in %6
  
  bb.0.entry:
    successors: %bb.2(0x30000000), %bb.1(0x50000000); %bb.2(37.50%), %bb.1(62.50%)
    liveins: $x3, $v2, $v3
    %6:vrrc = COPY $v3
    %5:vrrc = COPY $v2
    %4:g8rc = COPY $x3
    %7:gprc = COPY %4.sub_32:g8rc
    %8:vrrc = XXPERMDI %5:vrrc, %5:vrrc, 2
    %0:vrrc = VADDUDM killed %8:vrrc, %5:vrrc     <---- Comes from here
    %9:crrc = CMPLWI killed %7:gprc, 0
    BCC 76, killed %9:crrc, %bb.2
    B %bb.1
  
  bb.1.if.then:
  ; predecessors: %bb.0
    successors: %bb.3(0x80000000); %bb.3(100.00%)
  
    %1:g8rc = MFVSRLD %0:vrrc
    B %bb.3
  
  bb.2.if.else:
  ; predecessors: %bb.0
    successors: %bb.3(0x80000000); %bb.3(100.00%)
  
    %10:vrrc = VADDUDM %0:vrrc, %6:vrrc
    ADJCALLSTACKDOWN 32, 0, implicit-def dead $r1, implicit $r1
    $v2 = COPY %10:vrrc
    BL8_NOP @callee, <regmask $cr2 $cr3 $cr4 $f14 $f15 $f16 $f17 $f18 $f19 $f20 $f21 $f22 $f23 $f24 $f25 $f26 $f27 $f28 $f29 $f30 $f31 $r14 $r15 $r16 $r17 $r18 $r19 $r20 $r21 $r22 $r23 $r24 $r25 and 60 more...>, implicit-def dead $lr8, implicit $rm, implicit $v2, implicit $x2, implicit-def $r1, implicit-def $x3
    ADJCALLSTACKUP 32, 0, implicit-def dead $r1, implicit $r1
    %11:g8rc_and_g8rc_nox0 = COPY $x3
    %2:g8rc = nsw ADDI8 %11:g8rc_and_g8rc_nox0, 42
  
  bb.3.cleanup:
  ; predecessors: %bb.1, %bb.2
  
    %3:g8rc = PHI %1:g8rc, %bb.1, %2:g8rc, %bb.2
    $x3 = COPY %3:g8rc
    BLR8 implicit $lr8, implicit $rm, implicit $x3
  
  # End machine code for function getVecSplit.

Therefore for this test we only catch the opportunity if we check for it in MIPeephole.

Also there are other reasons why I wanted to keep the patch as-is:

1. As Nemanja mentioned we may be moving to GISel and so this work would have to be re-done in that situation if we did it in ISel.
2. In ISel there are a number of nodes that we have to "look though" like COPY_TO_REGCLASS that makes the code more complicated. In MIPeephole a lot of those nodes have now been simplified away.
3. I didn't see any advantages to doing this earlier as I don't think that this transformation will make much of a difference to passes that run between ISel and MIPeephole.

I hope I explained this ok. Let me know if there are more questions about this.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D114800/new/

https://reviews.llvm.org/D114800



More information about the llvm-commits mailing list