[MachineCopyPropagation] Handle undef flags conservatively so that we do not remove copies that are useful after breaking some hardware dependencies
qcolombet at apple.com
Thu May 28 14:50:34 PDT 2015
Thanks for the testcase.
I have to investigate a bit more, but I believe the copy propagation is not doing the right thing.
Indeed, the copy that is killed is used as undef, but only for the first few values, i.e., the value must be preserved at some point and the pass does check for that.
Let me check this is the issue and I’ll see how it can be fixed.
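To make the concern concrete, here is a toy model of the situation: a register that is only read as <undef> in its first few uses may still need its real value later. This is a rough Python sketch, not the MachineCopyPropagation code. The Operand and Instr classes, the copy_is_removable function, and the SOME_USE / SOME_DEF instructions are all hypothetical and exist only for illustration; only the COPY and PUNPCK* lines mirror the dump quoted below.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Operand:
    reg: str
    is_def: bool = False
    is_undef: bool = False  # models the <undef> flag on a use


@dataclass
class Instr:
    opcode: str
    operands: List[Operand]


def copy_is_removable(block: List[Instr], copy_index: int) -> bool:
    """Conservative check: the COPY at copy_index may go away only if its
    destination is redefined before any read that lacks the undef flag."""
    dest = next(op.reg for op in block[copy_index].operands if op.is_def)
    for instr in block[copy_index + 1:]:
        # Uses are read before the instruction writes its defs.
        for op in instr.operands:
            if op.reg == dest and not op.is_def and not op.is_undef:
                return False  # a real read: the copied value is needed
        if any(op.reg == dest and op.is_def for op in instr.operands):
            return True  # overwritten after only undef reads
    # Reached the end of the block without a redefinition: the register
    # may be live-out, so keep the copy.
    return False


def punpck(opcode: str) -> Instr:
    # %XMM2<def,tied1> = <opcode> %XMM2<kill,tied0>, %XMM0<undef>
    return Instr(opcode, [Operand("XMM2", is_def=True),
                          Operand("XMM2"),
                          Operand("XMM0", is_undef=True)])


prefix = [
    Instr("COPY", [Operand("XMM0", is_def=True), Operand("XMM2")]),
    Instr("COPY", [Operand("XMM1", is_def=True), Operand("XMM2")]),
    punpck("PUNPCKLBWrr"),
    punpck("PUNPCKLWDrr"),
]
# Hypothetical continuations, not taken from the dump:
real_read = Instr("SOME_USE", [Operand("XMM3", is_def=True), Operand("XMM0")])
redefine = Instr("SOME_DEF", [Operand("XMM0", is_def=True)])

print(copy_is_removable(prefix + [real_read], 0))
print(copy_is_removable(prefix + [redefine], 0))
print(copy_is_removable(prefix, 0))
```

In this model the first COPY is dropped only when %XMM0 is overwritten after nothing but <undef> reads. A later read without the flag, or simply not knowing what follows, keeps it.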
> On May 28, 2015, at 7:06 AM, Pierre-Andre Saulais <pierre-andre at codeplay.com> wrote:
> Hi Quentin,
> I think I have found a possible regression in LLVM that was revealed by one of your commits (r235647) which changes the handling of undefs by MachineCopyPropagation. This occurs on X86-64 with the following code:
> %vreg92<def> = IMPLICIT_DEF; VR128:%vreg92
> %vreg91<def,tied1> = PUNPCKLBWrr %vreg78<tied0>, %vreg92; VR128:%vreg91,%vreg78,%vreg92
> %vreg94<def> = IMPLICIT_DEF; VR128:%vreg94
> %vreg93<def,tied1> = PUNPCKLWDrr %vreg91<tied0>, %vreg94; VR128:%vreg93,%vreg91,%vreg94
> %vreg95<def,tied1> = PSLLDri %vreg93<tied0>, 31; VR128:%vreg95,%vreg93
> %vreg96<def,tied1> = PSRADri %vreg95<tied0>, 31; VR128:%vreg96,%vreg95
> Later on the IMPLICIT_DEFs are turned into <undef> which, after your changes, causes MachineCopyPropagation to remove one copy:
> MOVAPSmr %RSP, 1, %noreg, 16, %noreg, %XMM0<kill>; mem:ST16[FixedStack11]
> %XMM2<def> = MOVAPSrm %RSP, 1, %noreg, 160, %noreg; mem:LD16[FixedStack2]
> %XMM0<def> = KILL %XMM2 ; This was COPY before MachineCopyPropagation
> %XMM1<def> = COPY %XMM2
> %XMM2<def,tied1> = PUNPCKLBWrr %XMM2<kill,tied0>, %XMM0<undef>
> %XMM2<def,tied1> = PUNPCKLWDrr %XMM2<kill,tied0>, %XMM0<undef>
> %XMM2<def,tied1> = PSLLDri %XMM2<kill,tied0>, 31
> %XMM2<def,tied1> = PSRADri %XMM2<kill,tied0>, 31
> One of our tests that was previously passing now fails, and the removed COPY is the only difference I can see in the generated code. Looking only at the code above, it seems that the copy is not needed, which is strange.
> I have reduced the IR that exhibits this issue to a manageable size and created a .ll file for testing. When I run this file with lli using the interpreter it passes, same with the JIT on ARM. It fails however on X86-64 using the JIT. Reverting your changes, it passes on X86-64 using the JIT.
> Do you think that it's a bug in the X86 target that was revealed by your changes or that MachineCopyPropagation is doing something wrong?
> Pierre-Andre Saulais
> Principal Software Engineer (Compilers)
> Codeplay Software Ltd
> 45 York Place, Edinburgh, EH1 3HP
> Tel: 0131 466 0503
> Fax: 0131 557 6600
> Website: http://www.codeplay.com <http://www.codeplay.com/>
> Twitter: https://twitter.com/codeplaysoft <https://twitter.com/codeplaysoft>