[llvm] r215685 - [PPC64] Add missing dependency on X2 to LDinto_toc.
Bill Wendling
isanbard at gmail.com
Sun Aug 17 22:25:39 PDT 2014
Done.
-bw
On Aug 15, 2014, at 4:07 AM, Hal Finkel <hfinkel at anl.gov> wrote:
> ----- Original Message -----
>> From: "İsmail Dönmez" <ismail at donmez.ws>
>> To: "Bill Schmidt" <wschmidt at linux.vnet.ibm.com>
>> Cc: "llvm-commits" <llvm-commits at cs.uiuc.edu>
>> Sent: Thursday, August 14, 2014 10:47:28 PM
>> Subject: Re: [llvm] r215685 - [PPC64] Add missing dependency on X2 to LDinto_toc.
>>
>> Cc'ing Bill Wendling. This should be in 3.5 because at least on
>> (open)SUSE we bootstrap clang on PPC64.
>
> Bill,
>
> This is approved. Please pull this commit into the release branch! (it fixes self-hosting).
>
> Thanks again,
> Hal
>
>>
>> On Aug 15, 2014 4:59 AM, "Bill Schmidt" < wschmidt at linux.vnet.ibm.com
>>> wrote:
>>
>>
>> Author: wschmidt
>> Date: Thu Aug 14 20:25:26 2014
>> New Revision: 215685
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=215685&view=rev
>> Log:
>> [PPC64] Add missing dependency on X2 to LDinto_toc.
>>
>> The LDinto_toc pattern has been part of 64-bit PowerPC for a long
>> time, and represents loading from a memory location into the TOC
>> register (X2). However, this pattern doesn't explicitly record that
>> it modifies that register. This patch adds the missing dependency.
>>
>> It was very surprising to me that this has never shown up as a
>> problem
>> in the past, and that we only saw this problem recently in a single
>> scenario when building a self-hosted clang. It turns out that in most
>> cases we have another dependency present that keeps the LDinto_toc
>> instruction tied in place. LDinto_toc is used for TOC restore
>> following a call site, so this is a typical sequence:
>>
>> BCTRL8 <regmask>, %CTR8<imp-use>, %RM<imp-use>, %X3<imp-use>,
>> %X12<imp-use>, %X1<imp-def>, ...
>> LDinto_toc 24, %X1
>> ADJCALLSTACKUP 96, 0, %R1<imp-def>, %R1<imp-use>
>>
>> Because the LDinto_toc is inserted prior to the ADJCALLSTACKUP, there
>> is a natural anti-dependency between the two that keeps it in place.
>>
>> Therefore we don't usually see a problem. However, in one particular
>> case, one call is followed immediately by another call, and the
>> second
>> call requires a parameter that is a TOC-relative address. This is the
>> code sequence:
>>
>> BCTRL8 <regmask>, %CTR8<imp-use>, %RM<imp-use>, %X3<imp-use>,
>> %X4<imp-use>, %X5<imp-use>, %X12<imp-use>, %X1<imp-def>, ...
>> LDinto_toc 24, %X1
>> ADJCALLSTACKUP 96, 0, %R1<imp-def>, %R1<imp-use>
>> ADJCALLSTACKDOWN 96, %R1<imp-def>, %R1<imp-use>
>> %vreg39<def> = ADDIStocHA %X2, <ga:@.str>; G8RC_and_G8RC_NOX0:%vreg39
>> %vreg40<def> = ADDItocL %vreg39<kill>, <ga:@.str>; G8RC:%vreg40
>> G8RC_and_G8RC_NOX0:%vreg39
>>
>> Note that the back-to-back stack adjustments are the same size! The
>> back end is smart enough to recognize this and optimize them away:
>>
>> BCTRL8 <regmask>, %CTR8<imp-use>, %RM<imp-use>, %X3<imp-use>,
>> %X4<imp-use>, %X5<imp-use>, %X12<imp-use>, %X1<imp-def>, ...
>> LDinto_toc 24, %X1
>> %vreg39<def> = ADDIStocHA %X2, <ga:@.str>; G8RC_and_G8RC_NOX0:%vreg39
>> %vreg40<def> = ADDItocL %vreg39<kill>, <ga:@.str>; G8RC:%vreg40
>> G8RC_and_G8RC_NOX0:%vreg39
>>
>> Now there is nothing to prevent the ADDIStocHA instruction from
>> moving
>> ahead of the LDinto_toc instruction, and because of the longest-path
>> heuristic, this is what happens.
>>
>> With the accompanying patch, %X2 is represented as an implicit def:
>>
>> BCTRL8 <regmask>, %CTR8<imp-use>, %RM<imp-use>, %X3<imp-use>,
>> %X4<imp-use>, %X5<imp-use>, %X12<imp-use>, %X1<imp-def>, ...
>> LDinto_toc 24, %X1, %X2<imp-def,dead>
>> ADJCALLSTACKUP 96, 0, %R1<imp-def,dead>, %R1<imp-use>
>> ADJCALLSTACKDOWN 96, %R1<imp-def,dead>, %R1<imp-use>
>> %vreg39<def> = ADDIStocHA %X2, <ga:@.str>; G8RC_and_G8RC_NOX0:%vreg39
>> %vreg40<def> = ADDItocL %vreg39<kill>, <ga:@.str>; G8RC:%vreg40
>> G8RC_and_G8RC_NOX0:%vreg39
>>
>> So now when the two stack adjustments are removed, ADDIStocHA is
>> prevented from being moved above LDinto_toc.
>>
>> I have not yet created a test case for this, because the original
>> failure occurs on a relatively large function that needs reduction.
>> However, this is a fairly serious bug, despite its infrequency, and I
>> wanted to get this patch onto the list as soon as possible so that it
>> can be considered for a 3.5 backport. I'll work on whittling down a
>> test case.
>>
>> Have we missed the boat for 3.5 at this point?
>>
>> Thanks,
>> Bill
>>
>> Modified:
>> llvm/trunk/lib/Target/PowerPC/PPCInstr64Bit.td
>>
>> Modified: llvm/trunk/lib/Target/PowerPC/PPCInstr64Bit.td
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCInstr64Bit.td?rev=215685&r1=215684&r2=215685&view=diff
>> ==============================================================================
>> --- llvm/trunk/lib/Target/PowerPC/PPCInstr64Bit.td (original)
>> +++ llvm/trunk/lib/Target/PowerPC/PPCInstr64Bit.td Thu Aug 14
>> 20:25:26 2014
>> @@ -802,7 +802,7 @@ def LDtocCPT: Pseudo<(outs g8rc:$rD), (i
>> [(set i64:$rD,
>> (PPCtoc_entry tconstpool:$disp, i64:$reg))]>, isPPC64;
>>
>> -let hasSideEffects = 1, isCodeGenOnly = 1, RST = 2 in
>> +let hasSideEffects = 1, isCodeGenOnly = 1, RST = 2, Defs = [X2] in
>> def LDinto_toc: DSForm_1<58, 0, (outs), (ins memrix:$src),
>> "ld 2, $src", IIC_LdStLD,
>> [(PPCload_toc ixaddr:$src)]>, isPPC64;
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140817/c40dd4a2/attachment.html>
More information about the llvm-commits
mailing list