[llvm] r215685 - [PPC64] Add missing dependency on X2 to LDinto_toc.

Hal Finkel hfinkel at anl.gov
Fri Aug 15 04:07:25 PDT 2014


----- Original Message -----
> From: "İsmail Dönmez" <ismail at donmez.ws>
> To: "Bill Schmidt" <wschmidt at linux.vnet.ibm.com>
> Cc: "llvm-commits" <llvm-commits at cs.uiuc.edu>
> Sent: Thursday, August 14, 2014 10:47:28 PM
> Subject: Re: [llvm] r215685 - [PPC64] Add missing dependency on X2 to	LDinto_toc.
> 
> Cc'ing Bill Wendling. This should be in 3.5 because at least on
> (open)SUSE we bootstrap clang on PPC64.

Bill,

This is approved. Please pull this commit into the release branch! (it fixes self-hosting).

Thanks again,
Hal

> 
> On Aug 15, 2014 4:59 AM, "Bill Schmidt" < wschmidt at linux.vnet.ibm.com
> > wrote:
> 
> 
> Author: wschmidt
> Date: Thu Aug 14 20:25:26 2014
> New Revision: 215685
> 
> URL: http://llvm.org/viewvc/llvm-project?rev=215685&view=rev
> Log:
> [PPC64] Add missing dependency on X2 to LDinto_toc.
> 
> The LDinto_toc pattern has been part of 64-bit PowerPC for a long
> time, and represents loading from a memory location into the TOC
> register (X2). However, this pattern doesn't explicitly record that
> it modifies that register. This patch adds the missing dependency.
> 
> It was very surprising to me that this has never shown up as a
> problem
> in the past, and that we only saw this problem recently in a single
> scenario when building a self-hosted clang. It turns out that in most
> cases we have another dependency present that keeps the LDinto_toc
> instruction tied in place. LDinto_toc is used for TOC restore
> following a call site, so this is a typical sequence:
> 
> BCTRL8 <regmask>, %CTR8<imp-use>, %RM<imp-use>, %X3<imp-use>,
> %X12<imp-use>, %X1<imp-def>, ...
> LDinto_toc 24, %X1
> ADJCALLSTACKUP 96, 0, %R1<imp-def>, %R1<imp-use>
> 
> Because the LDinto_toc is inserted prior to the ADJCALLSTACKUP, there
> is a natural anti-dependency between the two that keeps it in place.
> 
> Therefore we don't usually see a problem. However, in one particular
> case, one call is followed immediately by another call, and the
> second
> call requires a parameter that is a TOC-relative address. This is the
> code sequence:
> 
> BCTRL8 <regmask>, %CTR8<imp-use>, %RM<imp-use>, %X3<imp-use>,
> %X4<imp-use>, %X5<imp-use>, %X12<imp-use>, %X1<imp-def>, ...
> LDinto_toc 24, %X1
> ADJCALLSTACKUP 96, 0, %R1<imp-def>, %R1<imp-use>
> ADJCALLSTACKDOWN 96, %R1<imp-def>, %R1<imp-use>
> %vreg39<def> = ADDIStocHA %X2, <ga:@.str>; G8RC_and_G8RC_NOX0:%vreg39
> %vreg40<def> = ADDItocL %vreg39<kill>, <ga:@.str>; G8RC:%vreg40
> G8RC_and_G8RC_NOX0:%vreg39
> 
> Note that the back-to-back stack adjustments are the same size! The
> back end is smart enough to recognize this and optimize them away:
> 
> BCTRL8 <regmask>, %CTR8<imp-use>, %RM<imp-use>, %X3<imp-use>,
> %X4<imp-use>, %X5<imp-use>, %X12<imp-use>, %X1<imp-def>, ...
> LDinto_toc 24, %X1
> %vreg39<def> = ADDIStocHA %X2, <ga:@.str>; G8RC_and_G8RC_NOX0:%vreg39
> %vreg40<def> = ADDItocL %vreg39<kill>, <ga:@.str>; G8RC:%vreg40
> G8RC_and_G8RC_NOX0:%vreg39
> 
> Now there is nothing to prevent the ADDIStocHA instruction from
> moving
> ahead of the LDinto_toc instruction, and because of the longest-path
> heuristic, this is what happens.
> 
> With the accompanying patch, %X2 is represented as an implicit def:
> 
> BCTRL8 <regmask>, %CTR8<imp-use>, %RM<imp-use>, %X3<imp-use>,
> %X4<imp-use>, %X5<imp-use>, %X12<imp-use>, %X1<imp-def>, ...
> LDinto_toc 24, %X1, %X2<imp-def,dead>
> ADJCALLSTACKUP 96, 0, %R1<imp-def,dead>, %R1<imp-use>
> ADJCALLSTACKDOWN 96, %R1<imp-def,dead>, %R1<imp-use>
> %vreg39<def> = ADDIStocHA %X2, <ga:@.str>; G8RC_and_G8RC_NOX0:%vreg39
> %vreg40<def> = ADDItocL %vreg39<kill>, <ga:@.str>; G8RC:%vreg40
> G8RC_and_G8RC_NOX0:%vreg39
> 
> So now when the two stack adjustments are removed, ADDIStocHA is
> prevented from being moved above LDinto_toc.
> 
> I have not yet created a test case for this, because the original
> failure occurs on a relatively large function that needs reduction.
> However, this is a fairly serious bug, despite its infrequency, and I
> wanted to get this patch onto the list as soon as possible so that it
> can be considered for a 3.5 backport. I'll work on whittling down a
> test case.
> 
> Have we missed the boat for 3.5 at this point?
> 
> Thanks,
> Bill
> 
> Modified:
> llvm/trunk/lib/Target/PowerPC/PPCInstr64Bit.td
> 
> Modified: llvm/trunk/lib/Target/PowerPC/PPCInstr64Bit.td
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCInstr64Bit.td?rev=215685&r1=215684&r2=215685&view=diff
> ==============================================================================
> --- llvm/trunk/lib/Target/PowerPC/PPCInstr64Bit.td (original)
> +++ llvm/trunk/lib/Target/PowerPC/PPCInstr64Bit.td Thu Aug 14
> 20:25:26 2014
> @@ -802,7 +802,7 @@ def LDtocCPT: Pseudo<(outs g8rc:$rD), (i
> [(set i64:$rD,
> (PPCtoc_entry tconstpool:$disp, i64:$reg))]>, isPPC64;
> 
> -let hasSideEffects = 1, isCodeGenOnly = 1, RST = 2 in
> +let hasSideEffects = 1, isCodeGenOnly = 1, RST = 2, Defs = [X2] in
> def LDinto_toc: DSForm_1<58, 0, (outs), (ins memrix:$src),
> "ld 2, $src", IIC_LdStLD,
> [(PPCload_toc ixaddr:$src)]>, isPPC64;
> 
> 
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> 
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory




More information about the llvm-commits mailing list