[PATCH] Adds Cortex-A53 and Cortex-A57 subtargets.

Hal Finkel hfinkel at anl.gov
Wed Feb 26 15:35:46 PST 2014


----- Original Message -----
> From: "Andrew Trick" <atrick at apple.com>
> To: "Dave Estes" <cestes at codeaurora.org>
> Cc: "Jiangning Liu" <Jiangning.Liu at arm.com>, "llvm commits" <llvm-commits at cs.uiuc.edu>
> Sent: Wednesday, February 26, 2014 5:28:22 PM
> Subject: Re: [PATCH] Adds Cortex-A53 and Cortex-A57 subtargets.
> 
> 
> On Feb 26, 2014, at 2:58 PM, Dave Estes <cestes at codeaurora.org>
> wrote:
> 
> > On 02/25/2014 05:29 PM, Andrew Trick wrote:
> >> First off, in case it isn't clear, the machine model will only be
> >> used by the MachineScheduler. To enable the MachineScheduler pass
> >> your target must implement
> >> TargetSubtargetInfo::enableMachineScheduler.
> >> 
> >> I can't comment on these processors at all, so I'll just give some
> >> general advice.
> >> 
> > 
> > Few more details on AArch64 for the reviewers:
> > 
> > The MachineScheduler is turned on by default for AArch64.
> > AArch64Subtarget::enableMachineScheduler() is implemented to
> > return true.
> > 
> > I noticed the other day that the MachineScheduler can get latency
> > information from legacy itineraries as well as from the machine
> > model. The AArch64 backend currently has no subtargets that
> > implement legacy instruction itineraries, so
> > TargetSchedModel::hasInstrItineraries() will return false even
> > though the scheditins command line option defaults to true. On
> > the other hand, TargetSchedModel::hasInstrSchedModel() returns
> > true only for the Cortex-A53 subtarget since that's the only
> > machine model defined for the AArch64 backend so far.
> > 
> > This all means that the MachineScheduler pass runs for all AArch64
> > compilations, but that it works valiantly without latency
> > information most of the time until someone specifies
> > -mcpu=cortex-a53.
> 
> Right, the way to change that, if you wanted to, is to check for the
> subtarget type inside enableMachineScheduler.
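
A minimal sketch of that check, assuming a stand-in class rather than the
real AArch64Subtarget. SketchSubtarget and hasMachineModel() are
hypothetical names used only for illustration; the only fact taken from
the thread is that cortex-a53 is currently the one CPU with a machine
model.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for a target subtarget; NOT the real AArch64Subtarget.
class SketchSubtarget {
  std::string CPU;

public:
  explicit SketchSubtarget(std::string CPUName) : CPU(std::move(CPUName)) {}

  // Assumption from the thread: only cortex-a53 has a machine model so far.
  bool hasMachineModel() const { return CPU == "cortex-a53"; }

  // Enable the MachineScheduler only when latency data is available,
  // instead of returning true unconditionally.
  bool enableMachineScheduler() const { return hasMachineModel(); }
};
```

With this shape, a compilation without -mcpu=cortex-a53 would skip the
pass rather than run it without latency information.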
> 
> >> The default is MicroOpBufferSize=0, which is suitable for in-order
> >> execution in that it prioritizes latency hiding. Setting it to 1
> >> makes it less strict by allowing latency to be balanced with
> >> other heuristics:
> >> 
> >> def CortexA53Model : SchedMachineModel {
> >> ...
> >>   let MicroOpBufferSize = 1;
> >> }
> > 
> > Sweet, I didn't realize that. I've left it at the default, but I'll
> > definitely run some experiments with MicroOpBufferSize=1 once the
> > model is more fully implemented. BTW, is the default 0 or -1? I
> > saw -1 in TargetSchedule.td.
> 
> The options are confusingly documented in two places. The actual
> defaults come from MCSchedule.h. Using “-1” in the .td file just
> says that the target does not override the default. I tried to
> comment this stuff. More comments are welcome.
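
The "-1 means no override" convention can be sketched as follows. This
is an illustration only: NoOverride, DefaultMicroOpBufferSize, and
resolveMicroOpBufferSize() are hypothetical names, and the default of 0
is the value stated earlier in this thread, not something read out of
MCSchedule.h here.

```cpp
#include <cassert>

// Hypothetical names; the real defaults are said to live in MCSchedule.h.
constexpr int NoOverride = -1;                    // the "-1" seen in the .td file
constexpr unsigned DefaultMicroOpBufferSize = 0;  // default stated in this thread

// Pick the value the scheduler would end up using: the built-in default
// when the target leaves the field at -1, otherwise the target's value.
constexpr unsigned resolveMicroOpBufferSize(int TdValue) {
  return TdValue == NoOverride ? DefaultMicroOpBufferSize
                               : static_cast<unsigned>(TdValue);
}

// A target that never sets the field gets the in-order default.
static_assert(resolveMicroOpBufferSize(NoOverride) == 0,
              "unset field falls back to the default");
```

So a model that sets MicroOpBufferSize = 1 resolves to 1, while one that
leaves it alone resolves to 0.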
> 
> For anyone porting the machine model I would say, do not assume
> things are working as you expect when you gather performance data.
> Please verify with a scheduler trace on some test cases first.

+1

> 
> >> To model in-order resource conflicts, if you care about that, you
> >> actually need to explicitly set BufferSize=0 in the processor
> >> resource def:
> >> 
> >> def A53UnitXX  : ProcResource<1> { let BufferSize = 0; }
> >> 
> >> That prevents multiple instructions sharing the same resource from
> >> being scheduled in the same cycle.
> > 
> > Thanks for this. It prompted me to re-evaluate my understanding of
> > BufferSize...for half of the day. :)
> > 
> > I mistakenly thought that I could define ProcResources that are
> > "pipelined", which is why for this simple processor, I'm modeling
> > each pipeline as a resource. I had a disconnect between Latency
> > and ResourceCycles. My plan now is to make each of these
> > ProcResources unbuffered (BufferSize=0) and to set ResourceCycles
> > to 1 (if needed). For instructions that cause hazards in their
> > pipelines, I'll set a ResourceCycles value that matches the length
> > of the hazard.
> 
> ResourceCycles=1 should be the default.
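
To make the BufferSize=0 / ResourceCycles interaction concrete, here is
a toy in-order issue model. It is a sketch under stated assumptions, not
the MachineScheduler's algorithm: SketchInstr and issueCycles() are
hypothetical, issue width is treated as unlimited, and operand latency
is ignored so that only resource conflicts cause stalls.

```cpp
#include <cassert>
#include <vector>

// Hypothetical instruction: the unbuffered resource it uses and how many
// cycles it occupies that resource (the ResourceCycles idea).
struct SketchInstr {
  unsigned Resource;        // must be < NumResources
  unsigned ResourceCycles;  // 1 for a fully pipelined unit, >1 for a hazard
};

// Return the cycle in which each instruction issues, in program order.
// An instruction stalls until its resource is free (BufferSize = 0).
std::vector<unsigned> issueCycles(const std::vector<SketchInstr> &Instrs,
                                  unsigned NumResources) {
  std::vector<unsigned> FreeAt(NumResources, 0);  // first free cycle per resource
  std::vector<unsigned> Cycles;
  unsigned Cur = 0;  // in-order: never issue before the previous instruction
  for (const SketchInstr &I : Instrs) {
    if (FreeAt[I.Resource] > Cur)
      Cur = FreeAt[I.Resource];  // stall on the resource conflict
    Cycles.push_back(Cur);
    FreeAt[I.Resource] = Cur + I.ResourceCycles;
  }
  return Cycles;
}
```

With ResourceCycles=1, two instructions on the same resource issue one
cycle apart; with ResourceCycles=3 on the first, the second waits until
cycle 3, which is the "length of the hazard" behavior described above.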
> 
> I only recently added support for counting stalls based on resource
> conflicts and it is not well tested. Hal Finkel was going to use
> this for PowerPC. Until then, in-order targets were all using
> itineraries.

I am still planning to do this; I just haven't gotten to it yet.

 -Hal

> 
> -Andy
> 
> > 
> > Thanks, Andy.
> > -Dave
> > 
> > --
> > Employee of Qualcomm Innovation Center, Inc.
> > Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> > hosted by The Linux Foundation
> 
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory



