[llvm-dev] Performance degradation on ARMv7 (cortex-a9)
Bradley Smith via llvm-dev
llvm-dev at lists.llvm.org
Wed Feb 24 02:42:21 PST 2016
The idea behind that change was to make ARM.td clearer: architecture features are attached to new per-architecture subtarget features, and the CPUs inherit from those. As far as I could tell, ProcA9 (and similar) were only being used for their enum value when making codegen decisions, so I moved all of the features they inherited over to the actual CPUs for clarity. The intent is that every feature a given target uses comes from a combination of the architecture it inherits from and the target itself, not from an intermediary feature like ProcA9.
I’m not aware of any place where ProcA9 is used to get subtarget features like this, and after a quick look I still can’t find one. Where exactly are you seeing ProcA9 being used to get features? Even so, the cortex-a9 processor model itself now inherits FeatureFP16, so I would expect it to use FP16, unless you’re not using cortex-a9 directly? (In that case, every CPU that used to inherit ProcA9 now needs to inherit ProcA9 as well as all of the features ProcA9 used to inherit, which is what I did in the change you mention.)
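As a rough illustration of the intent described above, here is a minimal C++ sketch in which a CPU's effective feature set is the union of its architecture's features and the features listed on the CPU itself. This is a simplified model and not LLVM's actual data structures: `FeatureSet`, `effectiveFeatures`, `cortexA9`, and the placeholder architecture name `"ARMv7a"` are assumptions made for the example, and the real feature lists in ARM.td are longer.

```cpp
#include <cassert>
#include <set>
#include <string>

// Simplified, hypothetical model; these are not LLVM's real data structures.
using FeatureSet = std::set<std::string>;

// A CPU's effective features are the union of what its architecture
// provides and what is listed directly on the CPU definition.
FeatureSet effectiveFeatures(const FeatureSet &arch, const FeatureSet &cpu) {
  FeatureSet all = arch;
  all.insert(cpu.begin(), cpu.end());
  return all;
}

// Illustrative cortex-a9 entry. "ARMv7a" is a placeholder for the
// architecture feature; the real lists in ARM.td are longer.
FeatureSet cortexA9() {
  const FeatureSet arch = {"ARMv7a"};
  const FeatureSet cpu = {"ProcA9", "FeatureFP16"};
  return effectiveFeatures(arch, cpu);
}

// In this model FP16 comes from the CPU entry itself, not from ProcA9.
void checkCortexA9() {
  assert(cortexA9().count("FeatureFP16") == 1);
}
```

Under this model, selecting cortex-a9 directly yields FeatureFP16 regardless of what ProcA9 itself implies.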
From: Grang, Mandeep Singh [mailto:mgrang at codeaurora.org]
Sent: 24 February 2016 03:16
To: Bradley Smith
Cc: llvm-dev at lists.llvm.org
Subject: Performance degradation on ARMv7 (cortex-a9)
I was doing some performance analysis for ARMv7 (cortex-a9) and I noticed that one of my benchmarks degraded by 93%. I have tracked the regression down to the following commit by you:
Author: Bradley Smith <bradley.smith at arm.com>
Date: Mon Nov 16 11:10:19 2015 +0000
[ARM] Introduce subtarget features per ARM architecture.
This allows for accurate architecture targeting as well as removing
duplicate information (hardcoded feature strings) from MCTargetDesc.
I see that in lib/Target/ARM/ARM.td all the features have been removed from Proc definition (e.g.: ProcA9) and added to ProcessorModel definition (e.g.: ProcessorModel<"cortex-a9").
But I find that the features from Proc are still being read and set in MCSubtargetInfo through the ARMFeatureKV table. So if the Proc is empty the corresponding feature is not being set.
In my case, if I add FeatureFP16 back to the ProcA9 definition in ARM.td I get back all the lost performance.
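To make the behaviour I am describing concrete, here is a minimal C++ sketch of a feature table in which each entry sets its own bit plus any bits it implies. It is only a model of what I believe is happening, not the actual ARMFeatureKV or MCSubtargetInfo code: `FeatureEntry`, `enableFeature`, the bit values, and the key strings are all hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for one row of a feature table such as ARMFeatureKV.
struct FeatureEntry {
  std::string key;   // name used to select the feature
  uint64_t bit;      // the bit this feature sets
  uint64_t implies;  // bits of the features it pulls in
};

const uint64_t kFP16 = 1u << 0;    // illustrative bit assignments
const uint64_t kProcA9 = 1u << 1;

// Set the bit for `key` and, transitively, every bit it implies.
uint64_t enableFeature(const std::vector<FeatureEntry> &table,
                       const std::string &key, uint64_t bits) {
  for (const FeatureEntry &e : table) {
    if (e.key != key)
      continue;
    bits |= e.bit;
    for (const FeatureEntry &other : table)
      if ((e.implies & other.bit) && !(bits & other.bit))
        bits = enableFeature(table, other.key, bits);
  }
  return bits;
}

// ProcA9 with an empty implied list, as after the commit.
const std::vector<FeatureEntry> kEmptyProc = {
    {"fp16", kFP16, 0}, {"a9", kProcA9, 0}};

// ProcA9 with FeatureFP16 added back, as in my local experiment.
const std::vector<FeatureEntry> kProcWithFP16 = {
    {"fp16", kFP16, 0}, {"a9", kProcA9, kFP16}};

void checkTables() {
  assert(enableFeature(kEmptyProc, "a9", 0) == kProcA9);
  assert(enableFeature(kProcWithFP16, "a9", 0) == (kProcA9 | kFP16));
}
```

In this model, enabling "a9" against the first table sets only the ProcA9 bit, while the second table also sets the FP16 bit. That matches what I observe when I add FeatureFP16 back to ProcA9.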
Could you please give me some insight into how, after your change, the Proc features get correctly set in MCSubtargetInfo and in the other places that access Proc?