[PATCH] D40177: performance improvements for ThunderX2 T99

Mon Nov 27 15:21:53 PST 2017

MatzeB added a comment.

In https://reviews.llvm.org/D40177#936935, @steleman wrote:

> > Please avoid getProcFamily() checks outside of AArch64SubtargetInfo. Use target features and transfer the logic into SubtargetInfo!
>
> Can you please explain the rationale behind this restriction?

It's not a strict rule but unless there is a good reason not to do it we should :)

> It's certainly not supported by the class interface design, as the Subtarget pointer is accessible and Subtarget->getProcFamily() is accessible as well.
> 
> It also does not appear to be a real restriction at all, considering that Subtarget->getProcFamily() is already being used in AArch64ISelLowering.cpp.
> 
> What you are asking here is the implementation of a new target feature. This seems (a) unnecessary and (b) introduces unnecessary code complexity.
> 
> For starters, aggressive FMA is a compiler-dependent subjective decision. It's not a target-dependent objective attribute, such as LSE or Neon. What's //aggressive FMA// to LLVM is not aggressive at all to GCC for example.
> 
> The design implication is that, for every single existing public interface we are going to implement an additional, functionally duplicative, target feature. Which really isn't a target feature to being with.
> 
> The implementation implication is that, a one-line code change becomes a 10-line code change that accomplishes the same exact thing as the one-line change.
> 
> Thank you in advance for your explanation.

- I actively worked on changing those CPU specific checks/tweaks across the target code into target features, I'd like for this to stay this way.
- The name "target feature" is a certainly unfortunate as those aren't really features but rather tuning objectives. Nevertheless they are target specific so the target feature system works nice here.
- Some reasons as to why making feature bits out of them is a good thing:
  - Things often start out as an inconspicuous `if (CPU) doSomething()`. But in time more CPUs are added resulting in longish `if (CPU || CPUversion2 || CPUversion3 || otherVendorsCPU) doSomething()` scattered accross the code.
  - We suddenly start mixing code that deals with something like "more aggressive FMA folding" with small lists of CPUs that should use this. This is annoying when adding new CPUs: It is hard to spot all places where checks could/should be added. It feels a lot cleaner to do this in AArch64.td as a list of feature/tuning bits.
  - As a bonus, target features can be enabled/disabled with `-Xclang -target-feature=-xxx,+yyy` which is sometimes nice for compiler developers to experiment with.

Repository:
  rL LLVM

https://reviews.llvm.org/D40177