[PATCH] D112406: [Driver][AArch64]Add driver support for neoverse-512tvb target

Paul Walker via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Oct 26 06:45:56 PDT 2021


paulwalker-arm added a comment.

In D112406#3087191 <https://reviews.llvm.org/D112406#3087191>, @dmgreen wrote:

> Thanks. If the cpu has a 512 bit total vector bandwidth, should the VScaleForTuning be 1 or 2 (or higher)? llvm doesn't usually deal with total bandwidth a lot, perhaps not as much as it should.
>
> @david-arm any thoughts?

The total vector bandwidth includes unrolling, so the current `VScaleForTuning=1` and `MaxInterleaveFactor=4` already imply a 512-bit total vector bandwidth (one 128-bit vector, interleaved four times). If the target has vectors wider than 128 bits, vector loops will likely contain more work than the core can handle in parallel, but as long as that does not negatively affect register pressure, it shouldn't be a problem.
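
To make the arithmetic concrete, here is a rough sketch (not LLVM code). The function name and the `SVE_GRANULE_BITS` constant are hypothetical labels for illustration. The only assumption is that an SVE vector is `vscale` x 128 bits wide.

```python
# Illustrative only: these names are hypothetical and are not LLVM APIs.
SVE_GRANULE_BITS = 128  # an SVE vector is vscale x 128 bits wide


def total_vector_bandwidth_bits(vscale: int, interleave_factor: int) -> int:
    """Bits of vector work in one interleaved (unrolled) loop iteration."""
    return vscale * SVE_GRANULE_BITS * interleave_factor


# Tuning values discussed above: VScaleForTuning=1, MaxInterleaveFactor=4.
print(total_vector_bandwidth_bits(1, 4))  # 512

# Same interleave factor on hardware whose vectors are really 256 bits
# (vscale=2): each iteration now carries twice the work of the tuned case.
print(total_vector_bandwidth_bits(2, 4))  # 1024
```

The second call corresponds to the ">128bit vectors" case: the loop carries more work per iteration than the 512 bits the tuning assumed, which is only a concern if it raises register pressure.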


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D112406/new/

https://reviews.llvm.org/D112406


