[PATCH] D112406: [Driver][AArch64]Add driver support for neoverse-512tvb target
Paul Walker via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Oct 26 06:45:56 PDT 2021
paulwalker-arm added a comment.
In D112406#3087191 <https://reviews.llvm.org/D112406#3087191>, @dmgreen wrote:
> Thanks. If the cpu has a 512 bit total vector bandwidth, should the VScaleForTuning be 1 or 2 (or higher)? llvm doesn't usually deal with total bandwidth a lot, perhaps not as much as it should.
>
> @david-arm any thoughts?
The total vector bandwidth includes unrolling, so the current combination of `VScaleForTuning=1` and `MaxInterleaveFactor=4` already implies a 512-bit total vector bandwidth (128 bits x 1 x 4). If the target has vectors wider than 128 bits, then vector loops will likely contain more work than the target can execute in parallel, but as long as that does not negatively affect register pressure it shouldn't be a problem.
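To make the arithmetic concrete, here is a minimal sketch. The helper `totalVectorBandwidthBits` and the constant `SVEGranuleBits` are hypothetical names used only for illustration, not part of LLVM. It assumes the SVE convention that a scalable vector is `vscale` x 128 bits wide.

```cpp
#include <cstdint>

// Assumption: SVE vectors are vscale x 128 bits wide.
constexpr std::uint64_t SVEGranuleBits = 128;

// Hypothetical helper (not an LLVM API): bits of vector work per
// iteration of a vectorized loop, counting interleaved (unrolled) copies.
constexpr std::uint64_t totalVectorBandwidthBits(std::uint64_t VScale,
                                                 std::uint64_t InterleaveFactor) {
  return SVEGranuleBits * VScale * InterleaveFactor;
}

// Tuning values discussed in the review: 128 x 1 x 4 = 512 bits.
static_assert(totalVectorBandwidthBits(1, 4) == 512,
              "VScaleForTuning=1, MaxInterleaveFactor=4 gives 512 bits");

// If the hardware actually runs with vscale = 2 (256-bit vectors), the same
// interleave factor yields 1024 bits of work per iteration, i.e. more than
// a 512-bit total vector bandwidth can process in parallel.
static_assert(totalVectorBandwidthBits(2, 4) == 1024,
              "vscale=2 with interleave 4 gives 1024 bits");
```

Under these assumptions, raising `VScaleForTuning` to 2 while keeping the interleave factor at 4 would describe a 1024-bit budget rather than a 512-bit one.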
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D112406/new/
https://reviews.llvm.org/D112406