[PATCH] D112406: [Driver][AArch64]Add driver support for neoverse-512tvb target
Dave Green via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Thu Oct 28 00:49:36 PDT 2021
dmgreen added a comment.
> The total vector bandwidth includes unrolling so currently having `VScaleForTuning=1` and `MaxInterleaveFactor=4` implies 512 tvb. If the target has >128bit vectors then vector loops will likely have more work than they can handle in parallel but as long as that does not negatively affect register pressure it shouldn't be a problem.
That doesn't fit with my understanding of how `VScaleForTuning` is currently used. Vectorizing/unrolling too far can also easily cause the vector loop to be skipped entirely for many trip counts, with all iterations falling back to the scalar loop. But that all sounds fine to me for what this is. Cheers.
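To make the arithmetic in this exchange concrete, here is a minimal sketch of the two points: how `VScaleForTuning=1` with `MaxInterleaveFactor=4` multiplies out to 512 bits, and how a larger vector step can leave short loops running only the scalar part. It is a simplified model and not LLVM code. The function names are hypothetical, and it assumes a 128-bit SVE granule, no tail folding or predication, and no epilogue vectorization.

```python
# Simplified model, not LLVM code: all function names here are hypothetical.
SVE_GRANULE_BITS = 128  # SVE vector lengths are multiples of 128 bits


def total_vector_bandwidth(vscale_for_tuning, interleave_factor):
    """Bits covered by one iteration of the vectorized, interleaved loop."""
    return SVE_GRANULE_BITS * vscale_for_tuning * interleave_factor


def split_trip_count(trip_count, elem_bits, actual_vscale, interleave_factor):
    """Return (elements done in the vector loop, elements left to the scalar loop).

    Assumes no tail folding/predication and no epilogue vectorization.
    """
    # Elements consumed per vector-loop iteration: lanes per vector times interleave.
    step = (SVE_GRANULE_BITS * actual_vscale // elem_bits) * interleave_factor
    vector_elems = (trip_count // step) * step
    return vector_elems, trip_count - vector_elems


print(total_vector_bandwidth(1, 4))    # 512
print(split_trip_count(20, 32, 1, 4))  # (16, 4): 128-bit vectors, step of 16
print(split_trip_count(20, 32, 2, 4))  # (0, 20): 256-bit vectors, step of 32
```

Under these assumptions, a 20-iteration loop over 32-bit elements does most of its work in the vector loop on 128-bit hardware, but on 256-bit hardware with the same interleave factor the step grows to 32 and every iteration falls to the scalar loop.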
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D112406/new/
https://reviews.llvm.org/D112406