[PATCH] D112406: [Driver][AArch64] Add driver support for neoverse-512tvb target

Dave Green via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Oct 28 00:49:36 PDT 2021


dmgreen added a comment.

> The total vector bandwidth includes unrolling so currently having `VScaleForTuning=1` and `MaxInterleaveFactor=4` implies 512 tvb.  If the target has >128bit vectors then vector loops will likely have more work than they can handle in parallel but as long as that does not negatively affect register pressure it shouldn't be a problem.

That doesn't fit with my understanding of how `VScaleForTuning` is currently used. Vectorizing/unrolling too far can also easily cause the vector loop to be skipped for many trip counts, falling back to the scalar remainder loop. But that all sounds fine to me for what this is. Cheers.
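
For illustration only (not part of the patch): a minimal sketch of the arithmetic behind both points. It assumes a 128-bit SVE granule, 32-bit elements, and a vector loop with a scalar remainder and no tail folding. The helper names are hypothetical and are not LLVM APIs.

```python
# Hypothetical helpers for illustration; none of these are LLVM APIs.

def total_vector_bits(vscale_for_tuning, max_interleave, granule_bits=128):
    """Bits processed per vector-loop iteration: granule * vscale * interleave."""
    return granule_bits * vscale_for_tuning * max_interleave


def split_iterations(trip_count, elems_per_vector, interleave):
    """Split a trip count into (iterations run in the vector loop,
    iterations left to the scalar remainder), assuming no tail folding."""
    step = elems_per_vector * interleave          # scalar iterations per vector iteration
    vector_iters = (trip_count // step) * step    # whole vector iterations only
    return vector_iters, trip_count - vector_iters


# The quoted claim: VScaleForTuning=1 and MaxInterleaveFactor=4 give 512 bits.
print(total_vector_bits(1, 4))

# 32-bit elements: 4 per 128-bit vector, 8 per 256-bit vector.
print(split_iterations(20, 4, 4))   # 128-bit hardware, step of 16
print(split_iterations(20, 8, 4))   # 256-bit hardware, step of 32
```

With these assumptions, a trip count of 20 runs 16 iterations in the vector loop on 128-bit hardware, but on 256-bit hardware the step of 32 exceeds the trip count and every iteration falls through to the scalar loop.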


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D112406/new/

https://reviews.llvm.org/D112406


