sparker-arm wrote: Performance-wise, this looks really good for ML. Running NCNN via V8 on AArch64, I observe a mean execution-time reduction of ~15% across more than 30 workloads. https://github.com/llvm/llvm-project/pull/161355