frederick-vs-ja wrote: I think we may need to move `__block_size` into function bodies (and repeat it many times) to support incomplete element types and be optimization-friendly. https://github.com/llvm/llvm-project/pull/89422
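
To make the suggestion concrete, here is a rough sketch of the idea, not the actual libc++ code. `block_size_of`, `toy_deque`, `node`, and `check_toy_deque` are hypothetical names, and the block-size formula is only illustrative. The point is that `sizeof(T)` is only evaluated when a member function body is instantiated, so the class itself can be instantiated while `T` is still incomplete, and each function still sees a compile-time constant.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a block-size computation (illustrative formula).
// Instantiating this requires T to be complete because of sizeof(T).
template <class T>
struct block_size_of {
    static constexpr std::size_t value =
        sizeof(T) < 256 ? 4096 / sizeof(T) : 16;
};

// Hypothetical container. Nothing at class scope depends on sizeof(T);
// the block size is recomputed as a local constant in each function body.
template <class T>
class toy_deque {
public:
    bool empty() const { return first_block_ == nullptr; }

    std::size_t block_capacity() const {
        // sizeof(T) is evaluated only when this body is instantiated.
        constexpr std::size_t block_size = block_size_of<T>::value;
        return block_size;
    }

    std::size_t blocks_needed(std::size_t n) const {
        // Repeated here on purpose; it stays a compile-time constant,
        // so the division below can be optimized accordingly.
        constexpr std::size_t block_size = block_size_of<T>::value;
        return (n + block_size - 1) / block_size;
    }

private:
    T* first_block_ = nullptr;  // a pointer to an incomplete T is fine
};

// node is incomplete at the point where toy_deque<node> is instantiated.
struct node {
    toy_deque<node> children;
    int payload[3];
};

// Small usage check; the member bodies are instantiated here, where node
// is complete.
inline void check_toy_deque() {
    toy_deque<node> d;
    assert(d.empty());
    assert(d.block_capacity() == 4096 / sizeof(node));
    assert(d.blocks_needed(d.block_capacity() + 1) == 2);
}
```

If the constant were instead a static data member or a template argument used at class scope, instantiating `toy_deque<node>` inside `node` would need `sizeof(node)` too early and fail to compile. The cost of the approach is the repetition in every function that needs the value.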