topperc wrote: I believe what we talked about doing internally was a vp.zext to i8, then an i8 vp.merge in the loop, with a vp.reduce.or after the loop. That avoids putting a vcpop.m in the loop. https://github.com/llvm/llvm-project/pull/131830
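To make the shape of that suggestion concrete, here is a rough scalar model in Python of an any-of reduction done this way. It is only a sketch of the lane semantics, not the IR the PR emits: the helper names (`vp_zext`, `vp_merge`, `vp_reduce_or`, `any_of_evl`), the `vf` parameter, and the choice to OR the accumulator with the extended compare before merging are all assumptions made for illustration.

```python
POISON = None  # stand-in for lanes whose value is unspecified (assumption)


def vp_zext(mask, evl):
    """Zero-extend i1 lanes to i8; lanes at or past evl are poison."""
    return [int(m) if i < evl else POISON for i, m in enumerate(mask)]


def vp_merge(cond, on_true, on_false, evl):
    """Lanes below evl select on cond; lanes at or past evl take on_false."""
    return [t if (i < evl and c) else f
            for i, (c, t, f) in enumerate(zip(cond, on_true, on_false))]


def vp_reduce_or(start, vec, mask, evl):
    """OR start with every active lane below evl."""
    result = start
    for i, (v, m) in enumerate(zip(vec, mask)):
        if i < evl and m:
            result |= v
    return result


def any_of_evl(data, pred, vf=4):
    """Hypothetical EVL-tail-folded any-of loop with an i8 accumulator."""
    acc = [0] * vf            # i8 accumulator vector, one lane per element
    all_true = [True] * vf
    for base in range(0, len(data), vf):
        evl = min(vf, len(data) - base)   # active lanes this iteration
        cmp = [base + i < len(data) and bool(pred(data[base + i]))
               for i in range(vf)]
        z = vp_zext(cmp, evl)             # i1 -> i8, tail lanes poison
        ored = [a | b if b is not POISON else POISON
                for a, b in zip(acc, z)]
        # The merge keeps the old accumulator in lanes at or past evl,
        # so poison tail lanes never reach the accumulator.
        acc = vp_merge(all_true, ored, acc, evl)
    # A single reduction after the loop; nothing is reduced per iteration.
    return vp_reduce_or(0, acc, all_true, vf) != 0
```

The point of the i8 accumulator in this model is that the loop body only does element-wise work, and the one cross-lane operation happens once after the loop.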