LittleMeepo wrote: I also achieved similar functionality by adding a recipe to LoopVectorize. However, I think directly generating AArch64 intrinsics in LoopVectorize can only serve as a local, temporary solution. https://github.com/llvm/llvm-project/pull/69587