[llvm] [X86] Combine LRINT/LLRINT and TRUNC when TRUNC has nsw flag (PR #126217)
Phoebe Wang via llvm-commits
llvm-commits at lists.llvm.org
Mon Feb 10 21:58:27 PST 2025
phoebewang wrote:
> > > I was also wondering if it would be reasonable to apply this transformation in InstCombine -- for instance, convert `llvm.lrint.i64.f64+(nsw/nuw)trunc` to `llvm.rint.i32`? That's assuming we have reason to believe that this is useful.
> >
> >
> > The concern with putting it in the middle end is that we don't know whether a target prefers a smaller lrint, or which size it prefers most. I'm afraid the backend may generate suboptimal code if we arbitrarily combine to `llvm.rint.i16` or `llvm.rint.i8`.
>
> We could add TTI functions to get additional information if needed. Your suggestion that this could potentially enable better vectorization is the reason I think it would be beneficial to have this in InstCombine. If we leave it to codegen, it may be too late for vectorization. That is, the presence of the trunc instruction may cause the vectorizer to give up.
Vectorization is a good point, and I agree we probably fail to vectorize `llvm.lrint.i64.fxx` because of the cost (we don't have an LRINT cost model for now, but the vector cost would be high for targets prior to AVX512DQ if we added one). That makes the solution in https://github.com/llvm/llvm-project/pull/126477 more valuable, because the vector cost of both FRINT and FP_TO_SINT is low, at least for SSE4.1 and later, though we haven't modeled them either.
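To make the two shapes concrete, here is a minimal scalar C++ sketch. It is not code from either PR, and the function names are hypothetical. It assumes the default rounding mode and inputs whose rounded value fits in 32 bits, which is roughly what the nsw flag on the trunc asserts.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: wide round-to-integer followed by a narrowing cast,
// i.e. the lrint.i64 + trunc shape discussed above.
int32_t lrint_then_trunc(double x) {
    long long wide = std::llrint(x);     // rounds using the current rounding mode
    return static_cast<int32_t>(wide);   // lossless only if `wide` fits in 32 bits
}

// Hypothetical helper: round in floating point, then convert straight to the
// narrow integer type, i.e. the FRINT + FP_TO_SINT shape.
int32_t rint_then_convert(double x) {
    return static_cast<int32_t>(std::rint(x));  // assumes the result fits in int32_t
}
```

Under those assumptions both functions return the same value, so the second form can stand in for the first while using only operations that are expected to be cheap to vectorize.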
https://github.com/llvm/llvm-project/pull/126217