[PATCH] D89697: * [x86] Implement smarter instruction lowering for FP_TO_UINT from vXf32 to vXi32 for SSE2 and AVX2 by using the exact semantic of the CVTTPS2SI instruction.
Simon Pilgrim via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Oct 19 07:13:02 PDT 2020
RKSimon added a reviewer: andrew.w.kaylor.
RKSimon added a comment.
For the costs: ideally you need to run the code snippet through llvm-mca (you can do this in godbolt) for various CPUs of that level (e.g. avx1 -> sandybridge/btver2/bdver2, avx2 -> znver2/haswell, etc.) and use the worst-case throughput cost of those runs. For older targets (sse2...) we're more limited in testable CPU targets; I tend to just use slm's costs, as they tend to match weak pre-AVX CPUs.
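As a rough illustration of "use the worst case", here is a minimal sketch of how the per-CPU numbers might be reduced to a single cost. The function name `worstCaseCost`, the map layout, and rounding up to an integer are all assumptions made for this example; the inputs are meant to be the reciprocal throughput figures you read off each llvm-mca run.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <map>
#include <string>

// Hypothetical helper, not part of LLVM: given one measured reciprocal
// throughput per CPU name, return the largest value rounded up to an
// integer. Rounding up is an assumption of this sketch. Returns 0 when
// no measurements are supplied.
int worstCaseCost(const std::map<std::string, double> &PerCpuThroughput) {
  double Worst = 0.0;
  for (const auto &Entry : PerCpuThroughput) {
    assert(Entry.second >= 0.0 && "throughput must be non-negative");
    Worst = std::max(Worst, Entry.second);
  }
  return static_cast<int>(std::ceil(Worst));
}
```

The values passed in should come from your own llvm-mca runs for each CPU at the relevant feature level; nothing here measures anything itself.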
================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:21176
+ if (VT == MVT::v4i32 && SrcVT == MVT::v4f32 ||
+ VT == MVT::v8i32 && SrcVT == MVT::v8f32)
+ {
----------------
Add parentheses to avoid operator precedence warnings:
```
if ((VT == MVT::v4i32 && SrcVT == MVT::v4f32) ||
    (VT == MVT::v8i32 && SrcVT == MVT::v8f32))
```
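To show why compilers ask for the explicit grouping, here is a small standalone sketch. `&&` binds tighter than `||`, so the parenthesized condition means the same as the unparenthesized one, but a reader could easily assume a different grouping that gives a different answer. The `VecType` enum and both function names are hypothetical stand-ins for this example, not LLVM types.

```cpp
// Hypothetical stand-ins for the value types named in the condition.
enum class VecType { v4i32, v8i32, v4f32, v8f32 };

// The grouping the language actually applies: (A && B) || (C && D).
constexpr bool isSupportedConversion(VecType VT, VecType SrcVT) {
  return (VT == VecType::v4i32 && SrcVT == VecType::v4f32) ||
         (VT == VecType::v8i32 && SrcVT == VecType::v8f32);
}

// A grouping a reader might wrongly assume: A && (B || C) && D.
// It can never be true here, because it requires SrcVT to match while
// also forcing VT == v4i32 and SrcVT == v8f32 together.
constexpr bool misreadGrouping(VecType VT, VecType SrcVT) {
  return VT == VecType::v4i32 &&
         (SrcVT == VecType::v4f32 || VT == VecType::v8i32) &&
         SrcVT == VecType::v8f32;
}

// The two readings disagree for the 8-element case.
static_assert(isSupportedConversion(VecType::v8i32, VecType::v8f32),
              "correct grouping accepts v8f32 -> v8i32");
static_assert(!misreadGrouping(VecType::v8i32, VecType::v8f32),
              "misread grouping rejects v8f32 -> v8i32");
```

The parentheses do not change behaviour; they only make the intended grouping visible and keep the build warning-free.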
================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:21201
+ DAG.getNode(ISD::AND, dl, VT, Big, IsOverflown));
+ }
+
----------------
Please run clang-format on this block.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D89697/new/
https://reviews.llvm.org/D89697