[PATCH] D111976: [DAG] Create fptosi.sat from clamped fptosi

Dave Green via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Oct 28 02:12:15 PDT 2021


dmgreen added inline comments.


================
Comment at: llvm/test/CodeGen/AArch64/fpclamptosat_vec.ll:1226
+; CHECK-CVT-LABEL: stest_f16i16_mm:
+; CHECK-CVT:       // %bb.0: // %entry
+; CHECK-CVT-NEXT:    fcvtl2 v1.4s, v0.8h
----------------
efriedma wrote:
> This isn't an improvement in the case where we don't have a native half fcvtzs.
Yep. This unrolling is also blocking SVE, which I wanted to take a look into. I may just prevent the fold for half on non-half hardware in the meantime.


================
Comment at: llvm/test/CodeGen/X86/fpclamptosat.ll:1179
 entry:
   %conv = fptosi half %x to i128
   %spec.store.select = call i128 @llvm.smin.i128(i128 %conv, i128 9223372036854775807)
----------------
efriedma wrote:
> dmgreen wrote:
> > RKSimon wrote:
> > > Should we be taking into account the maximum/minimum representable value of the half type as well here?
> > Yeah I haven't looked into that at much yet. A fp16 only produces 17 (IIRC) bits of data, so anything larger than that doesn't really make much sense to try and saturate. I kept the test to keep the tests consistent and make sure nothing silly happened.  I wasn't sure if that was best optimized at the llvm level (maybe with some range analysis) or in DAG, but it felt like a different issue that I didn't try and address here.
> > 
> > Do you think it's worth doing something special with that here? Or removing the tests that don't make a huge amount of sense? Or leaving it as-is?
> For actual fptosi.sat, we can't really optimize based on the set of legal finite values because we need to handle +-inf.
> 
> Here, the smin/smax are redundant, but if we care, we should probably handle that using ComputeNumSignBits or something like that.  (But we probably don't care.)
Yeah, this would be an optimization for fptosi where the infinite values produce poison. NumSignBits might work but I was already looking into a patch based on range information, which seems to work fine too. I don't have a motivating case, but it seems simple enough and might help somewhere.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111976/new/

https://reviews.llvm.org/D111976



More information about the llvm-commits mailing list