[PATCH] D130028: [SelectionDAG] Fix fptoi.sat scalable vector lowering

Mon Jul 18 13:38:32 PDT 2022

dmgreen added a comment.

In D130028#3660452 <https://reviews.llvm.org/D130028#3660452>, @efriedma wrote:

>> This also needed an adjustment to create SetCC+VSelect as opposed to SelectCC for vectors. These should either be combined into SelectCC or handled as needed by the target. (I went back and forth as to whether this should be used for scalar too, or if the selectcc should be handled differently. This way was at least simpler to get codegen not crashing).
>
> This seems a bit weird to me... a couple questions:
>
> 1. For SelectCC with a vector result, are the compare operands supposed to be scalar or vector?  Can we add an appropriate assertion to SelectionDAG::getNode()?

I guess I had presumed a vector, as that seems to be what is used in practice. I hadn't really considered it though, just called DAG.getSetCC and DAG.getSelect with the existing operands. With getSetCCResultType from the result type, which would be vector.

> 2. Is legalization not correctly handling the case where SELECT_CC is marked "expand"?

>From what I remember - A vector SELECT_CC will Expand by unrolling. There is no specific expansion code. When I added some (to turn it into a getSetcc+getSelect) the code for some tests was worse than unrolling. (There is also some MVE errors about the condition types of float vectors without ). I could add expansion code that is only for scalable vectors.

The other place I've seen vector select_cc created is the expansion of fmin/fmax reductions via TargetLowering::createSelectForFMINNUM_FMAXNUM.

================
Comment at: llvm/test/CodeGen/AArch64/fptosi-sat-vector.ll:1069
+; CHECK-NEXT:    mov x3, v0.d[1]
+; CHECK-NEXT:    fmov x2, d0
 ; CHECK-NEXT:    ret
----------------
efriedma wrote:
> This isn't an improvement.
Yeah these are the i50 tests I mentioned in the summary. It doesn't help the optics that the return <4 x i50> type is passed in scalars.
There was quite a lot of changes originally from the fallout of adding the expansion of fp_to_si_sat. I can add more to the conditions of the TLI.expandFP_TO_INT_SAT call, but then we are just trading smaller code in more common cases for some regressions in more obscure. It already felt like I was adding too many conditions. This patch was intended as a bugfix though, so perhaps it is better to keep it simpler.

Suggestions welcome.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D130028/new/

https://reviews.llvm.org/D130028